hopwise.model.path_language_modeling_recommender.pearlmllama3¶
- Reference:
Balloccu et al. “Faithful Path Language Modeling for Explainable Recommendation over Knowledge Graph.” - preprint.
- Reference code:
https://github.com/Chris1nexus/pearlm
https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/07_gpt_to_llama/converting-llama2-to-llama3.ipynb
Attributes¶
- TokenType

Classes¶
- AutoregressiveGroupQuerySelfAttention – Base class for all neural network modules.
- FeedForward – Base class for all neural network modules.
- Block – Base class for all neural network modules.
- PEARLMLlama3 – Low-level implementation of PEARLM model based on LLaMA 3 architecture.

Functions¶
- precompute_rope_params
- compute_rope
Module Contents¶
- hopwise.model.path_language_modeling_recommender.pearlmllama3.TokenType¶
- class hopwise.model.path_language_modeling_recommender.pearlmllama3.AutoregressiveGroupQuerySelfAttention(config)¶
Bases:
torch.nn.Module
Base class for all neural network modules.
Your models should also subclass this class.
Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will also have their parameters converted when you call to(), etc.

Note
As per the example above, an __init__() call to the parent class must be made before assignment on the child.

- Variables:
  training (bool) – Boolean represents whether this module is in training or evaluation mode.
- num_heads¶
- dropout¶
- head_dim¶
- W_key¶
- W_value¶
- group_size¶
- W_query¶
- out_proj¶
- causal_mask¶
- forward(x)¶
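The attribute names above (W_query, W_key, W_value, group_size, head_dim, causal_mask, out_proj) match the grouped-query attention layout from the referenced Llama 3 conversion notebook. The following is a minimal, hypothetical sketch of how such a module is typically wired together; the constructor arguments, the repeat_interleave expansion, and the omission of rotary position embeddings are assumptions for illustration, not the exact hopwise implementation.

import torch
import torch.nn as nn

class GroupedQuerySelfAttentionSketch(nn.Module):
    """Illustrative sketch of autoregressive grouped-query attention (not the hopwise code)."""

    def __init__(self, emb_dim, num_heads, num_kv_groups, context_length, dropout=0.0):
        super().__init__()
        assert num_heads % num_kv_groups == 0
        self.num_heads = num_heads
        self.head_dim = emb_dim // num_heads
        self.group_size = num_heads // num_kv_groups  # query heads sharing one key/value head
        self.W_query = nn.Linear(emb_dim, num_heads * self.head_dim, bias=False)
        self.W_key = nn.Linear(emb_dim, num_kv_groups * self.head_dim, bias=False)
        self.W_value = nn.Linear(emb_dim, num_kv_groups * self.head_dim, bias=False)
        self.out_proj = nn.Linear(emb_dim, emb_dim, bias=False)
        self.dropout = nn.Dropout(dropout)
        mask = torch.triu(torch.ones(context_length, context_length, dtype=torch.bool), diagonal=1)
        self.register_buffer("causal_mask", mask)

    def forward(self, x):
        b, t, _ = x.shape
        q = self.W_query(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.W_key(x).view(b, t, -1, self.head_dim).transpose(1, 2)
        v = self.W_value(x).view(b, t, -1, self.head_dim).transpose(1, 2)
        # Expand the shared key/value heads so every query head has a matching slice.
        k = k.repeat_interleave(self.group_size, dim=1)
        v = v.repeat_interleave(self.group_size, dim=1)
        att = (q @ k.transpose(-2, -1)) / self.head_dim**0.5
        att = att.masked_fill(self.causal_mask[:t, :t], float("-inf"))
        att = self.dropout(torch.softmax(att, dim=-1))
        out = (att @ v).transpose(1, 2).contiguous().view(b, t, -1)
        return self.out_proj(out)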
- class hopwise.model.path_language_modeling_recommender.pearlmllama3.FeedForward(config)¶
Bases:
torch.nn.Module
Base class for all neural network modules.
Your models should also subclass this class.
Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will also have their parameters converted when you call to(), etc.

Note
As per the example above, an __init__() call to the parent class must be made before assignment on the child.

- Variables:
  training (bool) – Boolean represents whether this module is in training or evaluation mode.
- fc1¶
- fc2¶
- fc3¶
- silu¶
- forward(x)¶
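The fc1, fc2, fc3, and silu attributes suggest the SwiGLU (gated SiLU) feed-forward layout used in Llama-style models. Below is a minimal sketch under that assumption; emb_dim and hidden_dim are illustrative names, not the actual config keys.

import torch.nn as nn

class FeedForwardSketch(nn.Module):
    """Illustrative SwiGLU-style feed-forward block (not the hopwise code)."""

    def __init__(self, emb_dim, hidden_dim):
        super().__init__()
        self.fc1 = nn.Linear(emb_dim, hidden_dim, bias=False)  # gate projection
        self.fc2 = nn.Linear(emb_dim, hidden_dim, bias=False)  # up projection
        self.fc3 = nn.Linear(hidden_dim, emb_dim, bias=False)  # down projection
        self.silu = nn.SiLU()

    def forward(self, x):
        # SwiGLU: elementwise product of the gated branch and the linear branch
        return self.fc3(self.silu(self.fc1(x)) * self.fc2(x))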
- class hopwise.model.path_language_modeling_recommender.pearlmllama3.Block(config)¶
Bases:
torch.nn.Module
Base class for all neural network modules.
Your models should also subclass this class.
Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will also have their parameters converted when you call to(), etc.

Note
As per the example above, an __init__() call to the parent class must be made before assignment on the child.

- Variables:
  training (bool) – Boolean represents whether this module is in training or evaluation mode.
- rmsnorm1¶
- causal_attn¶
- rmsnorm2¶
- feedforward¶
- forward(x)¶
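The rmsnorm1, causal_attn, rmsnorm2, and feedforward attributes point to a standard pre-norm residual block. A minimal sketch under that assumption follows (torch.nn.RMSNorm requires PyTorch 2.4 or newer; the sub-modules are placeholders such as the sketches above).

import torch.nn as nn

class BlockSketch(nn.Module):
    """Illustrative pre-norm transformer block with residual connections (not the hopwise code)."""

    def __init__(self, emb_dim, attn, feedforward):
        super().__init__()
        self.rmsnorm1 = nn.RMSNorm(emb_dim)
        self.causal_attn = attn          # e.g. a grouped-query attention module
        self.rmsnorm2 = nn.RMSNorm(emb_dim)
        self.feedforward = feedforward   # e.g. a SwiGLU feed-forward module

    def forward(self, x):
        x = x + self.causal_attn(self.rmsnorm1(x))   # attention sub-layer with residual
        x = x + self.feedforward(self.rmsnorm2(x))   # feed-forward sub-layer with residual
        return x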
- class hopwise.model.path_language_modeling_recommender.pearlmllama3.PEARLMLlama3(config, dataset)¶
Bases:
hopwise.model.abstract_recommender.PathLanguageModelingRecommender
Low-level implementation of PEARLM model based on LLaMA 3 architecture.
With 8 key-value groups (the number used by Llama 3 8B), the number of rows in the key and value projection matrices is reduced by a factor of 4, because 32 attention heads divided by 8 kv-groups is 4. To make grouped-query attention equivalent to standard multi-head attention, set the number of key-value groups equal to the number of attention heads.
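The reduction factor described above is simply the ratio of query heads to key-value groups; a quick illustration with the Llama 3 8B numbers (head_dim assumed to be 128):

num_heads = 32       # query heads in Llama 3 8B
num_kv_groups = 8    # key/value groups in Llama 3 8B
head_dim = 128

group_size = num_heads // num_kv_groups   # 4 query heads share each key/value head
q_rows = num_heads * head_dim             # 4096 output rows in the query projection
kv_rows = num_kv_groups * head_dim        # 1024 output rows in W_key and in W_value
print(q_rows // kv_rows)                  # 4 -> key/value matrices are 4x smaller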
- config¶
- dataset¶
- tokenizer¶
- temperature¶
- wte¶
- wpe¶
- blocks¶
- rmsnorm¶
- lm_head¶
- loss¶
- _init_weights(module)¶
- forward(idx, targets=None)¶
- calculate_loss(interaction)¶
Calculate the training loss for a batch of data.
- Parameters:
interaction (Interaction) – Interaction class of the batch.
- Returns:
Training loss, shape: []
- Return type:
torch.Tensor
- predict(interaction)¶
Predict the scores between users and items.
- Parameters:
interaction (Interaction) – Interaction class of the batch.
- Returns:
Predicted scores for given users and items, shape: [batch_size]
- Return type:
torch.Tensor
- generate(**kwargs)¶
Take a conditioning sequence of indices idx (a LongTensor of shape (b, t)) and complete it for max_new_tokens steps, feeding each prediction back into the model. You will most likely want to run this in model.eval() mode.
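A minimal sketch of the autoregressive loop described above, assuming a forward pass that returns (logits, loss) as suggested by forward(idx, targets=None); the explicit parameters are illustrative, since the actual method takes **kwargs.

import torch

@torch.no_grad()
def generate_sketch(model, idx, max_new_tokens, context_length, temperature=1.0):
    """Illustrative next-token sampling loop (not the hopwise implementation)."""
    model.eval()
    for _ in range(max_new_tokens):
        # Crop the context if it grows beyond the model's context window.
        idx_cond = idx[:, -context_length:]
        logits, _ = model(idx_cond)               # assumed (logits, loss) return value
        logits = logits[:, -1, :] / temperature   # keep only the last time step
        probs = torch.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, next_token), dim=1)  # feed the prediction back in
    return idx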
- hopwise.model.path_language_modeling_recommender.pearlmllama3.precompute_rope_params(head_dim, theta_base=10000, context_length=4096, freq_config=None, device=None)¶
- hopwise.model.path_language_modeling_recommender.pearlmllama3.compute_rope(x, cos, sin)¶
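These helpers mirror the rotary position embedding (RoPE) utilities from the referenced Llama 3 notebook. Below is a minimal sketch of the standard computation, ignoring the frequency scaling that freq_config would apply; it is an illustration of the technique, not the exact hopwise code.

import torch

def precompute_rope_params_sketch(head_dim, theta_base=10000, context_length=4096, device=None):
    """Precompute RoPE cos/sin tables (illustrative, ignores freq_config scaling)."""
    # Inverse frequencies for each dimension pair: theta_base^(-2i / head_dim)
    inv_freq = 1.0 / theta_base ** (torch.arange(0, head_dim, 2, device=device).float() / head_dim)
    positions = torch.arange(context_length, device=device).float()
    angles = torch.outer(positions, inv_freq)       # (context_length, head_dim // 2)
    angles = torch.cat([angles, angles], dim=-1)    # (context_length, head_dim)
    return torch.cos(angles), torch.sin(angles)

def compute_rope_sketch(x, cos, sin):
    """Rotate query/key tensors of shape (b, heads, t, head_dim) by the precomputed angles."""
    seq_len, head_dim = x.shape[-2], x.shape[-1]
    x1, x2 = x[..., : head_dim // 2], x[..., head_dim // 2 :]
    rotated = torch.cat([-x2, x1], dim=-1)
    cos = cos[:seq_len, :].unsqueeze(0).unsqueeze(0)  # broadcast over batch and heads
    sin = sin[:seq_len, :].unsqueeze(0).unsqueeze(0)
    return x * cos + rotated * sin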