hopwise.utils¶

Attributes¶

`general_arguments`
`training_arguments`
`evaluation_arguments`
`dataset_arguments`

Classes¶

`ModelType`	Type of models.
`KGDataLoaderState`	States for Knowledge-based DataLoader.
`EvaluatorType`	Type for evaluation metrics.
`InputType`	Type of Models' input.
`FeatureType`	Type of features.
`FeatureSource`	Source of features.
`PathLanguageModelingTokenType`	Type of tokens in paths for Path Language Modeling.
`GenerationOutputs`	Dataclass to hold the outputs of the generation process.
`WandbLogger`	WandbLogger to log metrics to Weights and Biases.

Functions¶

`init_logger`(config)	A logger that can show a message on standard output and write it into the file named filename simultaneously.
`progress_bar`(*args, **kwargs)
`set_color`(log, color[, highlight, progress])
`calculate_valid_score`(valid_result[, valid_metric])	Return valid score from valid result
`deep_dict_update`(updated_dict, updating_dict)
`dict2str`(result_dict)	Convert result dict to str
`early_stopping`(value, best, cur_step, max_step[, bigger])	validation-based early stopping
`ensure_dir`(dir_path)	Make sure the directory exists, if it does not exist, create it
`get_environment`(config)
`get_flops`(model, dataset, device, logger, transform[, ...])	Given a model and dataset to the model, compute the per-operator flops of the given model.
`get_gpu_usage`([device])	Return the reserved memory and total memory of given device in a string.
`get_local_time`()	Get current time
`get_logits_processor`(model_name)
`get_model`(model_name)	Automatically select model class based on model name
`get_sequence_postprocessor`(postprocessor_name)
`get_tensorboard`(logger)	Creates a SummaryWriter of Tensorboard that can log PyTorch models and metrics into a directory for visualization within the TensorBoard UI.
`get_trainer`(model_type, model_name)	Automatically select trainer class based on model type and model name
`init_seed`(seed, reproducibility)	Init random seed for random functions in numpy, torch, cuda and cudnn
`list_to_latex`(convert_list[, bigger_flag, subset_columns])

Package Contents¶

hopwise.utils.general_arguments = ['gpu_id', 'use_gpu', 'seed', 'reproducibility', 'state', 'data_path', 'checkpoint_dir',...¶

hopwise.utils.training_arguments = ['epochs', 'train_batch_size', 'learner', 'learning_rate', 'train_neg_sample_args', 'eval_step',...¶

hopwise.utils.evaluation_arguments = ['eval_args', 'repeatable', 'metrics', 'topk', 'valid_metric', 'valid_metric_bigger',...¶

hopwise.utils.dataset_arguments = ['field_separator', 'seq_separator', 'USER_ID_FIELD', 'ITEM_ID_FIELD', 'RATING_FIELD',...¶

class hopwise.utils.ModelType¶

Bases: enum.Enum

Type of models.

GENERAL: General Recommendation
SEQUENTIAL: Sequential Recommendation
CONTEXT: Context-aware Recommendation
KNOWLEDGE: Knowledge-based Recommendation
PATH_LANGUAGE_MODELING: Path Language Modeling Recommendation

GENERAL = 1¶

SEQUENTIAL = 2¶

CONTEXT = 3¶

KNOWLEDGE = 4¶

TRADITIONAL = 5¶

DECISIONTREE = 6¶

PATH_LANGUAGE_MODELING = 7¶

class hopwise.utils.KGDataLoaderState¶

Bases: enum.Enum

States for Knowledge-based DataLoader.

RSKG: Return both knowledge graph information and user-item interaction information.
RS: Only return the user-item interaction.
KG: Only return the triplets with negative examples in a knowledge graph.

RSKG = 1¶

RS = 2¶

KG = 3¶

class hopwise.utils.EvaluatorType¶

Bases: enum.Enum

Type for evaluation metrics.

RANKING: Ranking-based metrics like NDCG, Recall, etc.
VALUE: Value-based metrics like AUC, etc.

RANKING = 1¶

VALUE = 2¶

class hopwise.utils.InputType¶

Bases: enum.Enum

Type of Models’ input.

POINTWISE: Point-wise input, like uid, iid, label.
PAIRWISE: Pair-wise input, like uid, pos_iid, neg_iid.
LISTWISE: List-wise input, like uid, [iid1, iid2, ...].
PATHWISE: KG Path-wise input, like uid, pos_iid, eid1, eid2, next_pos_iid.
USERWISE: User-wise input, like uid0, uid1, ..., uidn.

POINTWISE = 1¶

PAIRWISE = 2¶

LISTWISE = 3¶

PATHWISE = 4¶

USERWISE = 5¶

class hopwise.utils.FeatureType¶

Bases: enum.Enum

Type of features.

TOKEN: Token features like user_id and item_id.
FLOAT: Float features like rating and timestamp.
TOKEN_SEQ: Token sequence features like review.
FLOAT_SEQ: Float sequence features like pretrained vector.

TOKEN = 'token'¶

FLOAT = 'float'¶

TOKEN_SEQ = 'token_seq'¶

FLOAT_SEQ = 'float_seq'¶
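
Because the members carry string values, a type string (for example one read from a dataset description) can be mapped to a member by value lookup. The following is a minimal sketch, not hopwise code: `FeatureTypeMirror`, `parse_feature_type`, and `is_sequence_type` are hypothetical names, and the enum is a local copy of the values listed above so that the snippet runs without hopwise installed.

```python
from enum import Enum


class FeatureTypeMirror(Enum):
    """Hypothetical local stand-in that copies the documented member values."""

    TOKEN = "token"
    FLOAT = "float"
    TOKEN_SEQ = "token_seq"
    FLOAT_SEQ = "float_seq"


def parse_feature_type(raw: str) -> FeatureTypeMirror:
    """Map a type string to its enum member; raises ValueError if unknown."""
    return FeatureTypeMirror(raw)


def is_sequence_type(ftype: FeatureTypeMirror) -> bool:
    """Return True for the two sequence-valued feature types."""
    return ftype in (FeatureTypeMirror.TOKEN_SEQ, FeatureTypeMirror.FLOAT_SEQ)
```

Looking a member up by value rather than by name keeps the on-disk spelling (`token_seq`) separate from the Python identifier (`TOKEN_SEQ`).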

class hopwise.utils.FeatureSource¶

Bases: enum.Enum

Source of features.

INTERACTION: Features from .inter (other than user_id and item_id).
USER: Features from .user (other than user_id).
ITEM: Features from .item (other than item_id).
USER_ID: user_id feature in inter_feat and user_feat.
ITEM_ID: item_id feature in inter_feat and item_feat.
KG: Features from .kg.
NET: Features from .net.

INTERACTION = 'inter'¶

USER = 'user'¶

ITEM = 'item'¶

USER_ID = 'user_id'¶

ITEM_ID = 'item_id'¶

KG = 'kg'¶

NET = 'net'¶

class hopwise.utils.PathLanguageModelingTokenType(token, token_id)¶

Bases: enum.Enum

Type of tokens in paths for Path Language Modeling.

SPECIAL: Special tokens, like start and end of a path.
ENTITY: Entity tokens.
RELATION: Relation tokens.
USER: User tokens.
ITEM: Item tokens.

SPECIAL = ('S', 0)¶

ENTITY = ('E', 1)¶

RELATION = ('R', 2)¶

USER = ('U', 3)¶

ITEM = ('I', 4)¶

token¶

token_id¶

__str__()¶
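
The `(token, token_id)` signature together with the tuple-valued members matches the standard `enum` pattern in which each member's tuple is unpacked into `__init__`. The sketch below illustrates that pattern under stated assumptions: `TokenTypeSketch` is a hypothetical stand-in, and the body of `__str__` (returning the one-letter token) is a guess, since the page above does not say what `__str__` returns.

```python
from enum import Enum


class TokenTypeSketch(Enum):
    """Hypothetical mirror of the documented members and their tuple values."""

    SPECIAL = ("S", 0)
    ENTITY = ("E", 1)
    RELATION = ("R", 2)
    USER = ("U", 3)
    ITEM = ("I", 4)

    def __init__(self, token, token_id):
        # Enum unpacks each member's tuple value into these two arguments.
        self.token = token
        self.token_id = token_id

    def __str__(self):
        # Assumption: the string form is the one-letter token.
        return self.token


def label_path(token_types):
    """Join the string forms of a sequence of token types."""
    return " ".join(str(t) for t in token_types)
```

With this pattern, `TokenTypeSketch.USER.token_id` reads the integer id directly, and `TokenTypeSketch(("E", 1))` still resolves to the `ENTITY` member because the full tuple remains the member's value.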

hopwise.utils.init_logger(config)¶

A logger that shows a message on standard output and simultaneously writes it to the file named filename. Every message that you want to log MUST be a str.

Parameters: config (Config) – A Config instance, used to record parameter information.

Example

>>> import logging
>>> init_logger(config)
>>> logger = logging.getLogger()
>>> logger.debug(train_state)
>>> logger.info(train_result)
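
The behavior described above (one message going to standard output and to a file at the same time) can be reproduced with the standard `logging` module alone. This is a minimal sketch, not the hopwise implementation: `make_dual_logger`, its parameters, and the message format are assumptions chosen for illustration.

```python
import logging
import sys


def make_dual_logger(name: str, log_path: str) -> logging.Logger:
    """Build a logger that sends every record to stdout and to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # Do not pass records on to the root logger, to avoid duplicate lines.
    logger.propagate = False

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = (
        logging.StreamHandler(sys.stdout),  # console output
        logging.FileHandler(log_path),      # file output
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Calling `make_dual_logger("run", "run.log").info("epoch 0 done")` prints the line and appends it to `run.log`. Note that calling the function twice with the same name attaches the handlers twice, so it is meant to be called once per run.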

hopwise.utils.progress_bar(*args, **kwargs)¶

hopwise.utils.set_color(log, color, highlight=True, progress=False)¶

class hopwise.utils.GenerationOutputs¶

Bases: dict

Dataclass to hold the outputs of the generation process.

sequences¶

The generated sequences.

Type: torch.Tensor

scores¶

The scores for each generated token.

Type: torch.Tensor

sequences: torch.Tensor¶

scores: torch.Tensor¶

__post_init__()¶
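
A class that is both a dataclass and a `dict` subclass usually lets callers read the same fields by attribute or by key. The sketch below shows one way to get that behavior. It is an assumption about the pattern, not the hopwise source: `OutputsSketch` is a hypothetical name, plain lists stand in for `torch.Tensor` so the snippet needs no third-party packages, and the page above does not document what the real `__post_init__` does.

```python
from dataclasses import dataclass, fields


@dataclass
class OutputsSketch(dict):
    """Hypothetical dict-backed dataclass with the two documented fields."""

    sequences: list  # stand-in for torch.Tensor
    scores: list     # stand-in for torch.Tensor

    def __post_init__(self):
        # Assumption: mirror every dataclass field into the dict storage so
        # that out["sequences"] and out.sequences return the same object.
        for f in fields(self):
            self[f.name] = getattr(self, f.name)
```

For example, after `out = OutputsSketch(sequences=[[1, 2]], scores=[[0.5]])`, both `out.scores` and `out["scores"]` return the same list, and `dict(out)` yields a plain dictionary with the two keys.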

hopwise.utils.calculate_valid_score(valid_result, valid_metric=None)¶

Return the validation score from the validation result.

Parameters:

valid_result (dict) – validation result
valid_metric (str, optional) – the metric in the validation result to use as the validation score

Returns:

validation score

Return type:

float

hopwise.utils.deep_dict_update(updated_dict, updating_dict)¶

hopwise.utils.dict2str(result_dict)¶

Convert result dict to str

Parameters: result_dict (dict) – result dict
Returns: result str
Return type: str

hopwise.utils.early_stopping(value, best, cur_step, max_step, bigger=True)¶

validation-based early stopping

Parameters:

value (float) – current result
best (float) – best result
cur_step (int) – the number of consecutive steps that did not exceed the best result
max_step (int) – threshold steps for stopping
bigger (bool, optional) – whether the bigger the better

Returns:

float, best result after this step
int, the number of consecutive steps that did not exceed the best result after this step
bool, whether to stop
bool, whether to update

Return type:

tuple
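
The contract above (four inputs and a `(best, cur_step, stop, update)` tuple) can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the hopwise implementation: `early_stopping_sketch` is a hypothetical name, and two details are guesses that the page does not specify, namely that a tie with the best value counts as an improvement and that stopping triggers once `cur_step` exceeds `max_step`.

```python
def early_stopping_sketch(value, best, cur_step, max_step, bigger=True):
    """Return (best, cur_step, stop_flag, update_flag) after one validation."""
    # Assumption: a tie with the current best counts as an improvement.
    improved = value >= best if bigger else value <= best
    if improved:
        # New best: reset the patience counter and signal an update.
        return value, 0, False, True
    cur_step += 1
    # Assumption: stop once the counter exceeds (not merely reaches) max_step.
    return best, cur_step, cur_step > max_step, False


# Usage: feed successive validation scores until the stop flag is raised.
best, cur_step = float("-inf"), 0
history = []
for score in [0.10, 0.12, 0.11, 0.11, 0.10]:
    best, cur_step, stop, update = early_stopping_sketch(
        score, best, cur_step, max_step=2
    )
    history.append((best, cur_step, stop, update))
    if stop:
        break
```

In this run the best score stays at 0.12 after the second validation, and the third consecutive non-improving score pushes `cur_step` to 3, which exceeds `max_step=2` and sets the stop flag. The update flag is the natural place to hook checkpoint saving.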

hopwise.utils.ensure_dir(dir_path)¶

Make sure the directory exists; if it does not, create it.

Parameters: dir_path (str) – directory path
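
The same effect is available from the standard library. The sketch below is an equivalent written for illustration, not the hopwise source: `ensure_dir_sketch` is a hypothetical name.

```python
import os


def ensure_dir_sketch(dir_path: str) -> None:
    """Create dir_path (and any missing parents) if it does not exist yet."""
    # exist_ok=True makes the call a no-op when the directory is already there.
    os.makedirs(dir_path, exist_ok=True)
```

Because `exist_ok=True` tolerates an existing directory, the helper is safe to call before every checkpoint or log write without a prior existence check.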

hopwise.utils.get_environment(config)¶

hopwise.utils.get_flops(model, dataset, device, logger, transform, verbose=False)¶

Given a model and dataset to the model, compute the per-operator flops of the given model.

Parameters:

model – the model whose flops are counted.
dataset – the dataset that is passed to the model to count flops.
device – cuda.device. The device that the model runs on.
verbose – whether to print information about the modules.

Returns:

total_ops, the number of flops for each operation.

hopwise.utils.get_gpu_usage(device=None)¶

Return the reserved memory and total memory of given device in a string.

Parameters: device – cuda.device. The device that the model runs on.
Returns: a string containing the reserved memory and total memory of the given device.
Return type: str

hopwise.utils.get_local_time()¶

Get current time

Returns: current time
Return type: str

hopwise.utils.get_logits_processor(model_name)¶

hopwise.utils.get_model(model_name)¶

Automatically select model class based on model name

Parameters: model_name (str) – model name
Returns: model class
Return type: Recommender

hopwise.utils.get_sequence_postprocessor(postprocessor_name)¶

hopwise.utils.get_tensorboard(logger)¶

Creates a TensorBoard SummaryWriter that logs PyTorch models and metrics into a directory for visualization within the TensorBoard UI. For convenience, the SummaryWriter’s log_dir follows the same naming rule as the logger.

Parameters: logger – its output filename is used to name the SummaryWriter’s log_dir. If the filename is not available, the log_dir is named according to the current time.
Returns: a writer that writes events and summaries to the event file.
Return type: SummaryWriter

hopwise.utils.get_trainer(model_type, model_name)¶

Automatically select trainer class based on model type and model name

Parameters:

model_type (ModelType) – model type
model_name (str) – model name

Returns:

trainer class

Return type:

Trainer

hopwise.utils.init_seed(seed, reproducibility)¶

Initialize the random seed for random functions in numpy, torch, cuda and cudnn.

Parameters:

seed (int) – random seed
reproducibility (bool) – Whether to require reproducibility

hopwise.utils.list_to_latex(convert_list, bigger_flag=True, subset_columns=[])¶

class hopwise.utils.WandbLogger(config)¶

WandbLogger to log metrics to Weights and Biases.

config¶

log_wandb¶

setup()¶

log_metrics(metrics, head='train', commit=True)¶

log_eval_metrics(metrics, head='eval')¶

_set_steps()¶

_add_head_to_metrics(metrics, head)¶