hopwise.utils

Submodules

Attributes

Classes

ModelType

Type of models.

KGDataLoaderState

States for Knowledge-based DataLoader.

EvaluatorType

Type for evaluation metrics.

InputType

Type of Models' input.

FeatureType

Type of features.

FeatureSource

Source of features.

PathLanguageModelingTokenType

Type of tokens in paths for Path Language Modeling.

GenerationOutputs

Dataclass to hold the outputs of the generation process.

WandbLogger

WandbLogger to log metrics to Weights and Biases.

Functions

init_logger(config)

A logger that can show a message on standard output and write it into the file named filename simultaneously.

progress_bar(*args, **kwargs)

set_color(log, color[, highlight, progress])

calculate_valid_score(valid_result[, valid_metric])

Return the validation score from the validation result.

deep_dict_update(updated_dict, updating_dict)

dict2str(result_dict)

Convert result dict to str

early_stopping(value, best, cur_step, max_step[, bigger])

Validation-based early stopping.

ensure_dir(dir_path)

Make sure the directory exists; if it does not, create it.

get_environment(config)

get_flops(model, dataset, device, logger, transform[, ...])

Given a model and a dataset for the model, compute the per-operator flops of the given model.

get_gpu_usage([device])

Return the reserved memory and total memory of the given device as a string.

get_local_time()

Get current time

get_logits_processor(model_name)

get_model(model_name)

Automatically select model class based on model name

get_sequence_postprocessor(postprocessor_name)

get_tensorboard(logger)

Creates a SummaryWriter of Tensorboard that can log PyTorch models and metrics into a directory for visualization within the TensorBoard UI.

get_trainer(model_type, model_name)

Automatically select trainer class based on model type and model name

init_seed(seed, reproducibility)

Init random seed for random functions in numpy, torch, cuda and cudnn

list_to_latex(convert_list[, bigger_flag, subset_columns])

Package Contents

hopwise.utils.general_arguments = ['gpu_id', 'use_gpu', 'seed', 'reproducibility', 'state', 'data_path', 'checkpoint_dir',...
hopwise.utils.training_arguments = ['epochs', 'train_batch_size', 'learner', 'learning_rate', 'train_neg_sample_args', 'eval_step',...
hopwise.utils.evaluation_arguments = ['eval_args', 'repeatable', 'metrics', 'topk', 'valid_metric', 'valid_metric_bigger',...
hopwise.utils.dataset_arguments = ['field_separator', 'seq_separator', 'USER_ID_FIELD', 'ITEM_ID_FIELD', 'RATING_FIELD',...
class hopwise.utils.ModelType

Bases: enum.Enum

Type of models.

  • GENERAL: General Recommendation

  • SEQUENTIAL: Sequential Recommendation

  • CONTEXT: Context-aware Recommendation

  • KNOWLEDGE: Knowledge-based Recommendation

  • PATH_LANGUAGE_MODELING: Path Language Modeling Recommendation

GENERAL = 1
SEQUENTIAL = 2
CONTEXT = 3
KNOWLEDGE = 4
TRADITIONAL = 5
DECISIONTREE = 6
PATH_LANGUAGE_MODELING = 7
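
The members can be looked up by name or by integer value through the standard enum API. The sketch below uses a local stand-in enum that mirrors the names and values listed above; it is not imported from hopwise, and reading the name from a config string is only an assumed use case.

```python
from enum import Enum


class ModelType(Enum):
    """Local stand-in mirroring the members listed above (not the hopwise class)."""

    GENERAL = 1
    SEQUENTIAL = 2
    CONTEXT = 3
    KNOWLEDGE = 4
    TRADITIONAL = 5
    DECISIONTREE = 6
    PATH_LANGUAGE_MODELING = 7


# Look up a member by name, e.g. from a string read out of a config file.
by_name = ModelType["KNOWLEDGE"]

# Look up a member by its integer value.
by_value = ModelType(7)

print(by_name, by_value.name)
```

An unknown name raises KeyError and an unknown value raises ValueError, so a typo fails at lookup time.
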
class hopwise.utils.KGDataLoaderState

Bases: enum.Enum

States for Knowledge-based DataLoader.

  • RSKG: Return both knowledge graph information and user-item interaction information.

  • RS: Only return the user-item interaction.

  • KG: Only return the triplets with negative examples in a knowledge graph.

RSKG = 1
RS = 2
KG = 3
class hopwise.utils.EvaluatorType

Bases: enum.Enum

Type for evaluation metrics.

  • RANKING: Ranking-based metrics like NDCG, Recall, etc.

  • VALUE: Value-based metrics like AUC, etc.

RANKING = 1
VALUE = 2
class hopwise.utils.InputType

Bases: enum.Enum

Type of Models’ input.

  • POINTWISE: Point-wise input, like uid, iid, label.

  • PAIRWISE: Pair-wise input, like uid, pos_iid, neg_iid.

  • LISTWISE: List-wise input, like uid, [iid1, iid2, ...].

  • PATHWISE: KG Path-wise input, like uid, pos_iid, eid1, eid2, next_pos_iid.

  • USERWISE: User-wise input, like uid0, uid1, ..., uidn.

POINTWISE = 1
PAIRWISE = 2
LISTWISE = 3
PATHWISE = 4
USERWISE = 5
class hopwise.utils.FeatureType

Bases: enum.Enum

Type of features.

  • TOKEN: Token features like user_id and item_id.

  • FLOAT: Float features like rating and timestamp.

  • TOKEN_SEQ: Token sequence features like review.

  • FLOAT_SEQ: Float sequence features like pretrained vector.

TOKEN = 'token'
FLOAT = 'float'
TOKEN_SEQ = 'token_seq'
FLOAT_SEQ = 'float_seq'
class hopwise.utils.FeatureSource

Bases: enum.Enum

Source of features.

  • INTERACTION: Features from .inter (other than user_id and item_id).

  • USER: Features from .user (other than user_id).

  • ITEM: Features from .item (other than item_id).

  • USER_ID: user_id feature in inter_feat and user_feat.

  • ITEM_ID: item_id feature in inter_feat and item_feat.

  • KG: Features from .kg.

  • NET: Features from .net.

INTERACTION = 'inter'
USER = 'user'
ITEM = 'item'
USER_ID = 'user_id'
ITEM_ID = 'item_id'
KG = 'kg'
NET = 'net'
class hopwise.utils.PathLanguageModelingTokenType(token, token_id)

Bases: enum.Enum

Type of tokens in paths for Path Language Modeling.

  • SPECIAL: Special tokens, like start and end of a path.

  • ENTITY: Entity tokens.

  • RELATION: Relation tokens.

  • USER: User tokens.

  • ITEM: Item tokens.

SPECIAL = ('S', 0)
ENTITY = ('E', 1)
RELATION = ('R', 2)
USER = ('U', 3)
ITEM = ('I', 4)
token
token_id
__str__()
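
The (token, token_id) constructor signature together with the tuple-valued members suggests the usual Enum pattern in which each member's tuple is unpacked into attributes. The sketch below shows that pattern with a stand-in class; the class name TokenType and the helper prefix_to_type are hypothetical, and the behaviour of the real __str__() is not documented here, so it is left out.

```python
from enum import Enum


class TokenType(Enum):
    """Hypothetical stand-in for PathLanguageModelingTokenType."""

    SPECIAL = ("S", 0)
    ENTITY = ("E", 1)
    RELATION = ("R", 2)
    USER = ("U", 3)
    ITEM = ("I", 4)

    def __init__(self, token, token_id):
        # Enum unpacks each member's tuple value into these arguments.
        self.token = token
        self.token_id = token_id


def prefix_to_type(prefix):
    """Return the member whose token prefix matches (hypothetical helper)."""
    for member in TokenType:
        if member.token == prefix:
            return member
    raise ValueError(f"unknown token prefix: {prefix!r}")


print(TokenType.USER.token, TokenType.USER.token_id)
```

The member's value stays the full tuple, so TokenType.ITEM.value is still ("I", 4) while the two fields are also reachable by name.
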
hopwise.utils.init_logger(config)

A logger that shows each message on standard output and simultaneously writes it to the file named filename. Every message that you log MUST be a str.

Parameters:

config (Config) – An instance object of Config, used to record parameter information.

Example

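>>> import logging
>>> init_logger(config)  # configure logging first
>>> logger = logging.getLogger()  # getLogger takes a name (str), not a Config
>>> logger.debug(train_state)
>>> logger.info(train_result)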
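
The behaviour described above, one record going to standard output and to a file at once, can be sketched with the standard logging module alone. This is a minimal illustration, not the hopwise implementation: the function name build_logger, the format string, and the explicit log_path argument are assumptions (the real function derives its settings from config).

```python
import logging
import sys


def build_logger(name, log_path, level=logging.INFO):
    """Return a logger that writes each record to stdout and to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate lines via the root logger
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    stream_handler = logging.StreamHandler(sys.stdout)
    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    for handler in (stream_handler, file_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Attaching both handlers to the same logger is what makes a single logger.info(...) call appear in both places.
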
hopwise.utils.progress_bar(*args, **kwargs)
hopwise.utils.set_color(log, color, highlight=True, progress=False)
class hopwise.utils.GenerationOutputs

Bases: dict

Dataclass to hold the outputs of the generation process.

sequences

The generated sequences.

Type:

torch.Tensor

scores

The scores for each generated token.

Type:

torch.Tensor

sequences: torch.Tensor
scores: torch.Tensor
__post_init__()
hopwise.utils.calculate_valid_score(valid_result, valid_metric=None)

Return the validation score from the validation result.

Parameters:
  • valid_result (dict) – valid result

  • valid_metric (str, optional) – the selected metric in valid result for valid score

Returns:

valid score

Return type:

float

hopwise.utils.deep_dict_update(updated_dict, updating_dict)
hopwise.utils.dict2str(result_dict)

Convert result dict to str

Parameters:

result_dict (dict) – result dict

Returns:

result str

Return type:

str
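
As a rough illustration of this kind of conversion, the sketch below joins metric/value pairs into a single line. The function name result_to_str and the "metric: value" separator format are assumptions; the exact output format of dict2str is not specified in this reference.

```python
def result_to_str(result_dict):
    """Join metric/value pairs into one line (separator format is assumed)."""
    return ", ".join(f"{metric}: {value}" for metric, value in result_dict.items())


line = result_to_str({"recall@10": 0.1234, "ndcg@10": 0.0567})
print(line)  # prints: recall@10: 0.1234, ndcg@10: 0.0567
```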

hopwise.utils.early_stopping(value, best, cur_step, max_step, bigger=True)

Validation-based early stopping.

Parameters:
  • value (float) – current result

  • best (float) – best result

  • cur_step (int) – the number of consecutive steps that did not exceed the best result

  • max_step (int) – threshold steps for stopping

  • bigger (bool, optional) – whether the bigger the better

Returns:

  • float, best result after this step

  • int, the number of consecutive steps that did not exceed the best result after this step

  • bool, whether to stop

  • bool, whether to update

Return type:

tuple
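
The contract above can be sketched as follows. This is an illustration written from the documented parameters and return values, not the library source: the name early_stopping_sketch is hypothetical, and two details are assumptions, namely that a tie counts as an improvement and that the stop flag is raised once the counter exceeds max_step.

```python
def early_stopping_sketch(value, best, cur_step, max_step, bigger=True):
    """Return (best, cur_step, stop_flag, update_flag) after one validation step.

    Assumptions: a tie counts as an improvement, and training stops once the
    counter of non-improving steps exceeds max_step.
    """
    improved = value >= best if bigger else value <= best
    if improved:
        # New best result: reset the counter and signal an update.
        return value, 0, False, True
    cur_step += 1
    return best, cur_step, cur_step > max_step, False


# Usage: feed validation scores one by one and stop when the flag is set.
best, cur_step = float("-inf"), 0
for epoch, score in enumerate([0.10, 0.12, 0.11, 0.11, 0.10]):
    best, cur_step, stop, update = early_stopping_sketch(
        score, best, cur_step, max_step=2
    )
    if stop:
        break
```

The update flag is the natural point to save a checkpoint, since it is only true when the current result becomes the new best.
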

hopwise.utils.ensure_dir(dir_path)

Make sure the directory exists; if it does not, create it.

Parameters:

dir_path (str) – directory path
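
A minimal sketch of the same behaviour with the standard library, assuming that intermediate directories should also be created; the name ensure_dir_sketch is hypothetical and the real function may differ in detail.

```python
import os


def ensure_dir_sketch(dir_path):
    """Create dir_path (and any missing parents) if it does not exist yet."""
    # exist_ok=True makes the call a no-op when the directory is already there.
    os.makedirs(dir_path, exist_ok=True)
```

Calling it repeatedly on the same path is safe, which suits setup code that runs at the start of every experiment.
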

hopwise.utils.get_environment(config)
hopwise.utils.get_flops(model, dataset, device, logger, transform, verbose=False)

Given a model and dataset to the model, compute the per-operator flops of the given model.

Parameters:
  • model – the model whose flops are counted.

  • dataset – the dataset that is passed to the model to count flops.

  • device – cuda.device. The device that the model runs on.

  • verbose – whether to print information about the modules.

Returns:

the number of flops for each operation.

Return type:

total_ops

hopwise.utils.get_gpu_usage(device=None)

Return the reserved memory and total memory of the given device as a string.

Parameters:

device – cuda.device. The device that the model runs on.

Returns:

A string reporting the reserved memory and total memory of the given device.

Return type:

str

hopwise.utils.get_local_time()

Get current time

Returns:

current time

Return type:

str
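
A timestamp string of this kind is typically produced with datetime.strftime. The sketch below is only an illustration: the name local_time_str and the format "%Y-%m-%d_%H-%M-%S" are assumptions, since the format used by get_local_time is not given in this reference.

```python
from datetime import datetime


def local_time_str(fmt="%Y-%m-%d_%H-%M-%S"):
    """Return the current local time formatted as a string (format is assumed)."""
    return datetime.now().strftime(fmt)


stamp = local_time_str()
print(stamp)
```

A format without spaces or colons keeps the string safe to embed in log and checkpoint file names.
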

hopwise.utils.get_logits_processor(model_name)
hopwise.utils.get_model(model_name)

Automatically select model class based on model name

Parameters:

model_name (str) – model name

Returns:

model class

Return type:

Recommender

hopwise.utils.get_sequence_postprocessor(postprocessor_name)
hopwise.utils.get_tensorboard(logger)

Creates a SummaryWriter of Tensorboard that can log PyTorch models and metrics into a directory for visualization within the TensorBoard UI. For the convenience of the user, the naming rule of the SummaryWriter’s log_dir is the same as the logger.

Parameters:

logger – its output filename is used to name the SummaryWriter’s log_dir. If the filename is not available, we will name the log_dir according to the current time.

Returns:

A SummaryWriter that writes events and summaries to the event file.

Return type:

SummaryWriter

hopwise.utils.get_trainer(model_type, model_name)

Automatically select trainer class based on model type and model name

Parameters:
  • model_type (ModelType) – model type

  • model_name (str) – model name

Returns:

trainer class

Return type:

Trainer

hopwise.utils.init_seed(seed, reproducibility)

Init random seed for random functions in numpy, torch, cuda and cudnn

Parameters:
  • seed (int) – random seed

  • reproducibility (bool) – Whether to require reproducibility

hopwise.utils.list_to_latex(convert_list, bigger_flag=True, subset_columns=[])
class hopwise.utils.WandbLogger(config)

WandbLogger to log metrics to Weights and Biases.

config
log_wandb
setup()
log_metrics(metrics, head='train', commit=True)
log_eval_metrics(metrics, head='eval')
_set_steps()
_add_head_to_metrics(metrics, head)