hopwise.model.ranker

Common rankers for recommender systems.

Classes

BaseSequenceScoreRanker

Base class for sequence score rankers.

RankerLP

CumulativeSequenceScoreRanker

Ranker that uses the cumulative sequence score of the final max_new_tokens predicted tokens to rank sequences.

SampleSearchSequenceScoreRanker

Ranker that uses the sequence score of sample search to rank sequences.

BeamSearchSequenceScoreRanker

Ranker that uses the sequence score of the beam search to rank sequences.

Module Contents

class hopwise.model.ranker.BaseSequenceScoreRanker(tokenizer, used_ids, item_num, topk=10)[source]

Base class for sequence score rankers.

tokenizer
used_ids
item_num
topk = 10
abstractmethod get_sequences(generation_outputs, max_new_tokens=24)[source]

This method should be implemented by subclasses to extract sequences and their scores.

parse_sequences(user_index, sequences, sequences_scores)[source]
_parse_single_sequence(scores, batch_uidx, sequence)[source]
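
The subclassing contract of this base class can be sketched with a stand-in. This is not hopwise's implementation: SketchBaseRanker and ConstantScoreRanker are hypothetical names, only the constructor arguments and the get_sequences signature come from the listing above, and the (sequences, scores) return value is an assumption.

```python
from abc import ABC, abstractmethod


class SketchBaseRanker(ABC):
    """Hypothetical stand-in mirroring the documented constructor signature."""

    def __init__(self, tokenizer, used_ids, item_num, topk=10):
        self.tokenizer = tokenizer
        self.used_ids = used_ids
        self.item_num = item_num
        self.topk = topk

    @abstractmethod
    def get_sequences(self, generation_outputs, max_new_tokens=24):
        """Subclasses extract the sequences and their scores."""


class ConstantScoreRanker(SketchBaseRanker):
    """Toy subclass that gives every sequence a score of 0.0 (illustration only)."""

    def get_sequences(self, generation_outputs, max_new_tokens=24):
        sequences = generation_outputs["sequences"]
        # Assumed return shape: the sequences plus one score per sequence.
        return sequences, [0.0] * len(sequences)


ranker = ConstantScoreRanker(tokenizer=None, used_ids={}, item_num=100)
sequences, scores = ranker.get_sequences({"sequences": [[1, 2, 3], [4, 5, 6]]})
```

Instantiating the stand-in base class directly raises TypeError, which is the usual way an abstractmethod enforces that subclasses provide get_sequences.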
class hopwise.model.ranker.RankerLP(tokenizer, kg_positives, K=10, max_new_tokens=24)[source]
tokenizer
kg_positives
topk
topk_sequences
max_new_tokens = 24
K = 10
update_topk(generate_outputs)[source]
reset_topks()[source]
class hopwise.model.ranker.CumulativeSequenceScoreRanker(tokenizer, used_ids, item_num, topk=10)[source]

Bases: BaseSequenceScoreRanker

Ranker that uses the cumulative sequence score of the final max_new_tokens predicted tokens to rank sequences.

calculate_sequence_scores(normalized_tuple, sequences, max_new_tokens=24)[source]
normalize_tuple(logits_tuple)[source]
get_sequences(generation_outputs, max_new_tokens=24)[source]

Extracts the sequences and their scores from generation_outputs, implementing the abstract method of BaseSequenceScoreRanker.
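
One plausible reading of "cumulative sequence score" is the sum of the per-step log-probabilities of the generated tokens. The sketch below assumes that reading and may differ from what the class actually computes. Plain Python lists stand in for tensors, and log_softmax and cumulative_score are hypothetical helpers, not hopwise functions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def cumulative_score(step_logits, sequence, max_new_tokens=24):
    """Sum the log-probabilities of the last generated tokens of `sequence`.

    step_logits: one list of vocabulary logits per generated step.
    sequence: token ids (prompt followed by the generated tokens).
    """
    steps = step_logits[-max_new_tokens:]
    generated = sequence[-len(steps):]  # assumed: generated tokens are the tail
    return sum(log_softmax(logits)[token] for logits, token in zip(steps, generated))


# Two generation steps over a 3-token vocabulary; the prompt is token 0.
step_logits = [[0.0, 0.0, 0.0], [0.0, math.log(2.0), 0.0]]
sequence = [0, 2, 1]
score = cumulative_score(step_logits, sequence)
```

In this toy input the first step is uniform (probability 1/3 for the chosen token) and the second step gives the chosen token probability 1/2, so the score equals log(1/6).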

class hopwise.model.ranker.SampleSearchSequenceScoreRanker(tokenizer, used_ids, item_num, topk=10)[source]

Bases: BaseSequenceScoreRanker

Ranker that uses the sequence score of sample search to rank sequences.

Use only if do_sample = True and both topk and topp are set.

get_scores(sequences, scores)[source]
get_sequences(generation_outputs, max_new_tokens=24)[source]

generation_outputs is a dataclass with 3 fields: 'sequences', 'scores' and 'past_key_values'.

sequences is a tensor of shape (num_return_sequences, sequence_length).

scores is a tuple of length |generated tokens|, where each element is a tensor holding the logits at that timestep before topk and topp are applied.
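
The layout of generation_outputs can be pictured with a stand-in dataclass. FakeGenerationOutputs and chosen_token_logits are hypothetical, and plain lists replace tensors. Two points are assumptions drawn from the description rather than confirmed behaviour: each scores entry holds one row of vocabulary logits per returned sequence, and scores lines up with the trailing tokens of each sequence.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class FakeGenerationOutputs:
    """Hypothetical stand-in with the three documented fields."""

    sequences: List[List[int]]  # (num_return_sequences, sequence_length)
    scores: Tuple[List[List[float]], ...]  # one entry per generated token
    past_key_values: Any = None


def chosen_token_logits(outputs):
    """For each sequence, collect the logit of the token chosen at each step."""
    num_steps = len(outputs.scores)
    result = []
    for row, sequence in enumerate(outputs.sequences):
        generated = sequence[-num_steps:]  # assumed: generated tokens are the tail
        result.append([outputs.scores[t][row][tok] for t, tok in enumerate(generated)])
    return result


outputs = FakeGenerationOutputs(
    sequences=[[0, 1, 0], [0, 2, 2]],
    scores=(
        [[0.1, 0.7, 0.2], [0.3, 0.3, 0.4]],  # step 0 logits, one row per sequence
        [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]],  # step 1 logits
    ),
)
picked = chosen_token_logits(outputs)
```

Here picked holds, for every sequence, the raw logit of each generated token, which is the kind of per-step quantity a sequence score could be built from.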

class hopwise.model.ranker.BeamSearchSequenceScoreRanker(tokenizer, used_ids, item_num, topk=10)[source]

Bases: BaseSequenceScoreRanker

Ranker that uses the sequence score of the beam search to rank sequences.

get_sequences(generation_outputs, max_new_tokens=24)[source]

Extracts the sequences and their scores from generation_outputs, implementing the abstract method of BaseSequenceScoreRanker.
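
With beam search, each returned sequence typically carries a single score, so ranking reduces to sorting. A minimal sketch, assuming the generation output exposes sequences and sequences_scores (as Hugging Face generate does for beam search with return_dict_in_generate=True and output_scores=True). rank_by_score is a hypothetical helper, and plain lists stand in for tensors.

```python
def rank_by_score(sequences, sequences_scores, topk=10):
    """Return the topk (score, sequence) pairs, highest score first."""
    paired = sorted(zip(sequences_scores, sequences), key=lambda p: p[0], reverse=True)
    return paired[:topk]


# Three toy beams with their (log-probability style) scores.
sequences = [[0, 5, 7], [0, 5, 8], [0, 6, 7]]
sequences_scores = [-1.2, -0.4, -2.5]
top2 = rank_by_score(sequences, sequences_scores, topk=2)
```

Filtering out items a user has already interacted with, which the used_ids argument suggests, is left out of this sketch.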