hopwise.model.logits_processor¶
Common logits processors used in recommender systems.
Classes¶

| LogitsProcessor | Abstract base class for all logit processors that can be applied during generation. |
| LogitsProcessorList | This class can be used to create a list of [LogitsProcessor] to subsequently process a scores input tensor. |
| ConstrainedLogitsProcessorWordLevel | Force the last token to be one of the force_tokens once the total length is reached; in the path generation stage this limits the hop size. |
| PrefixConstrainedLogitsProcessorWordLevel | Force the last token to be one of the force_tokens once the total length is reached; in the path generation stage this limits the hop size. |
| PLMLogitsProcessorWordLevel | Constrained decoding strategy for PLMs that forces the model to alternately generate entities and relations. |
Module Contents¶
- class hopwise.model.logits_processor.LogitsProcessor[source]¶
Abstract base class for all logit processors that can be applied during generation. Copy of HuggingFace’s LogitsProcessor.
- class hopwise.model.logits_processor.LogitsProcessorList[source]¶
Bases:
list
This class can be used to create a list of [LogitsProcessor] to subsequently process a scores input tensor. This class inherits from list and adds a specific __call__ method to apply each [LogitsProcessor] to the inputs. Copy of HuggingFace’s LogitsProcessorList.
- __call__(input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) torch.FloatTensor [source]¶
- Parameters:
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) – Indices of input sequence tokens in the vocabulary. [What are input IDs?](../glossary#input-ids)
scores (torch.FloatTensor of shape (batch_size, config.vocab_size)) – Prediction scores of a language modeling head. These can be logits for each vocabulary token when not using beam search, or the log softmax for each vocabulary token when using beam search.
kwargs (Dict[str, Any], optional) – Additional kwargs that are specific to a logits processor.
- Returns:
The processed prediction scores.
- Return type:
torch.FloatTensor of shape (batch_size, config.vocab_size)
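The chaining behavior of `__call__` can be sketched as below. This is a minimal stand-in for illustration only: plain Python lists replace the `torch.LongTensor`/`torch.FloatTensor` arguments, and the two processor functions are hypothetical examples, not part of hopwise.

```python
class SimpleLogitsProcessorList(list):
    """Minimal sketch of LogitsProcessorList: applies each
    processor in order to the scores and returns the result."""

    def __call__(self, input_ids, scores, **kwargs):
        for processor in self:
            scores = processor(input_ids, scores, **kwargs)
        return scores


def temperature(input_ids, scores, **kwargs):
    # Hypothetical processor: divide every score by a temperature of 2.0.
    return [s / 2.0 for s in scores]


def ban_token_zero(input_ids, scores, **kwargs):
    # Hypothetical processor: mask out token id 0 with -inf.
    return [float("-inf") if i == 0 else s for i, s in enumerate(scores)]


processors = SimpleLogitsProcessorList([temperature, ban_token_zero])
out = processors([1, 2, 3], [4.0, 2.0, 6.0])
print(out)  # [-inf, 1.0, 3.0]
```

Each processor receives the scores returned by the previous one, which is exactly how the list composes constraints during generation.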
- class hopwise.model.logits_processor.ConstrainedLogitsProcessorWordLevel(tokenized_ckg, tokenized_used_ids, max_sequence_length, tokenizer, mask_cache_size=3 * 10**4, pos_candidates_cache_size=1 * 10**5, task=KnowledgeEvaluationType.REC, **kwargs)[source]¶
Bases:
LogitsProcessor
Force the last token to be one of the force_tokens once the total length is reached; in the path generation stage this limits the hop size. This is a word-level constraint and does not work with piece tokenizers. If the task is link prediction (LP), the logits processor forces the last token to be one of the reachable tokens.
- tokenized_ckg¶
- tokenized_used_ids¶
- max_sequence_length¶
- tokenizer¶
- bos_token_id¶
- pos_candidates_cache¶
- mask_cache¶
- task¶
- is_bos_token_in_input(input_ids)[source]¶
Check if the input contains a BOS token. Checking the first sequence is enough.
- process_scores_rec(input_ids, idx)[source]¶
Process each score based on the input length and update the mask list.
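The core masking step behind this processor can be sketched as follows. The graph, token ids, and helper name are assumptions for illustration, and plain floats stand in for the torch tensors used by the real class:

```python
def constrain_to_neighbors(graph, last_token, scores):
    """Set the score of every token that is not reachable from
    last_token in the tokenized graph to -inf, so that argmax or
    sampling can only pick a valid next hop."""
    allowed = graph.get(last_token, set())
    return [s if tok in allowed else float("-inf")
            for tok, s in enumerate(scores)]


# Hypothetical tokenized graph: token 1 may be followed by tokens 0 or 2.
graph = {1: {0, 2}}
scores = [0.5, 1.5, 0.7, 2.0]
masked = constrain_to_neighbors(graph, last_token=1, scores=scores)
print(masked)  # [0.5, -inf, 0.7, -inf]
```

Because disallowed tokens receive -inf, they get zero probability after the softmax, which is what restricts generation to valid paths in the knowledge graph.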
- class hopwise.model.logits_processor.PrefixConstrainedLogitsProcessorWordLevel(tokenized_ckg, tokenized_used_ids, max_sequence_length, tokenizer, **kwargs)[source]¶
Bases:
ConstrainedLogitsProcessorWordLevel
Force the last token to be one of the force_tokens once the total length is reached; in the path generation stage this limits the hop size. This is a word-level constraint and does not work with piece tokenizers. If the task is link prediction (LP), the logits processor forces the last token to be one of the reachable tokens.
- mask_cache = None¶
- class hopwise.model.logits_processor.PLMLogitsProcessorWordLevel(tokenized_ckg, tokenized_used_ids, max_sequence_length, tokenizer, pos_candidates_cache_size=1 * 10**5, task=KnowledgeEvaluationType.REC, **kwargs)[source]¶
Bases:
LogitsProcessor
Constrained decoding strategy for PLMs (https://dl.acm.org/doi/pdf/10.1145/3485447.3511937); it forces the model to alternately generate entities and relations.
- tokenized_ckg¶
- tokenized_used_ids¶
- max_sequence_length¶
- tokenizer¶
- bos_token_id¶
- pos_candidates_cache¶
- task¶
- entity_token_ids¶
- relation_token_ids¶
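The alternation constraint can be sketched as: at even generation steps only entity tokens are allowed, at odd steps only relation tokens. The token-id sets and function name below are assumptions for illustration; the real processor works on torch tensors and uses the tokenizer's actual entity/relation vocabularies.

```python
def alternate_mask(position, scores, entity_ids, relation_ids):
    """Keep only entity-token scores at even steps and only
    relation-token scores at odd steps; mask the rest with -inf."""
    allowed = entity_ids if position % 2 == 0 else relation_ids
    return [s if tok in allowed else float("-inf")
            for tok, s in enumerate(scores)]


entity_ids = {0, 1}    # hypothetical entity token ids
relation_ids = {2, 3}  # hypothetical relation token ids
scores = [1.0, 2.0, 3.0, 4.0]
print(alternate_mask(0, scores, entity_ids, relation_ids))  # [1.0, 2.0, -inf, -inf]
print(alternate_mask(1, scores, entity_ids, relation_ids))  # [-inf, -inf, 3.0, 4.0]
```

Applying this mask at every step guarantees the generated sequence follows the entity-relation-entity structure of a knowledge-graph path.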