genai.extensions.lm_eval.model module#

class genai.extensions.lm_eval.model.IBMGenAILMEval[source]#

Bases: LM

Implementation of LM model interface for evaluating GenAI model with the lm_eval framework.

See https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/model_guide.md for reference.
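
A minimal usage sketch (assumes the ibm-generative-ai SDK and lm-evaluation-harness are installed, that GENAI_KEY and GENAI_API are set in the environment, and that the task and model_id below are illustrative):

import lm_eval

from genai import Client, Credentials
from genai.extensions.lm_eval.model import IBMGenAILMEval

# Credentials.from_env() reads GENAI_KEY / GENAI_API from the environment.
client = Client(credentials=Credentials.from_env())
lm = IBMGenAILMEval(client=client, model_id="google/flan-t5-xl")

# Hand the adapter to the standard lm_eval entry point.
results = lm_eval.simple_evaluate(model=lm, tasks=["hellaswag"], num_fewshot=0)
print(results["results"])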

DEFAULT_GENERATION_EXECUTION_OPTIONS = CreateExecutionOptions(throw_on_error=True, ordered=True, concurrency_limit=None, callback=None)#
DEFAULT_NUM_RETRIES = 6#
DEFAULT_TOKENIZATION_EXECUTION_OPTIONS = CreateExecutionOptions(throw_on_error=True, ordered=True, concurrency_limit=5, batch_size=100, rate_limit_options=None, callback=None)#
__init__(client=None, model_id=None, parameters=None, show_progressbar=True, tokenization_execution_options=None, generation_execution_options=None)[source]#

Defines the interface that should be implemented by all LM subclasses. LMs are assumed to take text (strings) as input and yield strings as output (inputs/outputs should be tokenization-agnostic).

Parameters:
  • client (Client | None) –

  • model_id (str | None) –

  • parameters (TextGenerationParameters | None) –

  • show_progressbar (bool) –

  • tokenization_execution_options (CreateExecutionOptions | None) –

  • generation_execution_options (CreateExecutionOptions | None) –
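
A hedged construction sketch with explicit generation parameters and execution options (import paths follow recent ibm-generative-ai releases and may differ in other versions):

from genai import Client, Credentials
from genai.extensions.lm_eval.model import IBMGenAILMEval
from genai.schema import TextGenerationParameters
from genai.text.generation import CreateExecutionOptions

lm = IBMGenAILMEval(
    client=Client(credentials=Credentials.from_env()),
    model_id="google/flan-t5-xl",  # illustrative model id
    parameters=TextGenerationParameters(temperature=0, max_new_tokens=64),
    show_progressbar=False,
    generation_execution_options=CreateExecutionOptions(concurrency_limit=8),
)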
classmethod create_from_arg_string(arg_string, additional_config=None)[source]#

Allow the user to specify model parameters (TextGenerationParameters) in CLI arguments.

Parameters:
  • arg_string (str) –

  • additional_config (dict | None) –

Return type:

IBMGenAILMEval
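
A sketch of the equivalent programmatic call; the arg string mirrors lm_eval's --model_args format, and the assumption here is that keys other than model_id are parsed into TextGenerationParameters:

# temperature and max_new_tokens are assumed to map onto
# TextGenerationParameters fields of the same name.
lm = IBMGenAILMEval.create_from_arg_string(
    "model_id=google/flan-t5-xl,temperature=0,max_new_tokens=64"
)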

dump_parameters()[source]#
generate_until(requests)[source]#

From official model_guide: https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/model_guide.md:

Each request contains Instance.args : Tuple[str, dict] containing:
  1. an input string to the LM and

  2. a dictionary of keyword arguments used to control generation parameters.

Using this input and these generation parameters, text will be sampled from the language model (typically until a maximum output length or specific stopping string sequences are reached, for example {"until": ["\n\n", "."], "max_gen_toks": 128}). The generated input+output text from the model will then be returned.

Parameters:

requests (list[Instance]) –

Return type:

list[str]
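
A sketch of calling generate_until directly with a hand-built Instance (normally the harness constructs the requests; lm is the IBMGenAILMEval instance from the sketch above):

from lm_eval.api.instance import Instance

request = Instance(
    request_type="generate_until",
    doc={},
    arguments=("Q: What is 2 + 2?\nA:", {"until": ["\n"], "max_gen_toks": 16}),
    idx=0,
)
completions = lm.generate_until([request])  # list[str], one completion per request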

loglikelihood(requests)[source]#
Parameters:

requests (list[Instance]) – Each request contains Instance.args : Tuple[str, str] containing:
  1. an input string to the LM and

  2. a target string on which the loglikelihood of the LM producing this target, conditioned on the input, will be returned.

Returns:

loglikelihood: log-probability of generating the target string conditioned on the input

is_greedy: True if and only if the target string would be generated by greedy sampling from the LM

Return type:

tuple (loglikelihood, is_greedy) for each request according to the input order
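
A sketch of scoring one (context, target) pair; multiple-choice tasks compare the returned log_likelihood across the answer options:

from lm_eval.api.instance import Instance

request = Instance(
    request_type="loglikelihood",
    doc={},
    arguments=("The capital of France is", " Paris"),
    idx=0,
)
(log_likelihood, is_greedy), = lm.loglikelihood([request])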

loglikelihood_rolling(requests)[source]#

Used to evaluate perplexity on a data distribution.

Parameters:

requests (list[Instance]) – Each request contains Instance.args : tuple[str] containing an input string to the model whose entire loglikelihood, conditioned on purely the EOT token, will be calculated.

Returns:

loglikelihood: solely the log-probability of producing each piece of text given no starting input.

Return type:

tuple (loglikelihood,) for each request according to the input order
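
A sketch for a perplexity-style request; the single-element arguments tuple holds the full text to score:

from lm_eval.api.instance import Instance

request = Instance(
    request_type="loglikelihood_rolling",
    doc={},
    arguments=("A short document whose total loglikelihood we want to measure.",),
    idx=0,
)
results = lm.loglikelihood_rolling([request])  # one (loglikelihood,) tuple per request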

property model_token_limit#

class genai.extensions.lm_eval.model.LogLikelihoodResult[source]#

Bases: NamedTuple

LogLikelihoodResult(log_likelihood, is_greedy)

is_greedy: bool#

Alias for field number 1

log_likelihood: float#

Alias for field number 0

genai.extensions.lm_eval.model.initialize_model()[source]#