genai.extensions.lm_eval.model module#

class genai.extensions.lm_eval.model.IBMGenAILMEval[source]#

Bases: LM

Implementation of LM model interface for evaluating GenAI model with the lm_eval framework.

See https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/model_guide.md for reference.
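
A minimal usage sketch (assumes the ibm-generative-ai SDK and lm-evaluation-harness are installed, that GENAI_KEY and GENAI_API are set in the environment, and that the task and model_id below are illustrative):

import lm_eval

from genai import Client, Credentials
from genai.extensions.lm_eval.model import IBMGenAILMEval

# Credentials.from_env() reads GENAI_KEY / GENAI_API from the environment.
client = Client(credentials=Credentials.from_env())
lm = IBMGenAILMEval(client=client, model_id="google/flan-t5-xl")

# Hand the adapter to the standard lm_eval entry point.
results = lm_eval.simple_evaluate(model=lm, tasks=["hellaswag"], num_fewshot=0)
print(results["results"])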

DEFAULT_GENERATION_EXECUTION_OPTIONS = CreateExecutionOptions(throw_on_error=True, ordered=True, concurrency_limit=None, callback=None)#
DEFAULT_NUM_RETRIES = 6#
DEFAULT_TOKENIZATION_EXECUTION_OPTIONS = CreateExecutionOptions(throw_on_error=True, ordered=True, concurrency_limit=5, batch_size=100, rate_limit_options=None, callback=None)#
__init__(client=None, model_id=None, parameters=None, show_progressbar=True, tokenization_execution_options=None, generation_execution_options=None)[source]#

Defines the interface that should be implemented by all LM subclasses. LMs are assumed to take text (strings) as input and yield strings as output (inputs/outputs should be tokenization-agnostic).

Parameters:
  • client (Client | None) –

  • model_id (str | None) –

  • parameters (TextGenerationParameters | None) –

  • show_progressbar (bool) –

  • tokenization_execution_options (CreateExecutionOptions | None) –

  • generation_execution_options (CreateExecutionOptions | None) –
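
A hedged construction sketch with explicit generation parameters and execution options (import paths follow recent ibm-generative-ai releases and may differ in other versions):

from genai import Client, Credentials
from genai.extensions.lm_eval.model import IBMGenAILMEval
from genai.schema import TextGenerationParameters
from genai.text.generation import CreateExecutionOptions

lm = IBMGenAILMEval(
    client=Client(credentials=Credentials.from_env()),
    model_id="google/flan-t5-xl",  # illustrative model id
    parameters=TextGenerationParameters(temperature=0, max_new_tokens=64),
    show_progressbar=False,
    generation_execution_options=CreateExecutionOptions(concurrency_limit=8),
)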
classmethod create_from_arg_string(arg_string, additional_config=None)[source]#

Allow the user to specify model parameters (TextGenerationParameters) in CLI arguments.

Parameters:
  • arg_string (str) –

  • additional_config (dict | None) –

Return type:

IBMGenAILMEval
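
A sketch of the equivalent programmatic call; the arg string mirrors lm_eval's --model_args format, and the assumption here is that keys other than model_id are parsed into TextGenerationParameters:

# temperature and max_new_tokens are assumed to map onto
# TextGenerationParameters fields of the same name.
lm = IBMGenAILMEval.create_from_arg_string(
    "model_id=google/flan-t5-xl,temperature=0,max_new_tokens=64"
)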

dump_parameters()[source]#
generate_until(requests)[source]#

From official model_guide: https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/model_guide.md:

Each request contains Instance.args : Tuple[str, dict] containing:
  1. an input string to the LM and

  2. a dictionary of keyword arguments used to control generation parameters.

Using this input and these generation parameters, text will be sampled from the language model (typically until a maximum output length or specific stopping string sequences are reached, for example {"until": ["\n\n", "."], "max_gen_toks": 128}). The generated input+output text from the model will then be returned.

Parameters:

requests (list[Instance]) –

Return type:

list[str]
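
A sketch of calling generate_until directly with a hand-built Instance (normally the harness constructs the requests; lm is the IBMGenAILMEval instance from the sketch above):

from lm_eval.api.instance import Instance

request = Instance(
    request_type="generate_until",
    doc={},
    arguments=("Q: What is 2 + 2?\nA:", {"until": ["\n"], "max_gen_toks": 16}),
    idx=0,
)
completions = lm.generate_until([request])  # list[str], one completion per request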

loglikelihood(requests)[source]#
Parameters:

requests (list[Instance]) – Each request contains Instance.args : Tuple[str, str] containing:
  1. an input string to the LM and

  2. a target string on which the loglikelihood of the LM producing this target, conditioned on the input, will be returned.

Returns:

loglikelihood: log-probability of generating the target string conditioned on the input

is_greedy: True if and only if the target string would be generated by greedy sampling from the LM

Return type:

tuple (loglikelihood, is_greedy) for each request according to the input order
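
A sketch of scoring one (context, target) pair; multiple-choice tasks compare the returned log_likelihood across the answer options:

from lm_eval.api.instance import Instance

request = Instance(
    request_type="loglikelihood",
    doc={},
    arguments=("The capital of France is", " Paris"),
    idx=0,
)
(log_likelihood, is_greedy), = lm.loglikelihood([request])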

loglikelihood_rolling(requests)[source]#

Used to evaluate perplexity on a data distribution.

Parameters:

requests (list[Instance]) – Each request contains Instance.args : tuple[str] containing an input string to the model whose entire loglikelihood, conditioned on purely the EOT token, will be calculated.

Returns:

loglikelihood: solely the log-probability of producing each piece of text given no starting input.

Return type:

tuple (loglikelihood,) for each request according to the input order
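
A sketch for a perplexity-style request; the single-element arguments tuple holds the full text to score:

from lm_eval.api.instance import Instance

request = Instance(
    request_type="loglikelihood_rolling",
    doc={},
    arguments=("A short document whose total loglikelihood we want to measure.",),
    idx=0,
)
results = lm.loglikelihood_rolling([request])  # one (loglikelihood,) tuple per request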

property model_token_limit#

class genai.extensions.lm_eval.model.LogLikelihoodResult[source]#

Bases: NamedTuple

LogLikelihoodResult(log_likelihood, is_greedy)

is_greedy: bool#

Alias for field number 1

log_likelihood: float#

Alias for field number 0

genai.extensions.lm_eval.model.initialize_model()[source]#