API reference

This is an automatically generated API reference of the ICX360 toolkit.

icx360

Modules:

  • algorithms –

    Module containing submodules for MExGen, CELL, and Token Highlighter explainers

  • metrics –

    Module containing metrics for explanations

  • utils –

    Module containing various utilities including model wrappers, infillers, scalarizers, and segmenters, among others

algorithms

Module containing submodules for MExGen, CELL, and Token Highlighter explainers

Modules:

  • cell –

    Module containing CELL and mCELL submodules

  • lbbe –

    File containing base class for local black box explainers

  • lwbe –

    File containing base class for local white box explainers

  • mexgen –

    Module containing submodules for MExGen C-LIME and MExGen L-SHAP explainers

  • token_highlighter –

    Module containing TokenHighlighter submodule (thllm)

cell

Module containing CELL and mCELL submodules

Modules:

  • CELL –

    File containing class CELL

  • mCELL –

    File containing class mCELL

CELL

File containing class CELL

CELL is an explainer class that contains function explain_instance that provides contrastive explanations of input instances. The algorithm for providing explanations is described as CELL in: CELL your Model: Contrastive Explanations for Large Language Models, Ronny Luss, Erik Miehling, Amit Dhurandhar. https://arxiv.org/abs/2406.11785

Classes:

  • CELL –

    Instances of CELL contain information about the LLM model being explained.

CELL
CELL(model, infiller='bart', num_return_sequences=1, scalarizer='shp', scalarizer_model_path=None, scalarizer_type='distance', generation=True, experiment_id='id', device=None)

Bases: LocalBBExplainer

Instances of CELL contain information about the LLM model being explained. These instances are used to explain LLM responses on input text using a budgeted algorithm with intelligent search strategy.

Attributes:

  • _model –

    model that we want to explain (based on icx360/utils/model_wrappers)

  • _infiller –

string naming the function used to take text containing a mask token and output text with the mask replaced

  • _num_return_sequences –

    integer number of sequences returned when doing generation for mask infilling

  • _scalarizer_name –

string of scalarizer to use to determine if a contrast is found (must be from ['shp', 'nli', 'bleu'])

  • _scalarizer_type –

    string specifying either 'distance' for explaining LLM generation using distances or 'classifier' for explaining a classifier

  • _scalarizer_func –

    function used to do scalarization from icx360/utils/scalarizers

  • _generation –

    boolean specifying whether the model being explained performs true generation (as opposed to having output==input for classification)

  • _device –

string detailing the device on which to perform all operations (must be from ['cpu', 'cuda', 'mps']); should be the same as the model being explained

Initialize contrastive explainer.

Parameters:

  • model –

    model that we want to explain (based on icx360/utils/model_wrappers)

  • infiller (str, default: 'bart' ) –

selects the function used to take text containing a mask token and output text with the mask replaced

  • num_return_sequences (int, default: 1 ) –

    number of sequences returned when doing generation for mask infilling

  • scalarizer (str, default: 'shp' ) –

    select which scalarizer to use to determine if a contrast is found (must be from ['shp', 'nli', 'bleu'])

  • scalarizer_model_path (str, default: None ) –

    allow user to pass a model path for scalarizers (e.g., choose 'stanfordnlp/SteamSHP-flan-t5-xl' instead of default 'stanfordnlp/SteamSHP-flan-t5-large')

  • scalarizer_type (str, default: 'distance' ) –

    'distance' for explaining LLM generation using distances, 'classifier' for explaining a classifier

  • generation (bool, default: True ) –

    the model being explained performs true generation (as opposed to having output==input)

  • experiment_id (str, default: 'id' ) –

    passed to evaluate.load for certain scalarizers. This is used if several distributed evaluations share the same file system.

  • device (str, default: None ) –

device on which to perform all operations (must be from ['cpu', 'cuda', 'mps']); should be the same as the model being explained

Methods:

  • explain_instance –

    Provide explanations of LLM applied to prompt input_text.

  • sample –

    Generate sample prompts based on an input prompt

  • set_params –

    Set parameters for the explainer.

  • splitTextByK –

    Split text into words.

explain_instance
explain_instance(input_text, epsilon_contrastive=0.5, split_k=1, budget=100, radius=5, alpha=0.5, info=True, ir=False, input_text_list=[''], prompt_format='Context: $$input0$$ \n\nQuestion: $$input1$$ \n\nAnswer: ', multiple_inputs=False, input_inds_modify=[0], model_params={})

Provide explanations of LLM applied to prompt input_text.

Provide a contrastive explanation by changing prompt input_text such that the new prompt generates a response that is preferred much less, by a specified amount, as a response to input_text. The preference metric can be changed based on user needs.

Parameters:

  • input_text (str) –

    input prompt to model that we want to explain

  • epsilon_contrastive (float, default: 0.5 ) –

    amount of change in response to deem a contrastive explanation

  • split_k (int, default: 1 ) –

number of consecutive words grouped into each unit that is masked together

  • budget (int, default: 100 ) –

    maximum number of queries allowed from infilling model

  • radius (int, default: 5 ) –

    radius for sampling near a previously modified token

  • alpha (float, default: 0.5 ) –

tradeoff between exploration and exploitation: lower alpha means more exploration, higher alpha means more exploitation

  • info (bool, default: True ) –

    True to print output information, False otherwise

  • ir (bool, default: False ) –

    True to perform input reduction, i.e., remove tokens that cause minimal change to the response until a large change occurs

  • input_text_list (str list, default: [''] ) –

    if multiple_inputs==True, then use input_text_list to feed additional text segments

  • prompt_format (str, default: 'Context: $$input0$$ \n\nQuestion: $$input1$$ \n\nAnswer: ' ) –

    format for prompt to create from input_text and input_text_list. Default is question/answering for google/flan-t5-large

  • multiple_inputs (bool, default: False ) –

    True if example requires multiple inputs and a format, i.e., uses input_text and input_text_list, False if just input_text for prompt

  • input_inds_modify (int list, default: [0] ) –

    list of which input_text segments to modify for contrastive example when multiple_inputs==True

  • model_params (dict, default: {} ) –

    additional keyword arguments for model generation (self._model.generate())

Returns:

  • result ( dict ) –

    contains various pieces of contrastive explanation including contrastive prompt, response to the contrastive prompt, response to the input prompt, and which words were modified
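A minimal usage sketch follows (the import path and the pre-wrapped model are assumptions based on the module layout documented above):

    # Hedged sketch: import path assumed from the module layout above.
    from icx360.algorithms.cell.CELL import CELL

    # `wrapped_model` is assumed to be an LLM wrapped per icx360/utils/model_wrappers.
    explainer = CELL(wrapped_model, infiller="bart", scalarizer="shp", device="cuda")

    result = explainer.explain_instance(
        "Why is the sky blue?",
        epsilon_contrastive=0.5,  # change in preference that counts as a contrast
        budget=50,                # at most 50 infilling-model queries
        radius=5,                 # sample near previously modified tokens
    )
    print(result)  # contrastive prompt, both responses, and the words modified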

sample
sample(input_sample, curr_position, radius, num_samples, model_params={})

Generate sample prompts based on an input prompt

Parameters:

  • input_sample (dict) –

    contains information about a prompt including text and how it differs from the input prompt to the explainer

  • curr_position (int) –

position of the token around which to generate samples, within the given radius

  • radius (int) –

    radius for sampling near a previously modified token

  • num_samples (int) –

    number of samples to generate

  • model_params (dict, default: {} ) –

    additional keyword arguments for model generation (self._model.generate())

Returns:

  • samples_list ( dict list ) –

    list of samples which are dictionaries with same information as input_sample

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

splitTextByK
splitTextByK(str, k)

Split text into words.

Parameters:

  • str (str) –

    string to be split

  • k (int) –

    number of consecutive words to keep together

Returns:

  • grouped_words ( str list ) –

list of word groups which, when concatenated, reproduce the input str
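For example, continuing the sketch above (the exact grouping is a hedged illustration of the described behavior):

    explainer.splitTextByK("the quick brown fox jumps", 2)
    # expected grouping: ["the quick", "brown fox", "jumps"]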

mCELL

File containing class mCELL

mCELL is an explainer class that contains function explain_instance that provides contrastive explanations of input instances. The algorithm for providing explanations is described as m-Cell in: CELL your Model: Contrastive Explanations for Large Language Models, Ronny Luss, Erik Miehling, Amit Dhurandhar. https://arxiv.org/abs/2406.11785

Classes:

  • mCELL –

    mCELL Explainer object.

mCELL
mCELL(model, infiller='bart', num_return_sequences=1, scalarizer='shp', scalarizer_model_path=None, scalarizer_type='distance', generation=True, experiment_id='id', device=None)

Bases: LocalBBExplainer

mCELL Explainer object.

Instances of mCELL contain information about the LLM model being explained. These instances are used to explain LLM responses on input text using a myopic algorithm.

Attributes:

  • _model –

    model that we want to explain (based on icx360/utils/model_wrappers)

  • _infiller –

string naming the function used to take text containing a mask token and output text with the mask replaced

  • _num_return_sequences –

    integer number of sequences returned when doing generation for mask infilling

  • _scalarizer_name –

string of scalarizer to use to determine if a contrast is found (must be from ['shp', 'nli', 'bleu'])

  • _scalarizer_type –

    string specifying either 'distance' for explaining LLM generation using distances or 'classifier' for explaining a classifier

  • _scalarizer_func –

    function used to do scalarization from icx360/utils/scalarizers

  • _generation –

    boolean specifying whether the model being explained performs true generation (as opposed to having output==input for classification)

  • _device –

string detailing the device on which to perform all operations (must be from ['cpu', 'cuda', 'mps']); should be the same as the model being explained

Initialize contrastive explainer.

Parameters:

  • model –

    model that we want to explain (based on icx360/utils/model_wrappers)

  • infiller (str, default: 'bart' ) –

selects the function used to take text containing a mask token and output text with the mask replaced

  • num_return_sequences (int, default: 1 ) –

    number of sequences returned when doing generation for mask infilling

  • scalarizer (str, default: 'shp' ) –

    select which scalarizer to use to determine if a contrast is found (must be from ['shp', 'nli', 'bleu', 'implicit_hate', 'stigma'])

  • scalarizer_model_path (str, default: None ) –

allow user to pass a model path for scalarizers (e.g., choose 'stanfordnlp/SteamSHP-flan-t5-xl' instead of default 'stanfordnlp/SteamSHP-flan-t5-large')

  • scalarizer_type (str, default: 'distance' ) –

    'distance' for explaining LLM generation using distances, 'classifier' for explaining a classifier

  • generation (bool, default: True ) –

    the model being explained performs true generation (as opposed to having output==input)

  • experiment_id (str, default: 'id' ) –

    passed to evaluate.load for certain scalarizers. This is used if several distributed evaluations share the same file system.

  • device (str, default: None ) –

device on which to perform all operations (must be from ['cpu', 'cuda', 'mps']); should be the same as the model being explained

Methods:

  • explain_instance –

Provide explanations of a large language model applied to prompt input_text

  • set_params –

    Set parameters for the explainer.

  • splitTextByK –

    Split text into words.

explain_instance
explain_instance(input_text, epsilon_contrastive=0.5, epsilon_iter=0.001, split_k=1, no_change_max_iters=3, info=True, ir=False, model_params={})

Provide explanations of a large language model applied to prompt input_text

Provide a contrastive explanation by changing prompt input_text such that the new prompt generates a response that is preferred much less, by a specified amount, as a response to input_text. The preference metric can be changed based on user needs.

Parameters:

  • input_text (str) –

    input prompt to model that we want to explain

  • epsilon_contrastive (float, default: 0.5 ) –

    amount of change in response to deem a contrastive explanation

  • epsilon_iter (float, default: 0.001 ) –

    minimum amount of change between iterations to continue search

  • split_k (int, default: 1 ) –

number of consecutive words grouped into each unit that is masked together

  • info (bool, default: True ) –

    True to print output information, False otherwise

  • ir (bool, default: False ) –

    True to perform input reduction, i.e., remove tokens that cause minimal change to the response until a large change occurs

Returns:

  • result ( dict ) –

    contains various pieces of contrastive explanation including contrastive prompt, response to the contrastive prompt, response to the input prompt, and which words were modified

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

splitTextByK
splitTextByK(str, k)

Split text into words.

Parameters:

  • str (str) –

    string to be split

  • k (int) –

    number of consecutive words to keep together

Returns:

  • grouped_words ( str list ) –

list of word groups which, when concatenated, reproduce the input str

lbbe

File containing base class for local black box explainers

Attributes:

  • ABC –

    Ensure compatibility of Abstract Base Class with Python versions

Classes:

  • LocalBBExplainer –

    LocalBBExplainer is the base class for local post-hoc black-box explainers (LBBE).

ABC module-attribute
ABC = ABC
LocalBBExplainer
LocalBBExplainer(*argv, **kwargs)

Bases: ABC

LocalBBExplainer is the base class for local post-hoc black-box explainers (LBBE). Such explainers are model agnostic and generally require access to the model's predict function alone. Examples include LIME [#1] and SHAP [#2].

References

.. [#1] “Why Should I Trust You?” Explaining the Predictions of Any Classifier. Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin. ACM SIGKDD 2016. https://arxiv.org/abs/1602.04938
.. [#2] A Unified Approach to Interpreting Model Predictions. Scott M. Lundberg and Su-In Lee. NIPS 2017. https://arxiv.org/abs/1705.07874

Initialize a LocalBBExplainer object.

Methods:

explain_instance abstractmethod
explain_instance(*argv, **kwargs)

Explain an input instance x.

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

lwbe

File containing base class for local white box explainers

Attributes:

  • ABC –

    Ensure compatibility of Abstract Base Class with Python versions

Classes:

  • LocalWBExplainer –

    LocalWBExplainer is the base class for local post-hoc white box explainers (LWBE).

ABC module-attribute
ABC = ABC
LocalWBExplainer
LocalWBExplainer(*argv, **kwargs)

Bases: ABC

LocalWBExplainer is the base class for local post-hoc white box explainers (LWBE). Such explainers generally require access to the model's internals beyond its predict function. Examples include the contrastive explanation method [#1] and layer-wise relevance propagation [#2].

References

.. [#1] Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives. Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan Shanmugam, Payel Das. NIPS 2018. https://arxiv.org/abs/1802.07623
.. [#2] http://www.heatmapping.org/

Initialize a LocalWBExplainer object.

Methods:

explain_instance abstractmethod
explain_instance(*argv, **kwargs)

Explain an input instance x.

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

mexgen

Module containing submodules for MExGen C-LIME and MExGen L-SHAP explainers

Modules:

  • clime –

    Class and supporting functions for MExGen C-LIME explainer.

  • lshap –

    Class and supporting functions for MExGen L-SHAP explainer.

clime

Class and supporting functions for MExGen C-LIME explainer.

The MExGen framework and C-LIME algorithm are described in

Multi-Level Explanations for Generative Language Models. Lucas Monteiro Paes and Dennis Wei et al. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025). https://arxiv.org/abs/2403.14459

Classes:

  • CLIME –

    MExGen C-LIME explainer

Functions:

  • compute_linear_model_features –

    Compute features used by explanatory linear model.

  • fit_linear_model –

    Fit explanatory linear model.

CLIME
CLIME(model, segmenter='en_core_web_trf', scalarizer='prob', **kwargs)

Bases: LocalBBExplainer

MExGen C-LIME explainer

Attributes:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • segmenter (SpaCySegmenter) –

    Object for segmenting input text into units using a spaCy model.

  • scalarized_model (Scalarizer) –

    "Scalarized model" that further wraps model with a method for computing scalar values based on the model's inputs or outputs.

Initialize MExGen C-LIME explainer.

Parameters:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • segmenter (str, default: 'en_core_web_trf' ) –

    Name of spaCy model to use in segmenter (icx360.utils.segmenters.SpaCySegmenter).

  • scalarizer (str, default: 'prob' ) –

    Type of scalarizer to use. "prob": probability of generating original output conditioned on perturbed inputs (instantiates an icx360.utils.scalarizers.ProbScalarizedModel). "text": similarity scores between original output and perturbed outputs (instantiates an icx360.utils.scalarizers.TextScalarizedModel).

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for initializing scalarizer.

Raises:

  • ValueError –

    If scalarizer is not "prob" or "text".

Methods:

  • explain_instance –

    Explain model output by attributing it to parts of the input text.

  • set_params –

    Set parameters for the explainer.

model instance-attribute
model = model
scalarized_model instance-attribute
scalarized_model = ProbScalarizedModel(model)
segmenter instance-attribute
segmenter = SpaCySegmenter(segmenter)
explain_instance
explain_instance(input_orig, unit_types='p', ind_segment=True, segment_type='s', max_phrase_length=10, model_params={}, scalarize_params={}, oversampling_factor=10, max_units_replace=2, empty_subset=True, replacement_str='', num_nonzeros=None, debias=True)

Explain model output by attributing it to parts of the input text.

Uses an algorithm called C-LIME (a variant of LIME) to fit a local linear approximation to the model and compute attribution scores.

Parameters:

  • input_orig (str or List[str]) –

    [input] Input text as a single unit (if str) or segmented sequence of units (List[str]).

  • unit_types (str or List[str], default: 'p' ) –

    [input] Types of units in input_orig. "p" for paragraph, "s" for sentence, "w" for word, "n" for not to be perturbed/attributed to. If str, applies to all units in input_orig, otherwise unit-specific.

  • ind_segment (bool or List[bool], default: True ) –

    [segmentation] Whether to segment input text. If bool, applies to all units; if List[bool], applies to each unit individually.

  • segment_type (str, default: 's' ) –

    [segmentation] Type of units to segment into: "s" for sentences, "w" for words, "ph" for phrases.

  • max_phrase_length (int, default: 10 ) –

    [segmentation] Maximum phrase length in terms of spaCy tokens (default 10).

  • model_params (dict, default: {} ) –

    Additional keyword arguments for model generation (for the self.model.generate() method).

  • scalarize_params (dict, default: {} ) –

    Additional keyword arguments for computing scalar outputs (for the self.scalarized_model.scalarize_output() method).

  • oversampling_factor (float, default: 10 ) –

    [perturbation] Ratio of number of perturbed inputs to be generated to number of units that can be perturbed.

  • max_units_replace (int, default: 2 ) –

    [perturbation] Maximum number of units to perturb at one time (default 2).

  • empty_subset (bool, default: True ) –

    [perturbation] Whether to include empty subset of units to perturb (default True).

  • replacement_str (str, default: '' ) –

    [perturbation] String to replace units with (default "" for dropping units).

  • num_nonzeros (int or None, default: None ) –

    [linear model] Number of non-zero coefficients in linear model (default None means dense model).

  • debias (bool, default: True ) –

    [linear model] Refit linear model with no penalty after selecting features (default True).

Returns:

  • output_dict ( dict ) –

    Dictionary with the following items: "attributions" (dict): Dictionary with attribution scores, corresponding input units, and unit types. "output_orig" (icx360.utils.model_wrappers.GeneratedOutput): Output object generated from original input. "intercept" (float or dict[float]): Intercept(s) of linear model.

    Items in "attributions" dictionary: "units" (List[str]): input_orig segmented into units if not already, otherwise same as original. "unit_types" (List[str]): Types of units. score_label ((num_units,) np.ndarray): One or more sets of attribution scores (labelled by the type of scalarizer).
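A minimal usage sketch (import path per the module layout above; the "prob" score label is an assumption based on the scalarizer name, and `wrapped_model` is assumed to be wrapped per icx360.utils.model_wrappers):

    from icx360.algorithms.mexgen.clime import CLIME

    explainer = CLIME(wrapped_model, segmenter="en_core_web_trf", scalarizer="prob")
    output = explainer.explain_instance(
        "The quick brown fox jumps over the lazy dog.",
        segment_type="w",        # segment the input into words
        max_units_replace=2,     # perturb at most two units at a time
        num_nonzeros=None,       # None keeps the linear model dense
    )
    attributions = output["attributions"]
    for unit, score in zip(attributions["units"], attributions["prob"]):
        print(f"{score:+.3f}  {unit}")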

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

compute_linear_model_features
compute_linear_model_features(subsets_replace, num_units)

Compute features used by explanatory linear model.

This function generates a feature matrix for a linear model that explains the impact of perturbing specific input units.

Parameters:

  • subsets_replace (List[List[int]]) –

    A list of subsets, where each subset is a list of indices corresponding to the units that have been replaced.

  • num_units (int) –

    Total number of units.

Returns:

  • features ( (num_perturb, num_units) np.ndarray ) –

    Matrix of feature values, equal to 1 if the unit is part of the perturbed subset, and 0 otherwise.
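The construction can be sketched in a few lines of NumPy (a re-implementation of the described behavior, not the library's code):

    import numpy as np

    def features_sketch(subsets_replace, num_units):
        # One row per perturbed input; a 1 marks a unit that was replaced.
        features = np.zeros((len(subsets_replace), num_units))
        for row, subset in enumerate(subsets_replace):
            features[row, subset] = 1
        return features

    features_sketch([[0, 2], [], [1]], num_units=3)
    # array([[1., 0., 1.],
    #        [0., 0., 0.],
    #        [0., 1., 0.]])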

fit_linear_model
fit_linear_model(features, target, sample_weights, num_nonzeros, debias)

Fit explanatory linear model.

Parameters:

  • features ((num_perturb, num_units) np.ndarray) –

    Feature values.

  • target ((num_perturb,) np.ndarray) –

    Target values to predict.

  • sample_weights ((num_perturb,) np.ndarray) –

    Sample weights.

  • num_nonzeros (int or None) –

    Number of non-zero coefficients desired in linear model, None means dense model.

  • debias (bool) –

    Refit linear model with no penalty after selecting features.

Returns:

  • coef ( (num_units,) np.ndarray ) –

    Coefficients of linear model.

  • intercept ( float ) –

    Intercept of linear model.

  • num_nonzeros ( int ) –

    Actual number of non-zero coefficients.

lshap

Class and supporting functions for MExGen L-SHAP explainer.

The MExGen framework and L-SHAP algorithm are described in

Multi-Level Explanations for Generative Language Models. Lucas Monteiro Paes and Dennis Wei et al. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025). https://arxiv.org/abs/2403.14459

Classes:

  • LSHAP –

    MExGen L-SHAP explainer

Functions:

  • adapt_replacement_set –

    Adapt set of units that can be replaced to the unit of interest.

  • get_normalization_constants –

    Computes normalization constants for Shapley value calculation.

LSHAP
LSHAP(model, segmenter='en_core_web_trf', scalarizer='prob', **kwargs)

Bases: LocalBBExplainer

MExGen L-SHAP explainer

Attributes:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • segmenter (SpaCySegmenter) –

    Object for segmenting input text into units using a spaCy model.

  • scalarized_model (Scalarizer) –

    "Scalarized model" that further wraps model with a method for computing scalar values based on the model's inputs or outputs.

Initialize MExGen L-SHAP explainer.

Parameters:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • segmenter (str, default: 'en_core_web_trf' ) –

    Name of spaCy model to use in segmenter (icx360.utils.segmenters.SpaCySegmenter).

  • scalarizer (str, default: 'prob' ) –

    Type of scalarizer to use. "prob": probability of generating original output conditioned on perturbed inputs (instantiates an icx360.utils.scalarizers.ProbScalarizedModel). "text": similarity scores between original output and perturbed outputs (instantiates an icx360.utils.scalarizers.TextScalarizedModel).

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for initializing scalarizer.

Raises:

  • ValueError –

    If scalarizer is not "prob" or "text".

Methods:

  • explain_instance –

    Explain model output by attributing it to parts of the input text.

  • set_params –

    Set parameters for the explainer.

model instance-attribute
model = model
scalarized_model instance-attribute
scalarized_model = ProbScalarizedModel(model)
segmenter instance-attribute
segmenter = SpaCySegmenter(segmenter)
explain_instance
explain_instance(input_orig, unit_types='p', ind_interest=None, ind_segment=True, segment_type='s', max_phrase_length=10, model_params={}, scalarize_params={}, num_neighbors=2, max_units_replace=2, replacement_str='')

Explain model output by attributing it to parts of the input text.

Uses an algorithm called L-SHAP (a variant of SHAP) that computes approximate Shapley values as attribution scores.

Parameters:

  • input_orig (str or List[str]) –

    [input] Input text as a single unit (if str) or segmented sequence of units (List[str]).

  • unit_types (str or List[str], default: 'p' ) –

    [input] Types of units in input_orig. "p" for paragraph, "s" for sentence, "w" for word, "n" for not to be perturbed/attributed to. If str, applies to all units in input_orig, otherwise unit-specific.

  • ind_interest (bool or List[bool] or None, default: None ) –

    [input] Indicator of units to attribute to ("of interest"). Default None means np.array(unit_types) != "n".

  • ind_segment (bool or List[bool], default: True ) –

    [segmentation] Whether to segment input text. If bool, applies to all units; if List[bool], applies to each unit individually.

  • segment_type (str, default: 's' ) –

    [segmentation] Type of units to segment into: "s" for sentences, "w" for words, "ph" for phrases.

  • max_phrase_length (int, default: 10 ) –

    [segmentation] Maximum phrase length in terms of spaCy tokens (default 10).

  • model_params (dict, default: {} ) –

    Additional keyword arguments for model generation (for the self.model.generate() method).

  • scalarize_params (dict, default: {} ) –

    Additional keyword arguments for computing scalar outputs (for the self.scalarized_model.scalarize_output() method).

  • num_neighbors (int, default: 2 ) –

[perturbation] Number of neighbors on either side of the unit of interest that can be perturbed. Default 2 means two neighbors to the left and two neighbors to the right.

  • max_units_replace (int, default: 2 ) –

    [perturbation] Maximum number of units to perturb at one time (default 2).

  • replacement_str (str, default: '' ) –

    [perturbation] String to replace units with (default "" for dropping units).

Returns:

  • output_dict ( dict ) –

    Dictionary with the following items: "attributions" (dict): Dictionary with attribution scores, corresponding input units, and unit types. "output_orig" (icx360.utils.model_wrappers.GeneratedOutput): Output object generated from original input.

    Items in "attributions" dictionary: "units" (List[str]): input_orig segmented into units if not already, otherwise same as original. "unit_types" (List[str]): Types of units. score_label ((num_units,) np.ndarray): One or more sets of attribution scores (labelled by the type of scalarizer).
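A minimal usage sketch, analogous to the C-LIME example but exploiting L-SHAP's locality (import path assumed from the module layout above):

    from icx360.algorithms.mexgen.lshap import LSHAP

    explainer = LSHAP(wrapped_model, scalarizer="prob")
    output = explainer.explain_instance(
        ["Answer based on the context.", "The sky is blue.", "Why is it blue?"],
        unit_types=["n", "s", "s"],  # do not perturb or attribute to the instruction
        ind_segment=False,           # input is already segmented
        num_neighbors=2,             # perturb within 2 units on either side
    )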

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

adapt_replacement_set
adapt_replacement_set(idx_replace, idx_interest, num_neighbors)

Adapt set of units that can be replaced to the unit of interest.

This function modifies the indices of units that can be replaced to exclude the unit of interest and include neighbors within a specified range on either side.

Parameters:

  • idx_replace (np.ndarray of dtype int) –

    Indices of units that can be replaced.

  • idx_interest (int) –

    Index of the unit of interest.

  • num_neighbors (int) –

    Number of neighbors on either side of the unit of interest to include.

Returns:

  • idx_replace_adapted ( np.ndarray of dtype int ) –

    Adapted version of idx_replace, excluding the unit of interest and including neighbors.
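The adaptation amounts to a window-and-exclude operation, sketched here on toy indices (a re-implementation of the described behavior, not the library's code):

    import numpy as np

    idx_replace, idx_interest, num_neighbors = np.arange(8), 4, 2
    keep = (np.abs(idx_replace - idx_interest) <= num_neighbors) \
           & (idx_replace != idx_interest)
    idx_replace_adapted = idx_replace[keep]
    # array([2, 3, 5, 6])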

get_normalization_constants
get_normalization_constants(num_can_replace, max_units_replace)

Computes normalization constants for Shapley value calculation.

Parameters:

  • num_can_replace (int) –

    The total number of units that can be replaced.

  • max_units_replace (int) –

    The maximum number of units that can be replaced at one time.

Returns:

  • normalization ( ndarray ) –

    An array of normalization constants.

token_highlighter

Module containing TokenHighlighter submodule (thllm)

Modules:

  • th_llm –

Class for TokenHighlighter explainer (TH-LLM).

th_llm

Class for TokenHighlighter explainer (TH-LLM). Interprets LLMs based on importance analysis of input text units.

Classes:

  • TokenHighlighter –

    Class for TokenHighlighter explainer (TH-LLM).

TokenHighlighter
TokenHighlighter(model, tokenizer, segmenter, **kwargs)

Bases: LocalWBExplainer

Class for TokenHighlighter explainer (TH-LLM). Interprets LLMs based on importance analysis of input text units.

Initialize the TH-LLM explainer.

Parameters:

  • model –

    The large language model object.

  • tokenizer –

    The tokenizer object.

  • segmenter –

    The segmenter object.

  • affirmation –

    The affirmation sentence template.

  • pooling –

    The aggregation method ("norm_mean", "mean_norm", or "matrix").
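A minimal initialization sketch (import path assumed from the module layout above; `pooling` is passed as a keyword argument per the attribute listing below):

    from icx360.algorithms.token_highlighter.th_llm import TokenHighlighter

    # `model` and `tokenizer` are assumed to be HuggingFace objects.
    th = TokenHighlighter(model, tokenizer, segmenter="en_core_web_trf",
                          pooling="mean_norm")
    attributions = th.explain_instance(
        "Tell me a joke about cats.",
        unit_types="w", ind_segment=True, segment_type="w",
    )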

Methods:

Attributes:

m instance-attribute
m = model
pooling instance-attribute
pooling: str = get('pooling', 'mean_norm')
segmenter instance-attribute
segmenter = SpaCySegmenter(segmenter)
tok instance-attribute
tok = tokenizer
token_ids instance-attribute
token_ids = _get_token_ids(prefix, infix, affirmation, suffix)
explain_instance
explain_instance(input_orig, unit_types, ind_segment, segment_type, **kwargs)

Compute importance scores for each text unit.

Parameters:

  • input_orig (str) –

    Original input text.

  • unit_types (Union[str, List[str]]) –

    Type(s) of each text unit.

  • ind_segment (Union[bool, List[bool]]) –

    Whether to segment.

  • segment_type (str) –

    Type of segmentation to apply.

  • max_phrase_length (int) –

    Max length allowed for a phrase.

Returns:

  • Dict[str, Any] –

    Attribution information dictionary.

explain_instance_matrix
explain_instance_matrix(units: List[str]) -> Tuple[List[str], List[float]]

Use the Frobenius norm of the token gradient matrix as the importance score for each unit.

Parameters:

  • units (List[str]) –

    A list of text units (e.g., phrases or words) that form the prompt.

Returns:

  • Tuple[List[str], List[float]] –

Tuple[List[str], List[float]]: the list of units and the unit scores based on Frobenius norms of token gradients.

explain_instance_mean_norm
explain_instance_mean_norm(units: List[str]) -> Tuple[List[str], List[float]]

Use the average of the L2 norms of token gradients as the importance score for each unit.

Parameters:

  • units (List[str]) –

    A list of text units (e.g., phrases or words) that form the prompt.

Returns:

  • Tuple[List[str], List[float]] –

Tuple[List[str], List[float]]: the list of units and the unit scores based on the mean_norm method.

explain_instance_norm_mean
explain_instance_norm_mean(units: List[str]) -> Tuple[List[str], List[float]]

Use the L2 norm of the average of token gradients as the importance score for each unit.

Parameters:

  • units (List[str]) –

    A list of text units (e.g., phrases or words) that form the prompt.

Returns:

  • Tuple[List[str], List[float]] –

Tuple[List[str], List[float]]: the list of units and the unit scores based on the norm_mean method.

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

metrics

Module containing metrics for explanations

Modules:

  • perturb_curve –

    Perturbation curve evaluator for measuring the fidelity of input attributions to the explained model.

perturb_curve

Perturbation curve evaluator for measuring the fidelity of input attributions to the explained model.

The PerturbCurveEvaluator class evaluates perturbation curves for input attribution scores produced by icx360.algorithms.mexgen.CLIME.explain_instance() or icx360.algorithms.mexgen.LSHAP.explain_instance(). It thus evaluates the fidelity of these attribution scores to the explained model.

Classes:

  • PerturbCurveEvaluator –

    Perturbation curve evaluator for measuring the fidelity of input attributions to the explained model.

PerturbCurveEvaluator
PerturbCurveEvaluator(model, scalarizer='prob', **kwargs)

Perturbation curve evaluator for measuring the fidelity of input attributions to the explained model.

Attributes:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • scalarized_model (Scalarizer) –

    "Scalarized model" that further wraps model with a method for computing scalar values based on the model's inputs or outputs.

Initialize perturbation curve evaluator.

Parameters:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • scalarizer (str, default: 'prob' ) –

    Type of scalarizer to use. "prob": probability of generating original output conditioned on perturbed inputs (instantiates an icx360.utils.scalarizers.ProbScalarizedModel). "text": similarity scores between original output and perturbed outputs (instantiates an icx360.utils.scalarizers.TextScalarizedModel).

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for initializing scalarizer.

Raises:

  • ValueError –

    If scalarizer is not "prob" or "text".

Methods:

model instance-attribute
model = model
scalarized_model instance-attribute
scalarized_model = ProbScalarizedModel(model)
eval_perturb_curve
eval_perturb_curve(explainer_dict, score_label, token_frac=False, max_frac_perturb=0.5, replacement_str='', model_params={}, scalarize_params={})

Evaluate perturbation curve for given input attributions.

This method evaluates the perturbation curve for a set of attribution scores by perturbing units in decreasing order of their attribution scores.

Parameters:

  • explainer_dict (dict) –

    Attribution dictionary as produced by icx360.algorithms.mexgen.CLIME.explain_instance() or icx360.algorithms.mexgen.LSHAP.explain_instance().

  • score_label (str) –

    Label of the attribution score to use for ranking units.

  • token_frac (bool, default: False ) –

    Whether to consider the number of tokens in each unit when ranking and perturbing units. Defaults to False.

  • max_frac_perturb (float, default: 0.5 ) –

    Maximum fraction of units or tokens to perturb. Defaults to 0.5.

  • replacement_str (str, default: '' ) –

    String to replace perturbed units with. Defaults to "" for dropping units.

  • model_params (dict, default: {} ) –

    Additional keyword arguments for model generation (for the self.model.generate() method).

  • scalarize_params (dict, default: {} ) –

    Additional keyword arguments for computing scalar outputs (for the self.scalarized_model.scalarize_output() method).

Returns:

  • output_perturbed ( dict ) –

    Dictionary with the following items: "frac" (torch.Tensor): Fractions of units or tokens perturbed. score_label (torch.Tensor): One or more Tensors of scalarized output values corresponding to the fractions in the "frac" Tensor. score_label labels each Tensor with the type of scalarizer.

Raises:

  • ValueError –

    If token_frac is True and model's tokenizer is not available.
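A minimal evaluation sketch, continuing from the C-LIME example earlier (module path and the "prob" score label are assumptions):

    from icx360.metrics.perturb_curve import PerturbCurveEvaluator

    evaluator = PerturbCurveEvaluator(wrapped_model, scalarizer="prob")
    curve = evaluator.eval_perturb_curve(
        output["attributions"],   # attribution dict from CLIME/LSHAP explain_instance
        score_label="prob",
        max_frac_perturb=0.5,
    )
    # curve["frac"]: fractions perturbed; curve["prob"]: scalarized outputs.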

utils

Module containing various utilities including model wrappers, infillers, scalarizers, and segmenters, among others

Modules:

  • coloring_utils –

    Utilities for coloring and displaying units of text.

  • general_utils –

    File containing general utility functions

  • infillers –
  • model_wrappers –

    Module containing wrappers for different types of models (used by MExGen and CELL).

  • scalarizers –

    Module containing scalarizers, which compute scalar output values based on the outputs or inputs of an LLM.

  • segmenters –

    Module containing utilities for segmenting input text into units.

  • subset_utils –

    Utilities that deal with subsets of input units.

  • toma –

    Model inference utilities that use the toma package to avoid running out of CUDA memory.

coloring_utils

Utilities for coloring and displaying units of text.

Functions:

Attributes:

COLOR_LIST_IBM_30 module-attribute
COLOR_LIST_IBM_30 = ['#a6c8ff', '#c6c6c6', '#ffb3b8']
COLOR_LIST_IBM_40 module-attribute
COLOR_LIST_IBM_40 = ['#78a9ff', '#c6c6c6', '#ff8389']
color_units
color_units(units, scores, norm_factor=None, scale_sqrt=True, color_list=COLOR_LIST_IBM_40, show=True)

Color units of text according to scores and display.

Parameters:

  • units ((num_units,) np.ndarray) –

    Units of text.

  • scores ((num_units,) np.ndarray) –

    Scores corresponding to units.

  • norm_factor (float or None, default: None ) –

    Factor to divide scores by to normalize them. None (default) means np.abs(scores).max().

  • scale_sqrt (bool, default: True ) –

    Whether to apply square root to magnitude of score

  • color_list (List[str], default: COLOR_LIST_IBM_40 ) –

    List of colors for matplotlib.colors.LinearSegmentedColormap

  • show (bool, default: True ) –

    Show on screen if True, otherwise return list of HTML strings.

Returns:

  • colored_units ( List[str] or None ) –

    List of HTML-formatted units of text if show==False, otherwise None.
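A minimal usage sketch (import path assumed from the module layout above):

    import numpy as np
    from icx360.utils.coloring_utils import color_units

    units = np.array(["The sky", "is", "blue."])
    scores = np.array([0.9, -0.1, 0.4])
    html_units = color_units(units, scores, show=False)  # list of HTML strings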

highlight_text
highlight_text(unit, color)

general_utils

File containing general utility functions

Functions:

  • fix_seed –

Fix a random seed for all random number generators (random, numpy, torch)

  • select_device –

    Select device on which to perform all operations.

fix_seed
fix_seed(seed=12345)

Fix a random seed for all random number generators (random, numpy, torch)

Parameters:

  • seed –

    seed to set for all randomizations

select_device
select_device()

Select device on which to perform all operations.

Returns:

  • device ( str ) –

    device on which to perform all operations according to user system
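For example (import path assumed from the module layout above):

    from icx360.utils.general_utils import fix_seed, select_device

    fix_seed(12345)           # seeds random, numpy, and torch
    device = select_device()  # e.g. "cuda", "mps", or "cpu", depending on the system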

infillers

Modules:

BART_infiller

File containing class BART_infiller

BART_infiller is used to perform infilling using a BART LLM.

Classes:

BART_infiller
BART_infiller(model_path='facebook/bart-large', device='cuda')

BART_infiller object.

Instances can be used to encode, decode, and generate text to infill masks in text.

Attributes:

  • _model –

    BART model used for infilling

  • _tokenizer –

    BART tokenizer

  • mask_string –

    text that represents a mask for BART

  • mask_string_encoded –

    encoded version of mask for BART

  • mask_filled_error –

    text representing that an infilling error occurred

Initialize BART infilling object.

Parameters:

  • model_path (str, default: 'facebook/bart-large' ) –

    name of BART model to be used for infilling

Methods:

  • decode –

    Function to decode text via BART tokenizer

  • encode –

    Function to encode text via BART tokenizer

  • generate –

Generate text to infill mask tokens. Assumes one of the tokens is <mask>.

  • get_infilled_mask –

Retrieve the text that replaced the mask when infilling, from the generation output

  • similar –

    Determine if word is similar to fill_in

Attributes:

mask_filled_error instance-attribute
mask_filled_error = '!!abcxyz!!'
mask_string instance-attribute
mask_string = '<mask>'
mask_string_encoded instance-attribute
mask_string_encoded = encode(mask_string, add_special_tokens=False)[0]
decode
decode(tokens, skip_special_tokens=True)

Function to decode text via BART tokenizer

Parameters:

  • tokens (int list) –

    token indices

  • skip_special_tokens (bool, default: True ) –

True to skip special tokens in decoding

Returns:

  • ret ( str ) –

string from decoding all input tokens

encode
encode(text, add_special_tokens=False)

Function to encode text via BART tokenizer

Parameters:

  • text (str) –

    string to encode

  • add_special_tokens (bool, default: False ) –

True to use special tokens in encoding

Returns:

  • ret ( int list ) –

token indices, whose number is based on the input text

generate
generate(tokens, num_return_sequences=1, masked_word='', return_mask_filled=False)

Generate text to infill mask tokens. Assumes one of tokens is <mask>, which is token id self.mask_string_encoded.

Parameters:

  • tokens (int list) –

    token indices

  • num_return_sequences (int, default: 1 ) –

    number of generations to return

  • masked_word (str, default: '' ) –

    word that is masked in tokens

  • return_mask_filled (bool, default: False ) –

    if true, return (ret, mask_filled), else return only ret

Returns:

  • ret ( int list ) –

    list of token indices after calling model.generate on input tokens

  • mask_filled ( str ) –

    decoded version of infilled texts

get_infilled_mask
get_infilled_mask(x_enc, y_enc, return_tokens=False)

Retrieve the text that replaced the mask when infilling, from the generation output

Parameters:

  • x_enc (int list) –

token indices where one token is <mask>, i.e., input to the generation function

  • y_enc (int list) –

token indices representing same as x_enc with several tokens replacing <mask>, i.e., output of the generation function

  • return_tokens (bool, default: False ) –

    if true, return (mask_filled, inds_infill), else return only mask_filled

Returns:

  • mask_filled ( str ) –

decoded tokens that replace <mask> in y_enc relative to x_enc

  • inds_infill ( int list ) –

    token indices representing encoded version of infilled text

similar
similar(word, fill_in)

Determine if word is similar to fill_in

Parameters:

  • word (str) –

word to search for

  • fill_in (str) –

    filled in text to search for word in

Returns:

  • ret ( bool ) –

    True if word is similar to fill_in, False otherwise
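A round-trip sketch tying these methods together (import path assumed from the module layout above; the exact infill depends on the BART model):

    from icx360.utils.infillers.BART_infiller import BART_infiller

    infiller = BART_infiller(device="cpu")
    tokens = infiller.encode("The <mask> sat on the mat.")
    out_tokens, mask_filled = infiller.generate(tokens, masked_word="cat",
                                                return_mask_filled=True)
    print(infiller.decode(out_tokens))  # sentence with the mask infilled
    print(mask_filled)                  # just the text that filled the mask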

T5_infiller

File containing class T5_infiller

T5_infiller is used to perform infilling using a T5 LLM.

Classes:

T5_infiller
T5_infiller(model_path='t5-large', device='cuda')

T5_infiller object.

Instances can be used to encode, decode, and generate text to infill masks in text.

Attributes:

  • _model –

    T5 model used for infilling

  • _tokenizer –

    T5 tokenizer

  • mask_string –

    text that represents the beginning of a mask for T5

  • mask_string_end –

    text that represents the end of a mask for T5

  • mask_string_encoded –

    encoded version of mask_string for T5

  • mask_string_end_encoded –

    encoded version of mask_string_end for T5

  • mask_filled_error –

    text representing that an infilling error occurred

Initialize T5 infilling object.

Parameters:

  • model_path (str, default: 't5-large' ) –

    name of T5 model to be used for infilling

Methods:

  • decode –

    Function to decode text via T5 tokenizer

  • encode –

    Function to encode text via T5 tokenizer

  • generate –

Generate text to infill mask tokens. Assumes one of the tokens is <extra_id_0>.

  • get_infilled_mask –

Retrieve the text that replaced the mask when infilling, from the generation output

  • similar –

    Determine if word is similar to fill_in

Attributes:

mask_filled_error instance-attribute
mask_filled_error = '!!abcxyz!!'
mask_string instance-attribute
mask_string = '<extra_id_0>'
mask_string_encoded instance-attribute
mask_string_encoded = encode(mask_string, add_special_tokens=False)[0]
mask_string_end instance-attribute
mask_string_end = '<extra_id_1>'
mask_string_end_encoded instance-attribute
mask_string_end_encoded = encode(mask_string_end, add_special_tokens=False)[0]
decode
decode(tokens, skip_special_tokens=True)

Function to decode text via T5 tokenizer

Parameters:

  • tokens (int list) –

    token indices

  • skip_special_tokens (bool, default: True ) –

True to skip special tokens in decoding

Returns:

  • ret ( str ) –

string from decoding all input tokens

encode
encode(text, add_special_tokens=False)

Function to encode text via T5 tokenizer

Parameters:

  • text (str) –

    string to encode

  • add_special_tokens (bool, default: False ) –

True to use special tokens in encoding

Returns:

  • ret ( int list ) –

token indices, whose number is based on the input text

generate
generate(tokens, num_return_sequences=1, masked_word='', return_mask_filled=False)

Generate text to infill mask tokens. Assumes one of tokens is <extra_id_0>, which is token id self.mask_string_encoded.

Parameters:

  • tokens (int list) –

    token indices

  • num_return_sequences (int, default: 1 ) –

    number of generations to return

  • masked_word (str, default: '' ) –

    word that is masked in tokens

  • return_mask_filled (bool, default: False ) –

    if true, return (ret, mask_filled), else return only ret

Returns:

  • ret ( int list ) –

    list of token indices after calling model.generate on input tokens

  • mask_filled ( str ) –

    decoded version of infilled texts

get_infilled_mask
get_infilled_mask(x_enc, y_enc, return_tokens=False)

Retrieve the text that replaced the mask when infilling, from the generation output

Parameters:

  • x_enc (int list) –

token indices where one token is <extra_id_0>, i.e., input to the generation function

  • y_enc (int list) –

token indices representing same as x_enc with several tokens replacing <extra_id_0>, i.e., output of the generation function

  • return_tokens (bool, default: False ) –

    if true, return (mask_filled, inds_infill), else return only mask_filled

Returns:

  • mask_filled ( str ) –

decoded tokens that replace <extra_id_0> in x_enc (the tokens between <extra_id_0> and <extra_id_1> in y_enc)

  • inds_infill ( int list ) –

    tokens that represent the replacement for the mask (only returned if return_tokens==True)

similar
similar(word, fill_in)

Determine if word is similar to fill_in

Parameters:

  • word (str) –

word to search for

  • fill_in (str) –

    filled in text to search for word in

Returns:

  • ret ( bool ) –

    True if word is similar to fill_in, False otherwise

model_wrappers

Module containing wrappers for different types of models (used by MExGen and CELL).

Modules:

  • base_model_wrapper –

    Base class for model wrappers and class for model-generated outputs.

  • huggingface –

    Wrapper for HuggingFace models.

  • vllm –

    Wrapper for VLLM models.

base_model_wrapper

Base class for model wrappers and class for model-generated outputs.

Classes:

  • GeneratedOutput –

    Holds outputs of generate() method.

  • Model –

    Base class for wrappers of different types of models.

GeneratedOutput
GeneratedOutput(output_ids=None, output_text=None, output_token_count=None, logits=None)

Holds outputs of generate() method.

Attributes:

  • output_ids (Tensor or None) –

    Generated token IDs for each input.

  • output_text (List[str] or None) –

    Generated text for each input.

  • output_token_count (int or None) –

    Maximum number of generated tokens.

  • logits (Tensor or None) –

    Output logits for each input.

Initialize GeneratedOutput.

Parameters:

  • output_ids (Tensor or None, default: None ) –

    Generated token IDs for each input.

  • output_text (List[str] or None, default: None ) –

    Generated text for each input.

  • output_token_count (int or None, default: None ) –

    Maximum number of generated tokens.

  • logits (Tensor or None, default: None ) –

    Output logits for each input.

logits instance-attribute
logits = logits
output_ids instance-attribute
output_ids = output_ids
output_text instance-attribute
output_text = output_text
output_token_count instance-attribute
output_token_count = output_token_count
Model
Model(model)

Bases: ABC

Base class for wrappers of different types of models.

Attributes:

  • _model –

    Underlying model object.

Initialize Model wrapper.

Parameters:

  • model –

    Underlying model object.

Methods:

  • convert_input –

    Convert input(s) as needed for the model type.

  • generate –

    Generate response from model.

convert_input
convert_input(inputs)

Convert input(s) as needed for the model type.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

Returns:

  • inputs ( type required by model ) –

    Converted inputs.

generate abstractmethod
generate(inputs, text_only=True, **kwargs)

Generate response from model.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • text_only (bool, default: True ) –

    Return only generated text (default) or an object containing additional outputs.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for model.

Returns:

  • output_obj ( List[str] or GeneratedOutput ) –

    If text_only == True, a list of generated texts corresponding to inputs. If text_only == False, a GeneratedOutput object to hold outputs.

huggingface

Wrapper for HuggingFace models.

Classes:

  • HFModel –

    Wrapper for HuggingFace models.

HFModel
HFModel(model, tokenizer)

Bases: Model

Wrapper for HuggingFace models.

Attributes:

  • _model (transformers model object) –

    Underlying model object.

  • _tokenizer (transformers tokenizer) –

    Tokenizer corresponding to model.

  • _device (str) –

    Device on which the model resides.

Initialize HFModel wrapper.

Parameters:

  • model (transformers model object) –

    Underlying model object.

  • tokenizer (transformers tokenizer) –

    Tokenizer corresponding to model.

Methods:

  • convert_input –

    Encode input text as token IDs for HuggingFace model.

  • generate –

    Generate response from model.

convert_input
convert_input(inputs, chat_template=False, system_prompt=None, **kwargs)

Encode input text as token IDs for HuggingFace model.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for tokenizer.

Returns:

  • input_encoding ( BatchEncoding ) –

    Object produced by tokenizer.

generate
generate(inputs, chat_template=False, system_prompt=None, tokenizer_kwargs={}, text_only=True, **kwargs)

Generate response from model.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

  • tokenizer_kwargs (dict, default: {} ) –

    Additional keyword arguments for tokenizer.

  • text_only (bool, default: True ) –

    Return only generated text (default) or an object containing additional outputs.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for HuggingFace model.

Returns:

  • output_obj ( List[str] or GeneratedOutput ) –

    If text_only == True, a list of generated texts corresponding to inputs. If text_only == False, a GeneratedOutput object containing the following: output_ids: (num_inputs, output_token_count) torch.Tensor of generated token IDs. output_text: List of generated texts. output_token_count: Maximum number of generated tokens.
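A minimal wrapping sketch (the HFModel import path is assumed from the module layout above):

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from icx360.utils.model_wrappers.huggingface import HFModel

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    wrapped_model = HFModel(model, tokenizer)
    texts = wrapped_model.generate("Hello, world", max_new_tokens=20)  # List[str]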

vllm

Wrapper for VLLM models.

Classes:

  • VLLMModel –

    Wrapper for VLLM models.

VLLMModel
VLLMModel(model, model_name, tokenizer=None)

Bases: Model

Wrapper for VLLM models.

Attributes:

  • _model (OpenAI model object) –

    Underlying model object.

  • _model_name (str) –

    Name of the model.

  • _tokenizer (transformers tokenizer or None) –

    HuggingFace tokenizer corresponding to the model (for applying chat template).

Initialize VLLMModel wrapper.

Parameters:

  • model (OpenAI model object) –

    Underlying model object.

  • model_name (str) –

    Name of the model.

  • tokenizer (transformers tokenizer or None, default: None ) –

    HuggingFace tokenizer corresponding to the model (for applying chat template).

Methods:

  • convert_input –

    Convert input(s) into a list of strings.

  • generate –

    Generate response from model.

convert_input
convert_input(inputs, chat_template=False, system_prompt=None, **kwargs)

Convert input(s) into a list of strings.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

Returns:

  • inputs ( List[str] ) –

    Converted input(s) as a list of strings.

generate
generate(inputs, chat_template=False, system_prompt=None, text_only=True, **kwargs)

Generate response from model.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

  • text_only (bool, default: True ) –

    Return only generated text (default) or an object containing additional outputs.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for VLLM model.

Returns:

  • output_obj ( List[str] or GeneratedOutput ) –

    If text_only == True, a list of generated texts corresponding to inputs. If text_only == False, a GeneratedOutput object containing the following: output_text: List of generated texts.

scalarizers

Module containing scalarizers, which compute scalar output values based on the outputs or inputs of an LLM.

Modules:

  • bart_score –

    BARTScorer class used by icx360.utils.scalarizers.TextScalarizedModel.

  • base_scalarizer –

    Base class for scalarizers.

  • bleu_scalarizer –

    File containing class BleuScalarizer

  • contradiction_scalarizer –

    File containing class ContradictionScalarizer

  • nli_scalarizer –

    File containing class NLIScalarizer

  • preference_scalarizer –

    File containing class PreferenceScalarizer

  • prob –

    Scalarized model that computes the log probability of generating a reference output conditioned on inputs.

  • text_only –

    Scalarized model that computes similarity scores between generated texts and a reference output text.

bart_score

BARTScorer class used by icx360.utils.scalarizers.TextScalarizedModel.

This file (excluding this docstring) is an exact copy of the core source file from the BARTScore authors: https://github.com/neulab/BARTScore/blob/main/bart_score.py. It is licensed under the Apache License Version 2.0.

For more information, please refer to the BARTScore paper: BARTScore: Evaluating Generated Text as Text Generation. Weizhe Yuan, Graham Neubig, and Pengfei Liu. Advances in Neural Information Processing Systems (NeurIPS) 2021.

Classes:

BARTScorer
BARTScorer(device='cuda:0', max_length=1024, checkpoint='facebook/bart-large-cnn')

Methods:

Attributes:

device instance-attribute
device = device
loss_fct instance-attribute
loss_fct = NLLLoss(reduction='none', ignore_index=pad_token_id)
lsm instance-attribute
lsm = LogSoftmax(dim=1)
max_length instance-attribute
max_length = max_length
model instance-attribute
model = from_pretrained(checkpoint)
tokenizer instance-attribute
tokenizer = from_pretrained(checkpoint)
load
load(path=None)

Load model from paraphrase finetuning

multi_ref_score
multi_ref_score(srcs, tgts: List[List[str]], agg='mean', batch_size=4)
score
score(srcs, tgts, batch_size=4)

Score a batch of examples

test
test(batch_size=3)

Test

base_scalarizer

Base class for scalarizers.

Scalarizers compute real-valued scalar outputs for text inputs or outputs of LLMs, for example by comparing the inputs to a reference input or the corresponding outputs to a reference output.

Classes:

Scalarizer
Scalarizer(model=None)

Bases: ABC

Base class for scalarizers.

Attributes:

  • model (Model or None) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object (optional, default None).

Initialize Scalarizer.

Parameters:

  • model (Model or None, default: None ) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object (optional, default None).

Methods:

model instance-attribute
model = model
scalarize_output abstractmethod
scalarize_output(inputs=None, outputs=None, ref_input=None, ref_output=None, **kwargs)

Compute scalar outputs.

Parameters:

  • inputs (str or List[str] or List[List[str]] or None, default: None ) –

    Inputs to compute scalar outputs for: A single input text, a list of input texts, or a list of segmented texts.

  • outputs (str or List[str] or None, default: None ) –

    Outputs to scalarize (corresponding to inputs).

  • ref_input (str or None, default: None ) –

    Reference input used to scalarize.

  • ref_output (str or GeneratedOutput or None, default: None ) –

    Reference output (text or GeneratedOutput object) used to scalarize.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments.

Returns:

  • scalar_outputs ( (num_inputs,) torch.Tensor ) –

    Scalar output for each input.
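
As an illustration of this interface, a toy subclass of Scalarizer (the import path is an assumption; a real scalarizer would score outputs with a model rather than by length):

```python
import torch

from icx360.utils.scalarizers.base_scalarizer import Scalarizer  # assumed path


class LengthRatioScalarizer(Scalarizer):
    """Toy scalarizer: length of each output relative to the reference output."""

    def scalarize_output(self, inputs=None, outputs=None, ref_input=None,
                         ref_output=None, **kwargs):
        # One scalar per output, returned as a (num_inputs,) tensor.
        ref_len = max(len(ref_output), 1)
        return torch.tensor([len(out) / ref_len for out in outputs])


scores = LengthRatioScalarizer().scalarize_output(
    outputs=["short reply", "a considerably longer reply"],
    ref_output="a reference reply",
)
```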

bleu_scalarizer

File containing class BleuScalarizer

This class is used to scalarize text using the BLEU metric

Classes:

BleuScalarizer
BleuScalarizer(model_path='', device='cuda', experiment_id='id')

Bases: Scalarizer

BleuScalarizer object.

Instances of BleuScalarizer can call scalarize_output to produce a scalarized version of input text according to BLEU score.

Attributes:

  • _bleu –

    model for computing BLEU score

  • _device –

    device on which to perform computations

Initialize bleu scalarizer object.

Parameters:

  • model_path (str, default: '' ) –

    placeholder; deprecated and not used here.

  • device (str, default: 'cuda' ) –

    device on which to perform computations

  • experiment_id (str, default: 'id' ) –

    unique identifier allowing scores to be computed in parallel without conflicts

Methods:

  • scalarize_output –

    Convert text input and outputs to numerical score.

Attributes:

model instance-attribute
model = model
scalarize_output
scalarize_output(inputs, outputs, ref_input='', ref_output='', input_label=0, info=False)

Convert text input and outputs to numerical score

Use BLEU score to scalarize: compute BLEU(outputs, ref_output) and BLEU(inputs, ref_input) and return a linear combination of the two BLEU scores

Parameters:

  • inputs (str) –

    input prompt

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    contrastive prompt

  • ref_output (str, default: '' ) –

    response to contrastive prompt

  • input_label (int, default: 0 ) –

    placeholder. not used here.

  • info (bool, default: False ) –

    placeholder. not used here.

Returns:

  • score ( float ) –

    scalarized output

  • label_contrast ( int ) –

    placeholder. not used here.
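
A minimal usage sketch, assuming the import path below; scalarize_output returns a (score, label_contrast) pair, with label_contrast a placeholder:

```python
from icx360.utils.scalarizers.bleu_scalarizer import BleuScalarizer  # assumed path

scalarizer = BleuScalarizer(device="cpu", experiment_id="demo")

# Linear combination of BLEU(outputs, ref_output) and BLEU(inputs, ref_input).
score, _ = scalarizer.scalarize_output(
    inputs="What is the capital of France?",
    outputs="The capital of France is Paris.",
    ref_input="What is the capital of Spain?",
    ref_output="The capital of Spain is Madrid.",
)
```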

contradiction_scalarizer

File containing class ContradictionScalarizer

This class is used to scalarize text using a Contradiction metric via Natural Language Inference (NLI)

Classes:

ContradictionScalarizer
ContradictionScalarizer(model_path='cross-encoder/nli-deberta-v3-base', device='cuda')

Bases: Scalarizer

ContradictionScalarizer object.

Instances of ContradictionScalarizer can call scalarize_output to produce a scalarized version of input text according to Contradiction score.

Attributes:

  • _model –

    NLI model for computing contradiction score

  • _tokenizer –

    tokenizer of NLI model

  • _device –

    device on which to perform computations

Initialize contradiction scalarizer object.

Parameters:

  • model_path (str, default: 'cross-encoder/nli-deberta-v3-base' ) –

    NLI model for computing contradiction score

  • device (str, default: 'cuda' ) –

    device on which to perform computations

Methods:

  • predict_contradiction –

    Convert text input and outputs to 0/1 classification.

  • scalarize_output –

    Convert text input and outputs to numerical score.

Attributes:

model instance-attribute
model = model
predict_contradiction
predict_contradiction(inputs, outputs, ref_input='', ref_output='')

Convert text input and outputs to 0/1 classification

Use NLI contradiction score to scalarize: compute whether ref_output contradicts outputs, normalized by the contradiction score of outputs with itself.

Parameters:

  • inputs (str) –

    placeholder. not used here.

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    placeholder. not used here.

  • ref_output (str, default: '' ) –

    response to contrastive prompt

Returns:

  • ret ( int ) –

    if contradiction found return 1, else return 0.

scalarize_output
scalarize_output(inputs, outputs, ref_input='', ref_output='', input_label=0, info=False)

Convert text input and outputs to numerical score

Use NLI contradiction score to scalarize: compute whether ref_output contradicts outputs, normalized by the contradiction score of outputs with itself.

Parameters:

  • inputs (str) –

    placeholder. not used here.

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    placeholder. not used here.

  • ref_output (str, default: '' ) –

    response to contrastive prompt

  • input_label (int, default: 0 ) –

    placeholder. not used here.

  • info (bool, default: False ) –

    print extra information if True

Returns:

  • score ( float ) –

    scalarized output

  • label_contrast ( int ) –

    placeholder. not used here.
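
A minimal sketch of the 0/1 classification path, assuming the import path below:

```python
from icx360.utils.scalarizers.contradiction_scalarizer import ContradictionScalarizer  # assumed path

scalarizer = ContradictionScalarizer(device="cpu")

# Returns 1 if ref_output contradicts outputs (after self-normalization), else 0.
flag = scalarizer.predict_contradiction(
    inputs="",  # placeholder, not used
    outputs="The meeting is on Monday.",
    ref_output="The meeting has been cancelled.",
)
```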

nli_scalarizer

File containing class NLIScalarizer

This class is used to scalarize text using a Natural Language Inference (NLI) score to measure the change in scores

Classes:

NLIScalarizer
NLIScalarizer(model_path='cross-encoder/nli-deberta-v3-base', device='cuda')

Bases: Scalarizer

NLIScalarizer object.

Instances of NLIScalarizer can call scalarize_output to produce a scalarized version of input text according to the change in NLI score.

Attributes:

  • _model –

    NLI model for computing contradiction score

  • _tokenizer –

    tokenizer of NLI model

  • _device –

    device on which to perform computations

Initialize nli scalarizer object.

Parameters:

  • model_path (str, default: 'cross-encoder/nli-deberta-v3-base' ) –

    NLI model for computing NLI score

  • device (str, default: 'cuda' ) –

    device on which to perform computations

Methods:

  • scalarize_output –

    Convert text input and outputs to numerical score.

Attributes:

model instance-attribute
model = model
scalarize_output
scalarize_output(inputs, outputs, ref_input='', ref_output='', input_label=0, info=False)

Convert text input and outputs to numerical score

Use NLI score to scalarize: compute the score of the predicted class of NLI(inputs, outputs), then compute the change in that class's score under NLI(inputs, ref_output)

Parameters:

  • inputs (str) –

    input prompt

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    placeholder. not used here.

  • ref_output (str, default: '' ) –

    response to contrastive prompt

  • input_label (int, default: 0 ) –

    placeholder. not used here.

  • info (bool, default: False ) –

    print extra information if True

Returns:

  • score ( float ) –

    scalarized output

  • label_contrast ( int ) –

    placeholder. not used here.
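
A minimal sketch, assuming the import path below; the score reflects how much the predicted NLI class score changes when outputs is swapped for ref_output:

```python
from icx360.utils.scalarizers.nli_scalarizer import NLIScalarizer  # assumed path

scalarizer = NLIScalarizer(device="cpu")

score, _ = scalarizer.scalarize_output(
    inputs="Is the store open today?",
    outputs="Yes, it is open all day.",
    ref_output="No, it is closed today.",
)
```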

preference_scalarizer

File containing class PreferenceScalarizer

This class is used to scalarize text using a preference model to measure the change in preference for a contrastive response to the original prompt

Classes:

PreferenceScalarizer
PreferenceScalarizer(model_path='stanfordnlp/SteamSHP-flan-t5-large', device='cuda')

Bases: Scalarizer

PreferenceScalarizer object.

Instances of PreferenceScalarizer can call scalarize_output to produce a scalarized version of input text according to the change in preference for a contrastive response relative to the initial response.

Attributes:

  • _model –

    model for computing preference score

  • _tokenizer –

    tokenizer of preference model

  • _device –

    device on which to perform computations

Initialize preference scalarizer object.

Parameters:

  • model_path (str, default: 'stanfordnlp/SteamSHP-flan-t5-large' ) –

    preference model

  • device (str, default: 'cuda' ) –

    device on which to perform computations

Methods:

  • scalarize_output –

    Convert text input and outputs to numerical score.

Attributes:

model instance-attribute
model = model
scalarize_output
scalarize_output(inputs, outputs, ref_input='', ref_output='', input_label=0, info=False)

Convert text input and outputs to numerical score

Use preference score to scalarize: compute the preference between two different responses, outputs and ref_output, to the prompt inputs.

Parameters:

  • inputs (str) –

    input prompt

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    placeholder. not used here.

  • ref_output (str, default: '' ) –

    response to contrastive prompt

  • input_label (int, default: 0 ) –

    placeholder. not used here.

  • info (bool, default: False ) –

    placeholder. not used here.

Returns:

  • score ( float ) –

    scalarized output

  • label_contrast ( int ) –

    placeholder. not used here.
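
A minimal sketch, assuming the import path below; the score reflects the preference model's relative preference between the two responses to inputs:

```python
from icx360.utils.scalarizers.preference_scalarizer import PreferenceScalarizer  # assumed path

scalarizer = PreferenceScalarizer(device="cpu")

score, _ = scalarizer.scalarize_output(
    inputs="Explain photosynthesis briefly.",
    outputs="Photosynthesis converts sunlight into chemical energy in plants.",
    ref_output="I have no idea.",
)
```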

prob

Scalarized model that computes the log probability of generating a reference output conditioned on inputs.

This "scalarized model" is a generative model that can also compute the log probability (or a transformation thereof) of generating a given reference output conditioned on inputs.

Classes:

  • ProbScalarizedModel –

    Generative model that also computes the probability of a given reference output conditioned on inputs.

ProbScalarizedModel
ProbScalarizedModel(model)

Bases: Scalarizer

Generative model that also computes the probability of a given reference output conditioned on inputs.

Attributes:

  • model (Model) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object.

Initialize ProbScalarizedModel.

Parameters:

  • model (Model) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object.

Raises:

  • TypeError –

    If the model is not an icx360.utils.model_wrappers.HFModel or an icx360.utils.model_wrappers.VLLMModel.

Methods:

  • scalarize_output –

    Compute probability of reference output conditioned on inputs.

model instance-attribute
model = model
scalarize_output
scalarize_output(inputs=None, outputs=None, ref_input=None, ref_output=None, chat_template=False, system_prompt=None, tokenizer_kwargs={}, transformation='log_prob_mean', **kwargs)

Compute probability of reference output conditioned on inputs.

Parameters:

  • inputs (str or List[str] or List[List[str]], default: None ) –

    Inputs to compute probabilities for: A single input text, a list of input texts, or a list of segmented texts.

  • outputs (str or List[str] or None, default: None ) –

    Outputs to scalarize (corresponding to inputs) - not used.

  • ref_input (str or None, default: None ) –

    Reference input used to scalarize - not used.

  • ref_output (GeneratedOutput, default: None ) –

    Reference output object.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

  • tokenizer_kwargs (dict, default: {} ) –

    Additional keyword arguments for tokenizer.

  • transformation (str, default: 'log_prob_mean' ) –

    Transformation to apply to token probabilities. "log_prob_mean": arithmetic mean of log probabilities (default). "log_prob_sum": sum of log probabilities. "prob_geo_mean": geometric mean of probabilities. "prob_prod": product of probabilities.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for model.

Returns:

  • probs_transformed ( (num_inputs,) torch.Tensor ) –

    Transformed probability of generating the reference output conditioned on each input.
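
A minimal sketch, assuming the import paths and the HFModel constructor signature below:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from icx360.utils.model_wrappers import HFModel  # assumed import path
from icx360.utils.scalarizers.prob import ProbScalarizedModel  # assumed path

tokenizer = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
model = HFModel(lm, tokenizer)  # assumed constructor signature

scalarizer = ProbScalarizedModel(model)

# Reference output from a previous generate() call with text_only=False.
ref = model.generate(["Name a primary color."], text_only=False)

# Mean log-probability of regenerating the reference under each input.
probs = scalarizer.scalarize_output(
    inputs=["Name a primary color.", "Name any color."],
    ref_output=ref,
    transformation="log_prob_mean",
)
```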

text_only

Scalarized model that computes similarity scores between generated texts and a reference output text.

This "scalarized model" is a generative model that can also compute similarity scores between the texts it generates and a reference output text.

Classes:

  • TextScalarizedModel –

    Generative model that also computes similarity scores between its generated texts and a reference text.

TextScalarizedModel
TextScalarizedModel(model=None, sim_scores=['nli_logit', 'bert', 'st', 'summ', 'bart'], model_nli=None, model_bert=None, model_st='all-MiniLM-L6-v2', model_summ=None, model_bart='facebook/bart-large-cnn', device=None)

Bases: Scalarizer

Generative model that also computes similarity scores between its generated texts and a reference text.

Attributes:

  • model (Model) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object.

  • sim_scores (List[str]) –

    List of similarity scores to compute. "nli_logit"/"nli": Logit/probability of entailment label from natural language inference model. "bert": BERTScore. "st": Cosine similarity between SentenceTransformer embeddings. "summ": Generation probability of a summarization model (similar to BARTScore). "bart": BARTScore.

  • model_nli (AutoModelForSequenceClassification) –

    Natural language inference model.

  • tokenizer_nli (AutoTokenizer) –

    Tokenizer for natural language inference model.

  • idx_entail (int) –

    Index corresponding to entailment label.

  • bertscore (EvaluationModule) –

    BERTScore evaluation module.

  • model_bert (str) –

    Name of BERT-like model for computing BERTScore.

  • model_st (SentenceTransformer model) –

    SentenceTransformer embedding model.

  • model_summ (AutoModelForSeq2SeqLM) –

    Summarization model.

  • tokenizer_summ (AutoTokenizer) –

    Tokenizer for summarization model.

  • bart_scorer (BARTScorer) –

    Object for computing BARTScore.

  • device (device or str or None) –

    Device for the above models.

Initialize TextScalarizedModel.

Parameters:

  • model (Model, default: None ) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object.

  • sim_scores (List[str], default: ['nli_logit', 'bert', 'st', 'summ', 'bart'] ) –

    List of similarity scores to compute. "nli_logit"/"nli": Logit/probability of entailment label from natural language inference model. "bert": BERTScore. "st": Cosine similarity between SentenceTransformer embeddings. "summ": Generation probability of a summarization model (similar to BARTScore). "bart": BARTScore.

  • model_nli (str, default: None ) –

    Name of natural language inference model.

  • model_bert (str, default: None ) –

    Name of BERT-like model for computing BERTScore.

  • model_st (str, default: 'all-MiniLM-L6-v2' ) –

    Name of SentenceTransformer embedding model.

  • model_summ (str, default: None ) –

    Name of summarization model.

  • model_bart (str, default: 'facebook/bart-large-cnn' ) –

    Name of BART-like model for computing BARTScore.

  • device (device or str or None, default: None ) –

    Device for the above models.

Methods:

  • scalarize_output –

    Compute similarity scores between generated texts and reference text.

bart_scorer instance-attribute
bart_scorer = BARTScorer(device=device, checkpoint=model_bart)
bertscore instance-attribute
bertscore = load('bertscore')
device instance-attribute
device = select_device() if device is None else device
idx_entail instance-attribute
idx_entail = label2id[key]
model instance-attribute
model = model
model_bert instance-attribute
model_bert = model_bert
model_nli instance-attribute
model_nli = to(device)
model_st instance-attribute
model_st = SentenceTransformer(model_st, device=device)
model_summ instance-attribute
model_summ = to(device)
sim_scores instance-attribute
sim_scores = sim_scores
tokenizer_nli instance-attribute
tokenizer_nli = from_pretrained(model_nli)
tokenizer_summ instance-attribute
tokenizer_summ = from_pretrained(model_summ)
scalarize_output
scalarize_output(inputs=None, outputs=None, ref_input=None, ref_output=None, max_new_tokens_factor=1.5, symmetric=True, idf=False, transformation='log_prob_mean', **kwargs)

Compute similarity scores between generated texts and reference text.

Parameters:

  • inputs (str or List[str] or List[List[str]] or None, default: None ) –

    Inputs to compute similarity scores for: A single input text, a list of input texts, or a list of segmented texts.

  • outputs (List[str] or None, default: None ) –

    Generated texts to compute similarity scores for. If None, texts will be generated by calling self.model.generate().

  • ref_input (str or None, default: None ) –

    Reference input used to scalarize - not used.

  • ref_output (GeneratedOutput, default: None ) –

    Reference output object containing reference text (ref_output.output_text).

  • max_new_tokens_factor (float, default: 1.5 ) –

    Multiplicative factor for setting max_new_tokens for generation.

  • symmetric (bool, default: True ) –

    Make NLI entailment score symmetric (geometric mean of reference -> generated and generated -> reference).

  • idf (bool, default: False ) –

    Use idf weighting for BERTScore.

  • transformation (str, default: 'log_prob_mean' ) –

    Transformation to apply to output token probabilities of summarization model. "log_prob_mean": arithmetic mean of log probabilities (default). "log_prob_sum": sum of log probabilities. "prob_geo_mean": geometric mean of probabilities. "prob_prod": product of probabilities.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for model.

Returns:

  • scores ( dict of (num_inputs,) torch.Tensor ) –

    For each label in self.sim_scores, a Tensor of corresponding similarity scores between generated texts and the reference text.
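
A minimal sketch continuing from the ProbScalarizedModel example above (model is an icx360 model wrapper and ref a GeneratedOutput; the import path is assumed):

```python
from icx360.utils.scalarizers.text_only import TextScalarizedModel  # assumed path

scalarizer = TextScalarizedModel(model=model, sim_scores=["bert", "st"])

scores = scalarizer.scalarize_output(
    inputs=["Name a primary color.", "Name any color."],
    ref_output=ref,
)
for name, values in scores.items():
    print(name, values)  # one (num_inputs,) tensor per similarity score
```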

segmenters

Module containing utilities for segmenting input text into units.

Modules:

  • spacy –

    Class and functions for segmenting input text into units using a spaCy model.

  • utils –

    Other utilities for segmenting input text into units.

spacy

Class and functions for segmenting input text into units using a spaCy model.

SpaCySegmenter is the main class. The remaining functions implement an algorithm for segmentation into phrases.

Classes:

  • SpaCySegmenter –

    Class for segmenting input text into units using a spaCy model.

Functions:

  • append_or_segment_children –

    Append syntactic children of a node as phrases or further segment them.

  • append_or_segment_span –

    Append span to list of phrases or further segment span.

  • is_not_punct_space –

    Check whether each token of a span is not punctuation and not a space.

  • merge_nbor_of_singleton_phrase –

    Decide whether to merge neighbor of singleton (single-token) phrase.

  • merge_noun_chunk_phrases –

    Merge phrases that constitute a noun chunk.

  • merge_phrase_spans –

    Merge phrases within specified spans of phrases.

  • merge_singleton_phrases –

    Merge single-token phrases with their neighbors.

  • segment_into_phrases –

    Segment sentence (or span within sentence) into phrases.

  • sort_phrases –

    Sort phrases by their starting token index.

SpaCySegmenter
SpaCySegmenter(spacy_model)

Class for segmenting input text into units using a spaCy model.

Attributes:

  • model (Language) –

    spaCy model.

Initialize SpaCySegmenter object.

Parameters:

  • spacy_model (str) –

    Name of spaCy model.

Methods:

  • segment_units –

    (Further) Segment input text into units.

model instance-attribute
model = load(spacy_model)
segment_units
segment_units(input_text, ind_segment=True, unit_types='s', sent_idxs=None, segment_type='w', max_phrase_length=10)

(Further) Segment input text into units.

Parameters:

  • input_text (str or list[str]) –

    Input text as a single unit (if str) or existing sequence of units (list[str]).

  • ind_segment (bool or list[bool], default: True ) –

    Whether to segment entire input text or each existing unit. If bool, applies to all units. If list[bool], applies to each unit individually.

  • unit_types (str or list[str], default: 's' ) –

    Types of units in input_text: "p" for paragraph, "s" for sentence, "w" for word, "n" for not to be perturbed or segmented (fixed). If str, applies to all units in input_text, otherwise unit-specific.

  • sent_idxs (list[int] or None, default: None ) –

    Index of sentence (or larger unit) that contains each existing unit.

  • segment_type (str, default: 'w' ) –

    Type of units to segment into: "s" for sentences, "w" for words, "ph" for phrases.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • units ( list[str] ) –

    Resulting sequence of units.

  • unit_types ( list[str] ) –

    Types of units.

  • sent_idxs_new ( list[int] ) –

    Index of sentence (or larger unit) that contains each unit.
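
A minimal sketch, assuming the import path below and that the spaCy pipeline en_core_web_sm is installed:

```python
from icx360.utils.segmenters.spacy import SpaCySegmenter  # assumed path

segmenter = SpaCySegmenter("en_core_web_sm")

# First split the raw text into sentences...
units, unit_types, sent_idxs = segmenter.segment_units(
    "I like tea. It is hot.", segment_type="s",
)

# ...then further segment only the first sentence into phrases.
units, unit_types, sent_idxs = segmenter.segment_units(
    units, ind_segment=[True, False], unit_types=unit_types,
    sent_idxs=sent_idxs, segment_type="ph",
)
```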

append_or_segment_children
append_or_segment_children(children, phrases, phrase_types, doc, max_phrase_length=10)

Append syntactic children of a node as phrases or further segment them.

Parameters:

  • children (generator[Token]) –

    Generator of syntactic children.

  • phrases (list[Span]) –

    List of current phrases.

  • phrase_types (list[str]) –

    List of current phrase types.

  • doc (Doc) –

    spaCy Doc containing the sentence.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • phrases ( list[Span] ) –

    Updated list of phrases.

  • phrase_types ( list[str] ) –

    Updated list of phrase types.

  • need_sort ( bool ) –

    Flag to indicate whether phrases need sorting.

append_or_segment_span
append_or_segment_span(span, phrases, phrase_types, doc, max_phrase_length=10)

Append span to list of phrases or further segment span.

Parameters:

  • span (Span) –

    Span to be appended or further segmented.

  • phrases (list[Span]) –

    List of current phrases.

  • phrase_types (list[str]) –

    List of current phrase types.

  • doc (Doc) –

    spaCy Doc containing the sentence.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • phrases ( list[Span] ) –

    Updated list of phrases.

  • phrase_types ( list[str] ) –

    Updated list of phrase types.

is_not_punct_space
is_not_punct_space(span)

Checks whether each token of a span is not punctuation and not a space.

Returns:

  • ( list[bool] ) –

    A list of Booleans where each element is True iff the corresponding token is not punctuation and not a space.

merge_nbor_of_singleton_phrase
merge_nbor_of_singleton_phrase(nbor, singleton, offset, max_nbor_length)

Decide whether to merge neighbor of singleton (single-token) phrase.

Evaluates conditions to determine if a neighboring phrase should be merged with a singleton phrase.

Parameters:

  • nbor (Span) –

    Neighboring phrase.

  • singleton (Span) –

    Singleton phrase.

  • offset (int) –

    Absolute difference between indices of neighboring and singleton phrases.

  • max_nbor_length (int) –

    Maximum neighbor length for merging in terms of spaCy tokens.

Returns:

  • ret ( bool ) –

    Whether to merge neighbor.

merge_noun_chunk_phrases
merge_noun_chunk_phrases(phrases, phrase_types, noun_chunks, doc)

Merge phrases that constitute a noun chunk.

Parameters:

  • phrases (list[Span]) –

    List of phrases.

  • phrase_types (list[str]) –

    List of phrase types.

  • noun_chunks (generator[Span]) –

    Generator of noun chunks.

  • doc (Doc) –

    spaCy Doc containing the sentence.

Returns:

  • phrases_merged ( list[Span] ) –

    List of merged phrases.

  • phrase_types_merged ( list[str] ) –

    Types of merged phrases.

merge_phrase_spans
merge_phrase_spans(phrases, phrase_types, spans_merge, doc)

Merge phrases within specified spans of phrases.

Parameters:

  • phrases (list[Span]) –

    List of phrases.

  • phrase_types (list[str]) –

    List of phrase types.

  • spans_merge (list[tuple]) –

    List of phrase spans, each a 2-element tuple of a starting phrase index and an ending phrase index.

  • doc (Doc) –

    spaCy Doc containing the sentence.

Returns:

  • phrases_merged ( list[Span] ) –

    List of merged phrases.

  • phrase_types_merged ( list[str] ) –

    Types of merged phrases.

merge_singleton_phrases
merge_singleton_phrases(phrases, phrase_types, doc, max_phrase_length=10)

Merge single-token phrases with their neighbors.

Parameters:

  • phrases (list[Span]) –

    List of phrases.

  • phrase_types (list[str]) –

    List of phrase types.

  • doc (Doc) –

    spaCy Doc containing the sentence.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • phrases_merged ( list[Span] ) –

    List of merged phrases.

  • phrase_types_merged ( list[str] ) –

    Types of merged phrases.

segment_into_phrases
segment_into_phrases(sent, doc, max_phrase_length=10)

Segment sentence (or span within sentence) into phrases.

Parameters:

  • sent (Span) –

    Sentence or span to be segmented.

  • doc (Doc) –

    spaCy Doc containing the sentence.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • phrases ( list[Span] ) –

    List of segmented phrases.

  • phrase_types ( list[str] ) –

    Types of phrases (e.g., "ROOT", "non-leaf", spaCy dependency labels).

sort_phrases
sort_phrases(phrases, phrase_types)

Sort phrases by their starting token index.

Parameters:

  • phrases (list[Span]) –

    List of phrases.

  • phrase_types (list[str]) –

    List of phrase types.

Returns:

  • phrases ( list[Span] ) –

    Sorted list of phrases.

  • phrase_types ( list[str] ) –

    Types of sorted phrases.

utils

Other utilities for segmenting input text into units.

Functions:

  • exclude_non_alphanumeric –

    Exclude units without alphanumeric characters.

exclude_non_alphanumeric
exclude_non_alphanumeric(unit_types, units)

Exclude units without alphanumeric characters.

Modifies the unit_types list by setting the type of units without alphanumeric characters to "n".

Parameters:

  • unit_types (list[str]) –

    Types of units.

  • units (list[str]) –

    Sequence of units.

Returns:

  • unit_types ( list[str] ) –

    Updated types of units.
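
A small illustration of the re-typing, assuming the import path below:

```python
from icx360.utils.segmenters.utils import exclude_non_alphanumeric  # assumed path

units = ["Hello", ",", "world", "!!!"]
unit_types = ["w", "w", "w", "w"]

# Units with no alphanumeric characters are re-typed "n" (fixed, not perturbed).
unit_types = exclude_non_alphanumeric(unit_types, units)
# unit_types is now ["w", "n", "w", "n"]
```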

subset_utils

Utilities that deal with subsets of input units.

These utilities are used by MExGen C-LIME (icx360.algorithms.mexgen.clime) and L-SHAP (icx360.algorithms.mexgen.lshap).

Functions:

  • mask_subsets –

    Mask subsets of units with a fixed replacement string.

  • sample_subsets –

    Sample subsets of input units that can be replaced.

mask_subsets
mask_subsets(units, subsets_replace, replacement_str)

Mask subsets of units with a fixed replacement string.

Parameters:

  • units (List[str]) –

    Original sequence of units.

  • subsets_replace (List[List[int]]) –

    A list of subsets to replace, where each subset is a list of unit indices.

  • replacement_str (str) –

    String to replace units with (default "" for dropping units).

Returns:

  • input_masked ( List[List[str]] ) –

    A list of masked versions of units, where each masked version corresponds to a subset in subsets_replace.
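
A small illustration, assuming the import path below:

```python
from icx360.utils.subset_utils import mask_subsets  # assumed path

units = ["The", "cat", "sat", "there"]

# One masked copy of `units` per subset of indices.
masked = mask_subsets(units, [[1], [2, 3]], "")
# masked == [["The", "", "sat", "there"], ["The", "cat", "", ""]]
```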

sample_subsets
sample_subsets(idx_replace, max_units_replace, oversampling_factor=None, num_return_sequences=None, empty_subset=False, return_weights=False)

Sample subsets of input units that can be replaced.

Parameters:

  • idx_replace ( (num_replace,) np.ndarray ) –

    Indices of units that can be replaced.

  • max_units_replace (int) –

    Maximum number of units to replace at one time.

  • oversampling_factor (float or None, default: None ) –

    Ratio of number of perturbed inputs to be generated to number of units that can be replaced. Default None means no upper bound on this ratio.

  • num_return_sequences (int or None, default: None ) –

    Number of perturbed inputs to generate for each subset of units to replace.

  • empty_subset (bool, default: False ) –

    Whether to include the empty subset.

  • return_weights (bool, default: False ) –

    Whether to return weights associated with subsets.

Returns:

  • subsets ( list[list[int]] ) –

    A list of subsets, where each subset is a list of unit indices.

  • weights ( list[float] ) –

    Weights associated with subsets, only returned if return_weights==True.
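
A sketch of sampling subsets and turning them into perturbed inputs, assuming the import path below:

```python
import numpy as np

from icx360.utils.subset_utils import mask_subsets, sample_subsets  # assumed path

idx_replace = np.array([0, 1, 2, 3])  # indices of replaceable units

subsets, weights = sample_subsets(
    idx_replace,
    max_units_replace=2,
    oversampling_factor=2.0,
    empty_subset=True,
    return_weights=True,
)

# Feed the sampled subsets to mask_subsets to build perturbed inputs.
masked_inputs = mask_subsets(["The", "cat", "sat", "there"], subsets, "")
```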

toma

Model inference utilities that use the toma package to avoid running out of CUDA memory.

Functions:

  • toma_call –

    Call model using the toma package to adapt to CUDA memory constraints.

  • toma_generate –

    Generate outputs using the toma package to adapt to CUDA memory constraints.

  • toma_get_probs –

    Compute log probabilities of tokens in a given reference output using the toma package to adapt to CUDA memory.

toma_call
toma_call(start, end, model, input_dict, logits, output_hidden_states=False, hidden_states=None)

Call model using the toma package to adapt to CUDA memory constraints.

This function passes a batch of inputs to a transformers classification model. It produces logits and, optionally, hidden states, and stores them in pre-allocated Tensors.

Parameters:

  • start (int) –

    Index of the first input in the batch.

  • end (int) –

    Index of the last input in the batch.

  • model (transformers model) –

    Classification model.

  • input_dict (dict-like) –

    Dict-like object produced by a HuggingFace tokenizer, containing input data.

  • logits ( (num_inputs, num_labels) torch.Tensor ) –

    Pre-allocated Tensor to store logits.

  • output_hidden_states (bool, default: False ) –

    Whether to also output model's hidden states/representations.

  • hidden_states (tuple(Tensor) or None, default: None ) –

    If output_hidden_states == True, then for each layer of the model, a pre-allocated (num_inputs, input_length, hidden_dim) Tensor of hidden states/representations, to be populated by toma_call. Otherwise, None.

Returns:

  • None –

    This function modifies the provided logits Tensor in-place with predicted logits and, if requested, the hidden_states tuple with corresponding hidden states.
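
A sketch of one explicit call, assuming the import path below; within the library, the toma package invokes this function repeatedly with adaptively sized (start, end) ranges to fit CUDA memory:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from icx360.utils.toma import toma_call  # assumed path

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
classifier = AutoModelForSequenceClassification.from_pretrained(name)

texts = ["great movie", "terrible movie", "an average movie"]
input_dict = tokenizer(texts, padding=True, return_tensors="pt")

# Pre-allocate the logits Tensor that toma_call fills in-place.
logits = torch.empty(len(texts), classifier.config.num_labels)

toma_call(0, len(texts), classifier, input_dict, logits)
```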

toma_generate
toma_generate(start, end, model, input_dict, output_ids, output_hidden_states=False, hidden_states=None, **kwargs)

Generate outputs using the toma package to adapt to CUDA memory constraints.

This function passes a batch of inputs to a transformers generative model. It generates token IDs and, optionally, hidden states, and stores them in pre-allocated Tensors.

Parameters:

  • start (int) –

    Index of the first input in the batch.

  • end (int) –

    Index of the last input in the batch.

  • model (transformers model) –

    Generative model.

  • input_dict (dict-like) –

    Dict-like object produced by a HuggingFace tokenizer, containing input data.

  • output_ids ( (num_inputs, gen_start + max_new_tokens) torch.Tensor ) –

    Pre-allocated Tensor to store generated token IDs.

  • output_hidden_states (bool, default: False ) –

    Whether to also output model's hidden states/representations.

  • hidden_states (tuple(Tensor) or None, default: None ) –

    If output_hidden_states == True, then for each layer of the encoder, a pre-allocated (num_inputs, input_length, hidden_dim) Tensor of hidden states/representations, to be populated by toma_generate. Otherwise, None.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for the HuggingFace model.

Returns:

  • None –

    This function modifies the provided output_ids Tensor in-place with generated token IDs and, if requested, the hidden_states tuple with corresponding hidden states.

toma_get_probs
toma_get_probs(start, end, model, input_dict, ref_output, log_probs, output_hidden_states=False, hidden_states=None)

Compute log probabilities of tokens in a given reference output using the toma package to adapt to CUDA memory.

This function passes a batch of inputs to a transformers generative model. It computes log probabilities of the reference output tokens conditioned on these inputs and, optionally, hidden states, and stores them in pre-allocated Tensors.

Parameters:

  • start (int) –

    Index of the first input in the batch.

  • end (int) –

    Index of the last input in the batch.

  • model (transformers model) –

    Generative model.

  • input_dict (dict-like) –

    Dict-like object produced by a HuggingFace tokenizer, containing input data.

  • ref_output ( (1, num_output_tokens) torch.Tensor ) –

    Token IDs of reference output to compute log probabilities for.

  • log_probs ( (num_inputs, gen_length) torch.Tensor ) –

    Pre-allocated Tensor to store log probabilities.

  • output_hidden_states (bool, default: False ) –

    Whether to also output model's hidden states/representations.

  • hidden_states (tuple(Tensor) or None, default: None ) –

    If output_hidden_states == True, then for each layer of the model, a pre-allocated (num_inputs, input_length, hidden_dim) Tensor of hidden states/representations, to be populated by toma_get_probs. Otherwise, None.

Returns:

  • None –

    This function modifies the provided log_probs Tensor in-place with predicted log probabilities and, if requested, the hidden_states tuple with corresponding hidden states.