API reference

This is an automatically generated API reference of the ICX360 toolkit.

icx360

Modules:

  • algorithms –

    Module containing submodules for MExGen, CELL, and Token Highlighter explainers

  • metrics –

    Module containing metrics for explanations

  • utils –

    Module containing various utilities including model wrappers, infillers, scalarizers, and segmenters, among others

algorithms

Module containing submodules for MExGen, CELL, and Token Highlighter explainers

Modules:

  • cell –

    Module containing CELL and mCELL submodules

  • lbbe –

    File containing base class for local black box explainers

  • lwbe –

    File containing base class for local white box explainers

  • mexgen –

    Module containing submodules for MExGen C-LIME and MExGen L-SHAP explainers

  • token_highlighter –

    Module containing TokenHighlighter submodule (thllm)

cell

Module containing CELL and mCELL submodules

Modules:

  • CELL –

    File containing class CELL

  • mCELL –

    File containing class mCELL

CELL

File containing class CELL

CELL is an explainer class that contains function explain_instance that provides contrastive explanations of input instances. The algorithm for providing explanations is described as CELL in: CELL your Model: Contrastive Explanations for Large Language Models, Ronny Luss, Erik Miehling, Amit Dhurandhar. https://arxiv.org/abs/2406.11785

Classes:

  • CELL –

    Instances of CELL contain information about the LLM model being explained.

CELL
CELL(model, infiller='bart', num_return_sequences=1, scalarizer='shp', scalarizer_model_path=None, scalarizer_type='distance', generation=True, experiment_id='id', device=None)

Bases: LocalBBExplainer

Instances of CELL contain information about the LLM model being explained. These instances are used to explain LLM responses on input text using a budgeted algorithm with intelligent search strategy.

Attributes:

  • _model –

    model that we want to explain (based on icx360/utils/model_wrappers)

  • _infiller –

string naming the function used to take text containing a mask token and output text with the mask replaced

  • _num_return_sequences –

    integer number of sequences returned when doing generation for mask infilling

  • _scalarizer_name –

string of scalarizer to use to determine if a contrast is found (must be from ['shp', 'nli', 'bleu'])

  • _scalarizer_type –

    string specifying either 'distance' for explaining LLM generation using distances or 'classifier' for explaining a classifier

  • _scalarizer_func –

    function used to do scalarization from icx360/utils/scalarizers

  • _generation –

    boolean specifying whether the model being explained performs true generation (as opposed to having output==input for classification)

  • _device –

string detailing the device on which to perform all operations (must be from ['cpu', 'cuda', 'mps']); should be the same as the model being explained

Initialize contrastive explainer.

Parameters:

  • model –

    model that we want to explain (based on icx360/utils/model_wrappers)

  • infiller (str, default: 'bart' ) –

selects the function used to take text containing a mask token and output text with the mask replaced

  • num_return_sequences (int, default: 1 ) –

    number of sequences returned when doing generation for mask infilling

  • scalarizer (str, default: 'shp' ) –

    select which scalarizer to use to determine if a contrast is found (must be from ['shp', 'nli', 'bleu'])

  • scalarizer_model_path (str, default: None ) –

    allow user to pass a model path for scalarizers (e.g., choose 'stanfordnlp/SteamSHP-flan-t5-xl' instead of default 'stanfordnlp/SteamSHP-flan-t5-large')

  • scalarizer_type (str, default: 'distance' ) –

    'distance' for explaining LLM generation using distances, 'classifier' for explaining a classifier

  • generation (bool, default: True ) –

    the model being explained performs true generation (as opposed to having output==input)

  • experiment_id (str, default: 'id' ) –

    passed to evaluate.load for certain scalarizers. This is used if several distributed evaluations share the same file system.

  • device (str, default: None ) –

device on which to perform all operations (must be from ['cpu', 'cuda', 'mps']); should be the same as the model being explained

Methods:

  • explain_instance –

    Provide explanations of LLM applied to prompt input_text.

  • sample –

    Generate sample prompts based on an input prompt

  • set_params –

    Set parameters for the explainer.

  • splitTextByK –

    Split text into words.

explain_instance
explain_instance(input_text, epsilon_contrastive=0.5, split_k=1, budget=100, radius=5, alpha=0.5, info=True, ir=False, input_text_list=[''], prompt_format='Context: $$input0$$ \n\nQuestion: $$input1$$ \n\nAnswer: ', multiple_inputs=False, input_inds_modify=[0], model_params={})

Provide explanations of LLM applied to prompt input_text.

Provide a contrastive explanation by changing prompt input_text such that the new prompt generates a response that is preferred much less, by a specified amount, as a response to input_text. The preference metric can be changed based on user needs.

Parameters:

  • input_text (str) –

    input prompt to model that we want to explain

  • epsilon_contrastive (float, default: 0.5 ) –

    amount of change in response to deem a contrastive explanation

  • split_k (int, default: 1 ) –

number of consecutive words grouped into each unit that is masked together

  • budget (int, default: 100 ) –

    maximum number of queries allowed from infilling model

  • radius (int, default: 5 ) –

    radius for sampling near a previously modified token

  • alpha (float, default: 0.5 ) –

tradeoff between exploration and exploitation: lower alpha means more exploration, higher alpha means more exploitation

  • info (bool, default: True ) –

    True to print output information, False otherwise

  • ir (bool, default: False ) –

    True to perform input reduction, i.e., remove tokens that cause minimal change to the response until a large change occurs

  • input_text_list (str list, default: [''] ) –

    if multiple_inputs==True, then use input_text_list to feed additional text segments

  • prompt_format (str, default: 'Context: $$input0$$ \n\nQuestion: $$input1$$ \n\nAnswer: ' ) –

    format for prompt to create from input_text and input_text_list. Default is question/answering for google/flan-t5-large

  • multiple_inputs (bool, default: False ) –

    True if example requires multiple inputs and a format, i.e., uses input_text and input_text_list, False if just input_text for prompt

  • input_inds_modify (int list, default: [0] ) –

    list of which input_text segments to modify for contrastive example when multiple_inputs==True

  • model_params (dict, default: {} ) –

    additional keyword arguments for model generation (self._model.generate())

Returns:

  • result ( dict ) –

    contains various pieces of contrastive explanation including contrastive prompt, response to the contrastive prompt, response to the input prompt, and which words were modified
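A minimal usage sketch follows (the import path and the pre-wrapped model are assumptions based on the module layout documented above):

    # Hedged sketch: import path assumed from the module layout above.
    from icx360.algorithms.cell.CELL import CELL

    # `wrapped_model` is assumed to be an LLM wrapped per icx360/utils/model_wrappers.
    explainer = CELL(wrapped_model, infiller="bart", scalarizer="shp", device="cuda")

    result = explainer.explain_instance(
        "Why is the sky blue?",
        epsilon_contrastive=0.5,  # change in preference that counts as a contrast
        budget=50,                # at most 50 infilling-model queries
        radius=5,                 # sample near previously modified tokens
    )
    print(result)  # contrastive prompt, both responses, and the words modified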

sample
sample(input_sample, curr_position, radius, num_samples, model_params={})

Generate sample prompts based on an input prompt

Parameters:

  • input_sample (dict) –

    contains information about a prompt including text and how it differs from the input prompt to the explainer

  • curr_position (int) –

position of the token around which to generate samples, within the given radius

  • radius (int) –

    radius for sampling near a previously modified token

  • num_samples (int) –

    number of samples to generate

  • model_params (dict, default: {} ) –

    additional keyword arguments for model generation (self._model.generate())

Returns:

  • samples_list ( dict list ) –

    list of samples which are dictionaries with same information as input_sample

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

splitTextByK
splitTextByK(str, k)

Split text into words.

Parameters:

  • str (str) –

    string to be split

  • k (int) –

    number of consecutive words to keep together

Returns:

  • grouped_words ( str list ) –

list of word groups which, when concatenated, reproduce the input str
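For example, continuing the sketch above (the exact grouping is a hedged illustration of the described behavior):

    explainer.splitTextByK("the quick brown fox jumps", 2)
    # expected grouping: ["the quick", "brown fox", "jumps"]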

mCELL

File containing class mCELL

mCELL is an explainer class that contains function explain_instance that provides contrastive explanations of input instances. The algorithm for providing explanations is described as m-Cell in: CELL your Model: Contrastive Explanations for Large Language Models, Ronny Luss, Erik Miehling, Amit Dhurandhar. https://arxiv.org/abs/2406.11785

Classes:

  • mCELL –

    mCELL Explainer object.

mCELL
mCELL(model, infiller='bart', num_return_sequences=1, scalarizer='shp', scalarizer_model_path=None, scalarizer_type='distance', generation=True, experiment_id='id', device=None)

Bases: LocalBBExplainer

mCELL Explainer object.

Instances of mCELL contain information about the LLM model being explained. These instances are used to explain LLM responses on input text using a myopic algorithm.

Attributes:

  • _model –

    model that we want to explain (based on icx360/utils/model_wrappers)

  • _infiller –

string naming the function used to take text containing a mask token and output text with the mask replaced

  • _num_return_sequences –

    integer number of sequences returned when doing generation for mask infilling

  • _scalarizer_name –

string of scalarizer to use to determine if a contrast is found (must be from ['shp', 'nli', 'bleu'])

  • _scalarizer_type –

    string specifying either 'distance' for explaining LLM generation using distances or 'classifier' for explaining a classifier

  • _scalarizer_func –

    function used to do scalarization from icx360/utils/scalarizers

  • _generation –

    boolean specifying whether the model being explained performs true generation (as opposed to having output==input for classification)

  • _device –

string detailing the device on which to perform all operations (must be from ['cpu', 'cuda', 'mps']); should be the same as the model being explained

Initialize contrastive explainer.

Parameters:

  • model –

    model that we want to explain (based on icx360/utils/model_wrappers)

  • infiller (str, default: 'bart' ) –

selects the function used to take text containing a mask token and output text with the mask replaced

  • num_return_sequences (int, default: 1 ) –

    number of sequences returned when doing generation for mask infilling

  • scalarizer (str, default: 'shp' ) –

    select which scalarizer to use to determine if a contrast is found (must be from ['shp', 'nli', 'bleu', 'implicit_hate', 'stigma'])

  • scalarizer_model_path (str, default: None ) –

allow user to pass a model path for scalarizers (e.g., choose 'stanfordnlp/SteamSHP-flan-t5-xl' instead of default 'stanfordnlp/SteamSHP-flan-t5-large')

  • scalarizer_type (str, default: 'distance' ) –

    'distance' for explaining LLM generation using distances, 'classifier' for explaining a classifier

  • generation (bool, default: True ) –

    the model being explained performs true generation (as opposed to having output==input)

  • experiment_id (str, default: 'id' ) –

    passed to evaluate.load for certain scalarizers. This is used if several distributed evaluations share the same file system.

  • device (str, default: None ) –

device on which to perform all operations (must be from ['cpu', 'cuda', 'mps']); should be the same as the model being explained

Methods:

  • explain_instance –

Provide explanations of a large language model applied to prompt input_text

  • set_params –

    Set parameters for the explainer.

  • splitTextByK –

    Split text into words.

explain_instance
explain_instance(input_text, epsilon_contrastive=0.5, epsilon_iter=0.001, split_k=1, no_change_max_iters=3, info=True, ir=False, model_params={})

Provide explanations of a large language model applied to prompt input_text

Provide a contrastive explanation by changing prompt input_text such that the new prompt generates a response that is preferred much less, by a specified amount, as a response to input_text. The preference metric can be changed based on user needs.

Parameters:

  • input_text (str) –

    input prompt to model that we want to explain

  • epsilon_contrastive (float, default: 0.5 ) –

    amount of change in response to deem a contrastive explanation

  • epsilon_iter (float, default: 0.001 ) –

    minimum amount of change between iterations to continue search

  • split_k (int, default: 1 ) –

number of consecutive words grouped into each unit that is masked together

  • info (bool, default: True ) –

    True to print output information, False otherwise

  • ir (bool, default: False ) –

    True to perform input reduction, i.e., remove tokens that cause minimal change to the response until a large change occurs

Returns:

  • result ( dict ) –

    contains various pieces of contrastive explanation including contrastive prompt, response to the contrastive prompt, response to the input prompt, and which words were modified

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

splitTextByK
splitTextByK(str, k)

Split text into words.

Parameters:

  • str (str) –

    string to be split

  • k (int) –

    number of consecutive words to keep together

Returns:

  • grouped_words ( str list ) –

list of word groups which, when concatenated, reproduce the input str

lbbe

File containing base class for local black box explainers

Attributes:

  • ABC –

    Ensure compatibility of Abstract Base Class with Python versions

Classes:

  • LocalBBExplainer –

    LocalBBExplainer is the base class for local post-hoc black-box explainers (LBBE).

ABC module-attribute
ABC = ABC
LocalBBExplainer
LocalBBExplainer(*argv, **kwargs)

Bases: ABC

LocalBBExplainer is the base class for local post-hoc black-box explainers (LBBE). Such explainers are model agnostic and generally require access to the model's predict function alone. Examples include LIME [#1] and SHAP [#2].

References

.. [#1] “Why Should I Trust You?” Explaining the Predictions of Any Classifier. Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin. ACM SIGKDD 2016. https://arxiv.org/abs/1602.04938
.. [#2] A Unified Approach to Interpreting Model Predictions. Scott M. Lundberg and Su-In Lee. NIPS 2017. https://arxiv.org/abs/1705.07874

Initialize a LocalBBExplainer object.

Methods:

explain_instance abstractmethod
explain_instance(*argv, **kwargs)

Explain an input instance x.

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

lwbe

File containing base class for local white box explainers

Attributes:

  • ABC –

    Ensure compatibility of Abstract Base Class with Python versions

Classes:

  • LocalWBExplainer –

    LocalWBExplainer is the base class for local post-hoc white box explainers (LWBE).

ABC module-attribute
ABC = ABC
LocalWBExplainer
LocalWBExplainer(*argv, **kwargs)

Bases: ABC

LocalWBExplainer is the base class for local post-hoc white box explainers (LWBE). Such explainers generally require access to the model's internals beyond its predict function. Examples include the contrastive explanation method [#1] and layer-wise relevance propagation [#2].

References

.. [#1] Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives. Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan Shanmugam, Payel Das. NIPS 2018. https://arxiv.org/abs/1802.07623
.. [#2] http://www.heatmapping.org/

Initialize a LocalWBExplainer object.

Methods:

explain_instance abstractmethod
explain_instance(*argv, **kwargs)

Explain an input instance x.

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

mexgen

Module containing submodules for MExGen C-LIME and MExGen L-SHAP explainers

Modules:

  • clime –

    Class and supporting functions for MExGen C-LIME explainer.

  • lshap –

    Class and supporting functions for MExGen L-SHAP explainer.

clime

Class and supporting functions for MExGen C-LIME explainer.

The MExGen framework and C-LIME algorithm are described in

Multi-Level Explanations for Generative Language Models. Lucas Monteiro Paes and Dennis Wei et al. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025). https://arxiv.org/abs/2403.14459

Classes:

  • CLIME –

    MExGen C-LIME explainer

Functions:

  • compute_linear_model_features –

    Compute features used by explanatory linear model.

  • fit_linear_model –

    Fit explanatory linear model.

CLIME
CLIME(model, segmenter='en_core_web_trf', scalarizer='prob', **kwargs)

Bases: LocalBBExplainer

MExGen C-LIME explainer

Attributes:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • segmenter (SpaCySegmenter) –

    Object for segmenting input text into units using a spaCy model.

  • scalarized_model (Scalarizer) –

    "Scalarized model" that further wraps model with a method for computing scalar values based on the model's inputs or outputs.

Initialize MExGen C-LIME explainer.

Parameters:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • segmenter (str, default: 'en_core_web_trf' ) –

    Name of spaCy model to use in segmenter (icx360.utils.segmenters.SpaCySegmenter).

  • scalarizer (str, default: 'prob' ) –

    Type of scalarizer to use. "prob": probability of generating original output conditioned on perturbed inputs (instantiates an icx360.utils.scalarizers.ProbScalarizedModel). "text": similarity scores between original output and perturbed outputs (instantiates an icx360.utils.scalarizers.TextScalarizedModel).

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for initializing scalarizer.

Raises:

  • ValueError –

    If scalarizer is not "prob" or "text".

Methods:

  • explain_instance –

    Explain model output by attributing it to parts of the input text.

  • set_params –

    Set parameters for the explainer.

model instance-attribute
model = model
scalarized_model instance-attribute
scalarized_model = ProbScalarizedModel(model)
segmenter instance-attribute
segmenter = SpaCySegmenter(segmenter)
explain_instance
explain_instance(input_orig, unit_types='p', ind_segment=True, segment_type='s', max_phrase_length=10, model_params={}, scalarize_params={}, oversampling_factor=10, max_units_replace=2, empty_subset=True, replacement_str='', num_nonzeros=None, debias=True)

Explain model output by attributing it to parts of the input text.

Uses an algorithm called C-LIME (a variant of LIME) to fit a local linear approximation to the model and compute attribution scores.

Parameters:

  • input_orig (str or List[str]) –

    [input] Input text as a single unit (if str) or segmented sequence of units (List[str]).

  • unit_types (str or List[str], default: 'p' ) –

    [input] Types of units in input_orig. "p" for paragraph, "s" for sentence, "w" for word, "n" for not to be perturbed/attributed to. If str, applies to all units in input_orig, otherwise unit-specific.

  • ind_segment (bool or List[bool], default: True ) –

    [segmentation] Whether to segment input text. If bool, applies to all units; if List[bool], applies to each unit individually.

  • segment_type (str, default: 's' ) –

    [segmentation] Type of units to segment into: "s" for sentences, "w" for words, "ph" for phrases.

  • max_phrase_length (int, default: 10 ) –

    [segmentation] Maximum phrase length in terms of spaCy tokens (default 10).

  • model_params (dict, default: {} ) –

    Additional keyword arguments for model generation (for the self.model.generate() method).

  • scalarize_params (dict, default: {} ) –

    Additional keyword arguments for computing scalar outputs (for the self.scalarized_model.scalarize_output() method).

  • oversampling_factor (float, default: 10 ) –

    [perturbation] Ratio of number of perturbed inputs to be generated to number of units that can be perturbed.

  • max_units_replace (int, default: 2 ) –

    [perturbation] Maximum number of units to perturb at one time (default 2).

  • empty_subset (bool, default: True ) –

    [perturbation] Whether to include empty subset of units to perturb (default True).

  • replacement_str (str, default: '' ) –

    [perturbation] String to replace units with (default "" for dropping units).

  • num_nonzeros (int or None, default: None ) –

    [linear model] Number of non-zero coefficients in linear model (default None means dense model).

  • debias (bool, default: True ) –

    [linear model] Refit linear model with no penalty after selecting features (default True).

Returns:

  • output_dict ( dict ) –

    Dictionary with the following items: "attributions" (dict): Dictionary with attribution scores, corresponding input units, and unit types. "output_orig" (icx360.utils.model_wrappers.GeneratedOutput): Output object generated from original input. "intercept" (float or dict[float]): Intercept(s) of linear model.

    Items in "attributions" dictionary: "units" (List[str]): input_orig segmented into units if not already, otherwise same as original. "unit_types" (List[str]): Types of units. score_label ((num_units,) np.ndarray): One or more sets of attribution scores (labelled by the type of scalarizer).
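A minimal usage sketch (import path per the module layout above; the "prob" score label is an assumption based on the scalarizer name, and `wrapped_model` is assumed to be wrapped per icx360.utils.model_wrappers):

    from icx360.algorithms.mexgen.clime import CLIME

    explainer = CLIME(wrapped_model, segmenter="en_core_web_trf", scalarizer="prob")
    output = explainer.explain_instance(
        "The quick brown fox jumps over the lazy dog.",
        segment_type="w",        # segment the input into words
        max_units_replace=2,     # perturb at most two units at a time
        num_nonzeros=None,       # None keeps the linear model dense
    )
    attributions = output["attributions"]
    for unit, score in zip(attributions["units"], attributions["prob"]):
        print(f"{score:+.3f}  {unit}")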

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

compute_linear_model_features
compute_linear_model_features(subsets_replace, num_units)

Compute features used by explanatory linear model.

This function generates a feature matrix for a linear model that explains the impact of perturbing specific input units.

Parameters:

  • subsets_replace (List[List[int]]) –

    A list of subsets, where each subset is a list of indices corresponding to the units that have been replaced.

  • num_units (int) –

    Total number of units.

Returns:

  • features ( (num_perturb, num_units) np.ndarray ) –

    Matrix of feature values, equal to 1 if the unit is part of the perturbed subset, and 0 otherwise.
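The construction can be sketched in a few lines of NumPy (a re-implementation of the described behavior, not the library's code):

    import numpy as np

    def features_sketch(subsets_replace, num_units):
        # One row per perturbed input; a 1 marks a unit that was replaced.
        features = np.zeros((len(subsets_replace), num_units))
        for row, subset in enumerate(subsets_replace):
            features[row, subset] = 1
        return features

    features_sketch([[0, 2], [], [1]], num_units=3)
    # array([[1., 0., 1.],
    #        [0., 0., 0.],
    #        [0., 1., 0.]])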

fit_linear_model
fit_linear_model(features, target, sample_weights, num_nonzeros, debias)

Fit explanatory linear model.

Parameters:

  • features ((num_perturb, num_units) np.ndarray) –

    Feature values.

  • target ((num_perturb,) np.ndarray) –

    Target values to predict.

  • sample_weights ((num_perturb,) np.ndarray) –

    Sample weights.

  • num_nonzeros (int or None) –

    Number of non-zero coefficients desired in linear model, None means dense model.

  • debias (bool) –

    Refit linear model with no penalty after selecting features.

Returns:

  • coef ( (num_units,) np.ndarray ) –

    Coefficients of linear model.

  • intercept ( float ) –

    Intercept of linear model.

  • num_nonzeros ( int ) –

    Actual number of non-zero coefficients.

lshap

Class and supporting functions for MExGen L-SHAP explainer.

The MExGen framework and L-SHAP algorithm are described in

Multi-Level Explanations for Generative Language Models. Lucas Monteiro Paes and Dennis Wei et al. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025). https://arxiv.org/abs/2403.14459

Classes:

  • LSHAP –

    MExGen L-SHAP explainer

Functions:

  • adapt_replacement_set –

    Adapt set of units that can be replaced to the unit of interest.

  • get_normalization_constants –

    Computes normalization constants for Shapley value calculation.

LSHAP
LSHAP(model, segmenter='en_core_web_trf', scalarizer='prob', **kwargs)

Bases: LocalBBExplainer

MExGen L-SHAP explainer

Attributes:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • segmenter (SpaCySegmenter) –

    Object for segmenting input text into units using a spaCy model.

  • scalarized_model (Scalarizer) –

    "Scalarized model" that further wraps model with a method for computing scalar values based on the model's inputs or outputs.

Initialize MExGen L-SHAP explainer.

Parameters:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • segmenter (str, default: 'en_core_web_trf' ) –

    Name of spaCy model to use in segmenter (icx360.utils.segmenters.SpaCySegmenter).

  • scalarizer (str, default: 'prob' ) –

    Type of scalarizer to use. "prob": probability of generating original output conditioned on perturbed inputs (instantiates an icx360.utils.scalarizers.ProbScalarizedModel). "text": similarity scores between original output and perturbed outputs (instantiates an icx360.utils.scalarizers.TextScalarizedModel).

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for initializing scalarizer.

Raises:

  • ValueError –

    If scalarizer is not "prob" or "text".

Methods:

  • explain_instance –

    Explain model output by attributing it to parts of the input text.

  • set_params –

    Set parameters for the explainer.

model instance-attribute
model = model
scalarized_model instance-attribute
scalarized_model = ProbScalarizedModel(model)
segmenter instance-attribute
segmenter = SpaCySegmenter(segmenter)
explain_instance
explain_instance(input_orig, unit_types='p', ind_interest=None, ind_segment=True, segment_type='s', max_phrase_length=10, model_params={}, scalarize_params={}, num_neighbors=2, max_units_replace=2, replacement_str='')

Explain model output by attributing it to parts of the input text.

Uses an algorithm called L-SHAP (a variant of SHAP) that computes approximate Shapley values as attribution scores.

Parameters:

  • input_orig (str or List[str]) –

    [input] Input text as a single unit (if str) or segmented sequence of units (List[str]).

  • unit_types (str or List[str], default: 'p' ) –

    [input] Types of units in input_orig. "p" for paragraph, "s" for sentence, "w" for word, "n" for not to be perturbed/attributed to. If str, applies to all units in input_orig, otherwise unit-specific.

  • ind_interest (bool or List[bool] or None, default: None ) –

    [input] Indicator of units to attribute to ("of interest"). Default None means np.array(unit_types) != "n".

  • ind_segment (bool or List[bool], default: True ) –

    [segmentation] Whether to segment input text. If bool, applies to all units; if List[bool], applies to each unit individually.

  • segment_type (str, default: 's' ) –

    [segmentation] Type of units to segment into: "s" for sentences, "w" for words, "ph" for phrases.

  • max_phrase_length (int, default: 10 ) –

    [segmentation] Maximum phrase length in terms of spaCy tokens (default 10).

  • model_params (dict, default: {} ) –

    Additional keyword arguments for model generation (for the self.model.generate() method).

  • scalarize_params (dict, default: {} ) –

    Additional keyword arguments for computing scalar outputs (for the self.scalarized_model.scalarize_output() method).

  • num_neighbors (int, default: 2 ) –

[perturbation] Number of neighbors on either side of the unit of interest that can be perturbed. Default 2 means two neighbors to the left and two neighbors to the right.

  • max_units_replace (int, default: 2 ) –

    [perturbation] Maximum number of units to perturb at one time (default 2).

  • replacement_str (str, default: '' ) –

    [perturbation] String to replace units with (default "" for dropping units).

Returns:

  • output_dict ( dict ) –

    Dictionary with the following items: "attributions" (dict): Dictionary with attribution scores, corresponding input units, and unit types. "output_orig" (icx360.utils.model_wrappers.GeneratedOutput): Output object generated from original input.

    Items in "attributions" dictionary: "units" (List[str]): input_orig segmented into units if not already, otherwise same as original. "unit_types" (List[str]): Types of units. score_label ((num_units,) np.ndarray): One or more sets of attribution scores (labelled by the type of scalarizer).
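A minimal usage sketch, analogous to the C-LIME example but exploiting L-SHAP's locality (import path assumed from the module layout above):

    from icx360.algorithms.mexgen.lshap import LSHAP

    explainer = LSHAP(wrapped_model, scalarizer="prob")
    output = explainer.explain_instance(
        ["Answer based on the context.", "The sky is blue.", "Why is it blue?"],
        unit_types=["n", "s", "s"],  # do not perturb or attribute to the instruction
        ind_segment=False,           # input is already segmented
        num_neighbors=2,             # perturb within 2 units on either side
    )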

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

adapt_replacement_set
adapt_replacement_set(idx_replace, idx_interest, num_neighbors)

Adapt set of units that can be replaced to the unit of interest.

This function modifies the indices of units that can be replaced to exclude the unit of interest and include neighbors within a specified range on either side.

Parameters:

  • idx_replace (np.ndarray of dtype int) –

    Indices of units that can be replaced.

  • idx_interest (int) –

    Index of the unit of interest.

  • num_neighbors (int) –

    Number of neighbors on either side of the unit of interest to include.

Returns:

  • idx_replace_adapted ( np.ndarray of dtype int ) –

    Adapted version of idx_replace, excluding the unit of interest and including neighbors.
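The adaptation amounts to a window-and-exclude operation, sketched here on toy indices (a re-implementation of the described behavior, not the library's code):

    import numpy as np

    idx_replace, idx_interest, num_neighbors = np.arange(8), 4, 2
    keep = (np.abs(idx_replace - idx_interest) <= num_neighbors) \
           & (idx_replace != idx_interest)
    idx_replace_adapted = idx_replace[keep]
    # array([2, 3, 5, 6])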

get_normalization_constants
get_normalization_constants(num_can_replace, max_units_replace)

Computes normalization constants for Shapley value calculation.

Parameters:

  • num_can_replace (int) –

    The total number of units that can be replaced.

  • max_units_replace (int) –

    The maximum number of units that can be replaced at one time.

Returns:

  • normalization ( ndarray ) –

    An array of normalization constants.

token_highlighter

Module containing TokenHighlighter submodule (thllm)

Modules:

  • th_llm –

Class for TokenHighlighter explainer (TH-LLM).

th_llm

Class for TokenHighlighter explainer (TH-LLM). Interprets LLMs based on importance analysis of input text units.

Classes:

  • TokenHighlighter –

    Class for TokenHighlighter explainer (TH-LLM).

TokenHighlighter
TokenHighlighter(model, tokenizer, segmenter, **kwargs)

Bases: LocalWBExplainer

Class for TokenHighlighter explainer (TH-LLM). Interprets LLMs based on importance analysis of input text units.

Initialize the TH-LLM explainer.

Parameters:

  • model –

    The large language model object.

  • tokenizer –

    The tokenizer object.

  • segmenter –

    The segmenter object.

  • affirmation –

    The affirmation sentence template.

  • pooling –

    The aggregation method ("norm_mean", "mean_norm", or "matrix").
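A minimal initialization sketch (import path assumed from the module layout above; `pooling` is passed as a keyword argument per the attribute listing below):

    from icx360.algorithms.token_highlighter.th_llm import TokenHighlighter

    # `model` and `tokenizer` are assumed to be HuggingFace objects.
    th = TokenHighlighter(model, tokenizer, segmenter="en_core_web_trf",
                          pooling="mean_norm")
    attributions = th.explain_instance(
        "Tell me a joke about cats.",
        unit_types="w", ind_segment=True, segment_type="w",
    )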

Methods:

Attributes:

m instance-attribute
m = model
pooling instance-attribute
pooling: str = get('pooling', 'mean_norm')
segmenter instance-attribute
segmenter = SpaCySegmenter(segmenter)
tok instance-attribute
tok = tokenizer
token_ids instance-attribute
token_ids = _get_token_ids(prefix, infix, affirmation, suffix)
explain_instance
explain_instance(input_orig, unit_types, ind_segment, segment_type, **kwargs)

Compute importance scores for each text unit.

Parameters:

  • input_orig (str) –

    Original input text.

  • unit_types (Union[str, List[str]]) –

    Type(s) of each text unit.

  • ind_segment (Union[bool, List[bool]]) –

    Whether to segment.

  • segment_type (str) –

    Type of segmentation to apply.

  • max_phrase_length (int) –

    Max length allowed for a phrase.

Returns:

  • Dict[str, Any] –

    Attribution information dictionary.

explain_instance_matrix
explain_instance_matrix(units: List[str]) -> Tuple[List[str], List[float]]

Use the Frobenius norm of the token gradient matrix as the importance score for each unit.

Parameters:

  • units (List[str]) –

    A list of text units (e.g., phrases or words) that form the prompt.

Returns:

  • Tuple[List[str], List[float]] –

Tuple[List[str], List[float]]: the list of units and the unit scores based on Frobenius norms of token gradients.

explain_instance_mean_norm
explain_instance_mean_norm(units: List[str]) -> Tuple[List[str], List[float]]

Use the average of the L2 norms of token gradients as the importance score for each unit.

Parameters:

  • units (List[str]) –

    A list of text units (e.g., phrases or words) that form the prompt.

Returns:

  • Tuple[List[str], List[float]] –

Tuple[List[str], List[float]]: the list of units and the unit scores based on the mean_norm method.

explain_instance_norm_mean
explain_instance_norm_mean(units: List[str]) -> Tuple[List[str], List[float]]

Use the L2 norm of the average of token gradients as the importance score for each unit.

Parameters:

  • units (List[str]) –

    A list of text units (e.g., phrases or words) that form the prompt.

Returns:

  • Tuple[List[str], List[float]] –

Tuple[List[str], List[float]]: the list of units and the unit scores based on the norm_mean method.

set_params
set_params(*argv, **kwargs)

Set parameters for the explainer.

metrics

Module containing metrics for explanations

Modules:

  • perturb_curve –

    Perturbation curve evaluator for measuring the fidelity of input attributions to the explained model.

perturb_curve

Perturbation curve evaluator for measuring the fidelity of input attributions to the explained model.

The PerturbCurveEvaluator class evaluates perturbation curves for input attribution scores produced by icx360.algorithms.mexgen.CLIME.explain_instance() or icx360.algorithms.mexgen.LSHAP.explain_instance(). It thus evaluates the fidelity of these attribution scores to the explained model.

Classes:

  • PerturbCurveEvaluator –

    Perturbation curve evaluator for measuring the fidelity of input attributions to the explained model.

PerturbCurveEvaluator
PerturbCurveEvaluator(model, scalarizer='prob', **kwargs)

Perturbation curve evaluator for measuring the fidelity of input attributions to the explained model.

Attributes:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • scalarized_model (Scalarizer) –

    "Scalarized model" that further wraps model with a method for computing scalar values based on the model's inputs or outputs.

Initialize perturbation curve evaluator.

Parameters:

  • model (Model) –

    Model to explain, wrapped in an icx360.utils.model_wrappers.Model object.

  • scalarizer (str, default: 'prob' ) –

    Type of scalarizer to use. "prob": probability of generating original output conditioned on perturbed inputs (instantiates an icx360.utils.scalarizers.ProbScalarizedModel). "text": similarity scores between original output and perturbed outputs (instantiates an icx360.utils.scalarizers.TextScalarizedModel).

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for initializing scalarizer.

Raises:

  • ValueError –

    If scalarizer is not "prob" or "text".

Methods:

model instance-attribute
model = model
scalarized_model instance-attribute
scalarized_model = ProbScalarizedModel(model)
eval_perturb_curve
eval_perturb_curve(explainer_dict, score_label, token_frac=False, max_frac_perturb=0.5, replacement_str='', model_params={}, scalarize_params={})

Evaluate perturbation curve for given input attributions.

This method evaluates the perturbation curve for a set of attribution scores by perturbing units in decreasing order of their attribution scores.

Parameters:

  • explainer_dict (dict) –

    Attribution dictionary as produced by icx360.algorithms.mexgen.CLIME.explain_instance() or icx360.algorithms.mexgen.LSHAP.explain_instance().

  • score_label (str) –

    Label of the attribution score to use for ranking units.

  • token_frac (bool, default: False ) –

    Whether to consider the number of tokens in each unit when ranking and perturbing units. Defaults to False.

  • max_frac_perturb (float, default: 0.5 ) –

    Maximum fraction of units or tokens to perturb. Defaults to 0.5.

  • replacement_str (str, default: '' ) –

    String to replace perturbed units with. Defaults to "" for dropping units.

  • model_params (dict, default: {} ) –

    Additional keyword arguments for model generation (for the self.model.generate() method).

  • scalarize_params (dict, default: {} ) –

    Additional keyword arguments for computing scalar outputs (for the self.scalarized_model.scalarize_output() method).

Returns:

  • output_perturbed ( dict ) –

    Dictionary with the following items: "frac" (torch.Tensor): Fractions of units or tokens perturbed. score_label (torch.Tensor): One or more Tensors of scalarized output values corresponding to the fractions in the "frac" Tensor. score_label labels each Tensor with the type of scalarizer.

Raises:

  • ValueError –

    If token_frac is True and model's tokenizer is not available.
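A minimal evaluation sketch, continuing from the C-LIME example earlier (module path and the "prob" score label are assumptions):

    from icx360.metrics.perturb_curve import PerturbCurveEvaluator

    evaluator = PerturbCurveEvaluator(wrapped_model, scalarizer="prob")
    curve = evaluator.eval_perturb_curve(
        output["attributions"],   # attribution dict from CLIME/LSHAP explain_instance
        score_label="prob",
        max_frac_perturb=0.5,
    )
    # curve["frac"]: fractions perturbed; curve["prob"]: scalarized outputs.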

utils

Module containing various utilities including model wrappers, infillers, scalarizers, and segmenters, among others

Modules:

  • coloring_utils –

    Utilities for coloring and displaying units of text.

  • general_utils –

    File containing general utility functions

  • infillers –
  • model_wrappers –

    Module containing wrappers for different types of models (used by MExGen and CELL).

  • scalarizers –

    Module containing scalarizers, which compute scalar output values based on the outputs or inputs of an LLM.

  • segmenters –

    Module containing utilities for segmenting input text into units.

  • subset_utils –

    Utilities that deal with subsets of input units.

  • toma –

    Model inference utilities that use the toma package to avoid running out of CUDA memory.

coloring_utils

Utilities for coloring and displaying units of text.

Functions:

Attributes:

COLOR_LIST_IBM_30 module-attribute
COLOR_LIST_IBM_30 = ['#a6c8ff', '#c6c6c6', '#ffb3b8']
COLOR_LIST_IBM_40 module-attribute
COLOR_LIST_IBM_40 = ['#78a9ff', '#c6c6c6', '#ff8389']
color_units
color_units(units, scores, norm_factor=None, scale_sqrt=True, color_list=COLOR_LIST_IBM_40, show=True)

Color units of text according to scores and display.

Parameters:

  • units ((num_units,) np.ndarray) –

    Units of text.

  • scores ((num_units,) np.ndarray) –

    Scores corresponding to units.

  • norm_factor (float or None, default: None ) –

    Factor to divide scores by to normalize them. None (default) means np.abs(scores).max().

  • scale_sqrt (bool, default: True ) –

    Whether to apply square root to magnitude of score

  • color_list (List[str], default: COLOR_LIST_IBM_40 ) –

    List of colors for matplotlib.colors.LinearSegmentedColormap

  • show (bool, default: True ) –

    Show on screen if True, otherwise return list of HTML strings.

Returns:

  • colored_units ( List[str] or None ) –

    List of HTML-formatted units of text if show==False, otherwise None.
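A minimal usage sketch (import path assumed from the module layout above):

    import numpy as np
    from icx360.utils.coloring_utils import color_units

    units = np.array(["The sky", "is", "blue."])
    scores = np.array([0.9, -0.1, 0.4])
    html_units = color_units(units, scores, show=False)  # list of HTML strings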

highlight_text
highlight_text(unit, color)

general_utils

File containing general utility functions

Functions:

  • fix_seed –

Fix a random seed for all random number generators (random, numpy, torch)

  • select_device –

    Select device on which to perform all operations.

fix_seed
fix_seed(seed=12345)

Fix a random seed for all random number generators (random, numpy, torch)

Parameters:

  • seed –

    seed to set for all randomizations

select_device
select_device()

Select device on which to perform all operations.

Returns:

  • device ( str ) –

    device on which to perform all operations according to user system
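For example (import path assumed from the module layout above):

    from icx360.utils.general_utils import fix_seed, select_device

    fix_seed(12345)           # seeds random, numpy, and torch
    device = select_device()  # e.g. "cuda", "mps", or "cpu", depending on the system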

infillers

Modules:

BART_infiller

File containing class BART_infiller

BART_infiller is used to perform infilling using a BART LLM.

Classes:

BART_infiller
BART_infiller(model_path='facebook/bart-large', device='cuda')

BART_infiller object.

Instances can be used to encode, decode, and generate text to infill masks in text.

Attributes:

  • _model –

    BART model used for infilling

  • _tokenizer –

    BART tokenizer

  • mask_string –

    text that represents a mask for BART

  • mask_string_encoded –

    encoded version of mask for BART

  • mask_filled_error –

    text representing that an infilling error occurred

Initialize BART infilling object.

Parameters:

  • model_path (str, default: 'facebook/bart-large' ) –

    name of BART model to be used for infilling

Methods:

  • decode –

    Function to decode text via BART tokenizer

  • encode –

    Function to encode text via BART tokenizer

  • generate –

Generate text to infill mask tokens. Assumes one of the tokens is <mask>.

  • get_infilled_mask –

Retrieve the text that replaced the mask when infilling, from the generation output

  • similar –

    Determine if word is similar to fill_in

Attributes:

mask_filled_error instance-attribute
mask_filled_error = '!!abcxyz!!'
mask_string instance-attribute
mask_string = '<mask>'
mask_string_encoded instance-attribute
mask_string_encoded = encode(mask_string, add_special_tokens=False)[0]
decode
decode(tokens, skip_special_tokens=True)

Function to decode text via BART tokenizer

Parameters:

  • tokens (int list) –

    token indices

  • skip_special_tokens (bool, default: True ) –

True to skip special tokens in decoding

Returns:

  • ret ( str ) –

string from decoding all input tokens

encode
encode(text, add_special_tokens=False)

Function to encode text via BART tokenizer

Parameters:

  • text (str) –

    string to encode

  • add_special_tokens (bool, default: False ) –

True to use special tokens in encoding

Returns:

  • ret ( int list ) –

token indices, whose number is based on the input text

generate
generate(tokens, num_return_sequences=1, masked_word='', return_mask_filled=False)

Generate text to infill mask tokens. Assumes one of tokens is <mask>, which is token id self.mask_string_encoded.

Parameters:

  • tokens (int list) –

    token indices

  • num_return_sequences (int, default: 1 ) –

    number of generations to return

  • masked_word (str, default: '' ) –

    word that is masked in tokens

  • return_mask_filled (bool, default: False ) –

    if true, return (ret, mask_filled), else return only ret

Returns:

  • ret ( int list ) –

    list of token indices after calling model.generate on input tokens

  • mask_filled ( str ) –

    decoded version of infilled texts

get_infilled_mask
get_infilled_mask(x_enc, y_enc, return_tokens=False)

Retrieve the text that replaced the mask when infilling, from the generation output

Parameters:

  • x_enc (int list) –

token indices where one token is <mask>, i.e., input to the generation function

  • y_enc (int list) –

token indices representing same as x_enc with several tokens replacing <mask>, i.e., output of the generation function

  • return_tokens (bool, default: False ) –

    if true, return (mask_filled, inds_infill), else return only mask_filled

Returns:

  • mask_filled ( str ) –

decoded tokens that replace <mask> in y_enc relative to x_enc

  • inds_infill ( int list ) –

    token indices representing encoded version of infilled text

similar
similar(word, fill_in)

Determine if word is similar to fill_in

Parameters:

  • word (str) –

word to search for

  • fill_in (str) –

    filled in text to search for word in

Returns:

  • ret ( bool ) –

    True if word is similar to fill_in, False otherwise
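A round-trip sketch tying these methods together (import path assumed from the module layout above; the exact infill depends on the BART model):

    from icx360.utils.infillers.BART_infiller import BART_infiller

    infiller = BART_infiller(device="cpu")
    tokens = infiller.encode("The <mask> sat on the mat.")
    out_tokens, mask_filled = infiller.generate(tokens, masked_word="cat",
                                                return_mask_filled=True)
    print(infiller.decode(out_tokens))  # sentence with the mask infilled
    print(mask_filled)                  # just the text that filled the mask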

T5_infiller

File containing class T5_infiller

T5_infiller is used to perform infilling using a T5 LLM.

Classes:

T5_infiller
T5_infiller(model_path='t5-large', device='cuda')

T5_infiller object.

Instances can be used to encode, decode, and generate text to infill masks in text.

Attributes:

  • _model –

    T5 model used for infilling

  • _tokenizer –

    T5 tokenizer

  • mask_string –

    text that represents the beginning of a mask for T5

  • mask_string_end –

    text that represents the end of a mask for T5

  • mask_string_encoded –

    encoded version of mask_string for T5

  • mask_string_end_encoded –

    encoded version of mask_string_end for T5

  • mask_filled_error –

    text representing that an infilling error occurred

Initialize T5 infilling object.

Parameters:

  • model_path (str, default: 't5-large' ) –

    name of T5 model to be used for infilling

Methods:

  • decode –

    Function to decode text via T5 tokenizer

  • encode –

    Function to encode text via T5 tokenizer

  • generate –

Generate text to infill mask tokens. Assumes one of the tokens is <extra_id_0>.

  • get_infilled_mask –

Retrieve the text that replaced the mask when infilling, from the generation output

  • similar –

    Determine if word is similar to fill_in

Attributes:

mask_filled_error instance-attribute
mask_filled_error = '!!abcxyz!!'
mask_string instance-attribute
mask_string = '<extra_id_0>'
mask_string_encoded instance-attribute
mask_string_encoded = encode(mask_string, add_special_tokens=False)[0]
mask_string_end instance-attribute
mask_string_end = '<extra_id_1>'
mask_string_end_encoded instance-attribute
mask_string_end_encoded = encode(mask_string_end, add_special_tokens=False)[0]
decode
decode(tokens, skip_special_tokens=True)

Function to decode text via T5 tokenizer

Parameters:

  • tokens (int list) –

    token indices

  • skip_special_tokens (bool, default: True ) –

True to skip special tokens in decoding

Returns:

  • ret ( str ) –

string from decoding all input tokens

encode
encode(text, add_special_tokens=False)

Function to encode text via T5 tokenizer

Parameters:

  • text (str) –

    string to encode

  • add_special_tokens (bool, default: False ) –

True to use special tokens in encoding

Returns:

  • ret ( int list ) –

token indices, whose number is based on the input text

generate
generate(tokens, num_return_sequences=1, masked_word='', return_mask_filled=False)

Generate text to infill mask tokens. Assumes one of tokens is <extra_id_0>, which is token id self.mask_string_encoded.

Parameters:

  • tokens (int list) –

    token indices

  • num_return_sequences (int, default: 1 ) –

    number of generations to return

  • masked_word (str, default: '' ) –

    word that is masked in tokens

  • return_mask_filled (bool, default: False ) –

    if true, return (ret, mask_filled), else return only ret

Returns:

  • ret ( int list ) –

    list of token indices after calling model.generate on input tokens

  • mask_filled ( str ) –

    decoded version of infilled texts

get_infilled_mask
get_infilled_mask(x_enc, y_enc, return_tokens=False)

Retrieve the text that replaced the mask when infilling, from the generation output

Parameters:

  • x_enc (int list) –

token indices where one token is <extra_id_0>, i.e., input to the generation function

  • y_enc (int list) –

token indices representing same as x_enc with several tokens replacing <extra_id_0>, i.e., output of the generation function

  • return_tokens (bool, default: False ) –

    if true, return (mask_filled, inds_infill), else return only mask_filled

Returns:

  • mask_filled ( str ) –

decoded tokens that replace <extra_id_0> in x_enc (the tokens between <extra_id_0> and <extra_id_1> in y_enc)

  • inds_infill ( int list ) –

    tokens that represent the replacement for the mask (only returned if return_tokens==True)

similar
similar(word, fill_in)

Determine if word is similar to fill_in

Parameters:

  • word (str) –

word to search for

  • fill_in (str) –

    filled in text to search for word in

Returns:

  • ret ( bool ) –

    True if word is similar to fill_in, False otherwise

model_wrappers

Module containing wrappers for different types of models (used by MExGen and CELL).

Modules:

  • base_model_wrapper –

    Base class for model wrappers and class for model-generated outputs.

  • huggingface –

    Wrapper for HuggingFace models.

  • vllm –

    Wrapper for VLLM models.

base_model_wrapper

Base class for model wrappers and class for model-generated outputs.

Classes:

  • GeneratedOutput –

    Holds outputs of generate() method.

  • Model –

    Base class for wrappers of different types of models.

GeneratedOutput
GeneratedOutput(output_ids=None, output_text=None, output_token_count=None, logits=None)

Holds outputs of generate() method.

Attributes:

  • output_ids (Tensor or None) –

    Generated token IDs for each input.

  • output_text (List[str] or None) –

    Generated text for each input.

  • output_token_count (int or None) –

    Maximum number of generated tokens.

  • logits (Tensor or None) –

    Output logits for each input.

Initialize GeneratedOutput.

Parameters:

  • output_ids (Tensor or None, default: None ) –

    Generated token IDs for each input.

  • output_text (List[str] or None, default: None ) –

    Generated text for each input.

  • output_token_count (int or None, default: None ) –

    Maximum number of generated tokens.

  • logits (Tensor or None, default: None ) –

    Output logits for each input.

logits instance-attribute
logits = logits
output_ids instance-attribute
output_ids = output_ids
output_text instance-attribute
output_text = output_text
output_token_count instance-attribute
output_token_count = output_token_count
Model
Model(model)

Bases: ABC

Base class for wrappers of different types of models.

Attributes:

  • _model –

    Underlying model object.

Initialize Model wrapper.

Parameters:

  • model –

    Underlying model object.

Methods:

  • convert_input –

    Convert input(s) as needed for the model type.

  • generate –

    Generate response from model.

convert_input
convert_input(inputs)

Convert input(s) as needed for the model type.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

Returns:

  • inputs ( type required by model ) –

    Converted inputs.

generate abstractmethod
generate(inputs, text_only=True, **kwargs)

Generate response from model.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • text_only (bool, default: True ) –

    Return only generated text (default) or an object containing additional outputs.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for model.

Returns:

  • output_obj ( List[str] or GeneratedOutput ) –

    If text_only == True, a list of generated texts corresponding to inputs. If text_only == False, a GeneratedOutput object to hold outputs.

huggingface

Wrapper for HuggingFace models.

Classes:

  • HFModel –

    Wrapper for HuggingFace models.

HFModel
HFModel(model, tokenizer)

Bases: Model

Wrapper for HuggingFace models.

Attributes:

  • _model (transformers model object) –

    Underlying model object.

  • _tokenizer (transformers tokenizer) –

    Tokenizer corresponding to model.

  • _device (str) –

    Device on which the model resides.

Initialize HFModel wrapper.

Parameters:

  • model (transformers model object) –

    Underlying model object.

  • tokenizer (transformers tokenizer) –

    Tokenizer corresponding to model.

Methods:

  • convert_input –

    Encode input text as token IDs for HuggingFace model.

  • generate –

    Generate response from model.

convert_input
convert_input(inputs, chat_template=False, system_prompt=None, **kwargs)

Encode input text as token IDs for HuggingFace model.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for tokenizer.

Returns:

  • input_encoding ( BatchEncoding ) –

    Object produced by tokenizer.

generate
generate(inputs, chat_template=False, system_prompt=None, tokenizer_kwargs={}, text_only=True, **kwargs)

Generate response from model.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

  • tokenizer_kwargs (dict, default: {} ) –

    Additional keyword arguments for tokenizer.

  • text_only (bool, default: True ) –

    Return only generated text (default) or an object containing additional outputs.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for HuggingFace model.

Returns:

  • output_obj ( List[str] or GeneratedOutput ) –

    If text_only == True, a list of generated texts corresponding to inputs. If text_only == False, a GeneratedOutput object containing the following: output_ids: (num_inputs, output_token_count) torch.Tensor of generated token IDs. output_text: List of generated texts. output_token_count: Maximum number of generated tokens.
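A minimal wrapping sketch (the HFModel import path is assumed from the module layout above):

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from icx360.utils.model_wrappers.huggingface import HFModel

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    wrapped_model = HFModel(model, tokenizer)
    texts = wrapped_model.generate("Hello, world", max_new_tokens=20)  # List[str]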

vllm

Wrapper for VLLM models.

Classes:

  • VLLMModel –

    Wrapper for VLLM models.

VLLMModel
VLLMModel(model, model_name, tokenizer=None)

Bases: Model

Wrapper for VLLM models.

Attributes:

  • _model (OpenAI model object) –

    Underlying model object.

  • _model_name (str) –

    Name of the model.

  • _tokenizer (transformers tokenizer or None) –

    HuggingFace tokenizer corresponding to the model (for applying chat template).

Initialize VLLMModel wrapper.

Parameters:

  • model (OpenAI model object) –

    Underlying model object.

  • model_name (str) –

    Name of the model.

  • tokenizer (transformers tokenizer or None, default: None ) –

    HuggingFace tokenizer corresponding to the model (for applying chat template).

Methods:

  • convert_input –

    Convert input(s) into a list of strings.

  • generate –

    Generate response from model.

convert_input
convert_input(inputs, chat_template=False, system_prompt=None, **kwargs)

Convert input(s) into a list of strings.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

Returns:

  • inputs ( List[str] ) –

    Converted input(s) as a list of strings.

generate
generate(inputs, chat_template=False, system_prompt=None, text_only=True, **kwargs)

Generate response from model.

Parameters:

  • inputs (str or List[str] or List[List[str]]) –

    A single input text, a list of input texts, or a list of segmented texts.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

  • text_only (bool, default: True ) –

    Return only generated text (default) or an object containing additional outputs.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for VLLM model.

Returns:

  • output_obj ( List[str] or GeneratedOutput ) –

    If text_only == True, a list of generated texts corresponding to inputs. If text_only == False, a GeneratedOutput object containing the following: output_text: List of generated texts.

scalarizers

Module containing scalarizers, which compute scalar output values based on the outputs or inputs of an LLM.

Modules:

  • bart_score –

    BARTScorer class used by icx360.utils.scalarizers.TextScalarizedModel.

  • base_scalarizer –

    Base class for scalarizers.

  • bleu_scalarizer –

    File containing class BleuScalarizer

  • contradiction_scalarizer –

    File containing class ContradictionScalarizer

  • nli_scalarizer –

    File containing class NLIScalarizer

  • preference_scalarizer –

    File containing class PreferenceScalarizer

  • prob –

    Scalarized model that computes the log probability of generating a reference output conditioned on inputs.

  • text_only –

    Scalarized model that computes similarity scores between generated texts and a reference output text.

bart_score

BARTScorer class used by icx360.utils.scalarizers.TextScalarizedModel.

This file (excluding this docstring) is an exact copy of the core source file from the BARTScore authors: https://github.com/neulab/BARTScore/blob/main/bart_score.py. It is licensed under the Apache License Version 2.0.

For more information, please refer to the BARTScore paper: BARTScore: Evaluating Generated Text as Text Generation. Weizhe Yuan, Graham Neubig, and Pengfei Liu. Advances in Neural Information Processing Systems (NeurIPS) 2021.

Classes:

BARTScorer
BARTScorer(device='cuda:0', max_length=1024, checkpoint='facebook/bart-large-cnn')

Methods:

Attributes:

device instance-attribute
device = device
loss_fct instance-attribute
loss_fct = NLLLoss(reduction='none', ignore_index=pad_token_id)
lsm instance-attribute
lsm = LogSoftmax(dim=1)
max_length instance-attribute
max_length = max_length
model instance-attribute
model = from_pretrained(checkpoint)
tokenizer instance-attribute
tokenizer = from_pretrained(checkpoint)
load
load(path=None)

Load model from paraphrase finetuning

multi_ref_score
multi_ref_score(srcs, tgts: List[List[str]], agg='mean', batch_size=4)
score
score(srcs, tgts, batch_size=4)

Score a batch of examples

test
test(batch_size=3)

Test

base_scalarizer

Base class for scalarizers.

Scalarizers compute real-valued scalar outputs for text inputs or outputs of LLMs, for example by comparing the inputs to a reference input or the corresponding outputs to a reference output.

Classes:

Scalarizer
Scalarizer(model=None)

Bases: ABC

Base class for scalarizers.

Attributes:

  • model (Model or None) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object (optional, default None).

Initialize Scalarizer.

Parameters:

  • model (Model or None, default: None ) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object (optional, default None).

Methods:

model instance-attribute
model = model
scalarize_output abstractmethod
scalarize_output(inputs=None, outputs=None, ref_input=None, ref_output=None, **kwargs)

Compute scalar outputs.

Parameters:

  • inputs (str or List[str] or List[List[str]] or None, default: None ) –

    Inputs to compute scalar outputs for: A single input text, a list of input texts, or a list of segmented texts.

  • outputs (str or List[str] or None, default: None ) –

    Outputs to scalarize (corresponding to inputs).

  • ref_input (str or None, default: None ) –

    Reference input used to scalarize.

  • ref_output (str or GeneratedOutput or None, default: None ) –

    Reference output (text or GeneratedOutput object) used to scalarize.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments.

Returns:

  • scalar_outputs ( (num_inputs,) torch.Tensor ) –

    Scalar output for each input.
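
As an illustration of this interface, a toy subclass of Scalarizer (the import path is an assumption; a real scalarizer would score outputs with a model rather than by length):

```python
import torch

from icx360.utils.scalarizers.base_scalarizer import Scalarizer  # assumed path


class LengthRatioScalarizer(Scalarizer):
    """Toy scalarizer: length of each output relative to the reference output."""

    def scalarize_output(self, inputs=None, outputs=None, ref_input=None,
                         ref_output=None, **kwargs):
        # One scalar per output, returned as a (num_inputs,) tensor.
        ref_len = max(len(ref_output), 1)
        return torch.tensor([len(out) / ref_len for out in outputs])


scores = LengthRatioScalarizer().scalarize_output(
    outputs=["short reply", "a considerably longer reply"],
    ref_output="a reference reply",
)
```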

bleu_scalarizer

File containing class BleuScalarizer

This class is used to scalarize text using the BLEU metric

Classes:

BleuScalarizer
BleuScalarizer(model_path='', device='cuda', experiment_id='id')

Bases: Scalarizer

BleuScalarizer object.

Instances of BleuScalarizer can call scalarize_output to produce a scalarized version of input text according to BLEU score.

Attributes:

  • _bleu –

    model for computing BLEU score

  • _device –

    device on which to perform computations

Initialize bleu scalarizer object.

Parameters:

  • model_path (str, default: '' ) –

    placeholder; deprecated and not used here.

  • device (str, default: 'cuda' ) –

    device on which to perform computations

  • experiment_id (str, default: 'id' ) –

    unique identifier allowing scores to be computed in parallel without conflicts

Methods:

  • scalarize_output –

    Convert text input and outputs to numerical score.

Attributes:

model instance-attribute
model = model
scalarize_output
scalarize_output(inputs, outputs, ref_input='', ref_output='', input_label=0, info=False)

Convert text input and outputs to numerical score

Use BLEU score to scalarize: compute BLEU(outputs, ref_output) and BLEU(inputs, ref_input) and return a linear combination of the two BLEU scores

Parameters:

  • inputs (str) –

    input prompt

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    contrastive prompt

  • ref_output (str, default: '' ) –

    response to contrastive prompt

  • input_label (int, default: 0 ) –

    placeholder. not used here.

  • info (bool, default: False ) –

    placeholder. not used here.

Returns:

  • score ( float ) –

    scalarized output

  • label_contrast ( int ) –

    placeholder. not used here.
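
A minimal usage sketch, assuming the import path below; scalarize_output returns a (score, label_contrast) pair, with label_contrast a placeholder:

```python
from icx360.utils.scalarizers.bleu_scalarizer import BleuScalarizer  # assumed path

scalarizer = BleuScalarizer(device="cpu", experiment_id="demo")

# Linear combination of BLEU(outputs, ref_output) and BLEU(inputs, ref_input).
score, _ = scalarizer.scalarize_output(
    inputs="What is the capital of France?",
    outputs="The capital of France is Paris.",
    ref_input="What is the capital of Spain?",
    ref_output="The capital of Spain is Madrid.",
)
```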

contradiction_scalarizer

File containing class ContradictionScalarizer

This class is used to scalarize text using a Contradiction metric via Natural Language Inference (NLI)

Classes:

ContradictionScalarizer
ContradictionScalarizer(model_path='cross-encoder/nli-deberta-v3-base', device='cuda')

Bases: Scalarizer

ContradictionScalarizer object.

Instances of ContradictionScalarizer can call scalarize_output to produce a scalarized version of input text according to Contradiction score.

Attributes:

  • _model –

    NLI model for computing contradiction score

  • _tokenizer –

    tokenizer of NLI model

  • _device –

    device on which to perform computations

Initialize contradiction scalarizer object.

Parameters:

  • model_path (str, default: 'cross-encoder/nli-deberta-v3-base' ) –

    NLI model for computing contradiction score

  • device (str, default: 'cuda' ) –

    device on which to perform computations

Methods:

  • predict_contradiction –

    Convert text input and outputs to 0/1 classification.

  • scalarize_output –

    Convert text input and outputs to numerical score.

Attributes:

model instance-attribute
model = model
predict_contradiction
predict_contradiction(inputs, outputs, ref_input='', ref_output='')

Convert text input and outputs to 0/1 classification

Use NLI contradiction score to scalarize: compute whether ref_output contradicts outputs, normalized by the contradiction score of outputs with itself.

Parameters:

  • inputs (str) –

    placeholder. not used here.

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    placeholder. not used here.

  • ref_output (str, default: '' ) –

    response to contrastive prompt

Returns:

  • ret ( int ) –

    if contradiction found return 1, else return 0.

scalarize_output
scalarize_output(inputs, outputs, ref_input='', ref_output='', input_label=0, info=False)

Convert text input and outputs to numerical score

Use NLI contradiction score to scalarize: compute whether ref_output contradicts outputs, normalized by the contradiction score of outputs with itself.

Parameters:

  • inputs (str) –

    placeholder. not used here.

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    placeholder. not used here.

  • ref_output (str, default: '' ) –

    response to contrastive prompt

  • input_label (int, default: 0 ) –

    placeholder. not used here.

  • info (bool, default: False ) –

    print extra information if True

Returns:

  • score ( float ) –

    scalarized output

  • label_contrast ( int ) –

    placeholder. not used here.
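
A minimal sketch of the 0/1 classification path, assuming the import path below:

```python
from icx360.utils.scalarizers.contradiction_scalarizer import ContradictionScalarizer  # assumed path

scalarizer = ContradictionScalarizer(device="cpu")

# Returns 1 if ref_output contradicts outputs (after self-normalization), else 0.
flag = scalarizer.predict_contradiction(
    inputs="",  # placeholder, not used
    outputs="The meeting is on Monday.",
    ref_output="The meeting has been cancelled.",
)
```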

nli_scalarizer

File containing class NLIScalarizer

This class is used to scalarize text using a Natural Language Inference (NLI) score to measure the change in scores

Classes:

NLIScalarizer
NLIScalarizer(model_path='cross-encoder/nli-deberta-v3-base', device='cuda')

Bases: Scalarizer

NLIScalarizer object.

Instances of NLIScalarizer can call scalarize_output to produce a scalarized version of input text according to the change in NLI score.

Attributes:

  • _model –

    NLI model for computing contradiction score

  • _tokenizer –

    tokenizer of NLI model

  • _device –

    device on which to perform computations

Initialize nli scalarizer object.

Parameters:

  • model_path (str, default: 'cross-encoder/nli-deberta-v3-base' ) –

    NLI model for computing NLI score

  • device (str, default: 'cuda' ) –

    device on which to perform computations

Methods:

  • scalarize_output –

    Convert text input and outputs to numerical score.

Attributes:

model instance-attribute
model = model
scalarize_output
scalarize_output(inputs, outputs, ref_input='', ref_output='', input_label=0, info=False)

Convert text input and outputs to numerical score

Use NLI score to scalarize: compute the score of the predicted class of NLI(inputs, outputs), then compute the change in that class's score under NLI(inputs, ref_output)

Parameters:

  • inputs (str) –

    input prompt

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    placeholder. not used here.

  • ref_output (str, default: '' ) –

    response to contrastive prompt

  • input_label (int, default: 0 ) –

    placeholder. not used here.

  • info (bool, default: False ) –

    print extra information if True

Returns:

  • score ( float ) –

    scalarized output

  • label_contrast ( int ) –

    placeholder. not used here.
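
A minimal sketch, assuming the import path below; the score reflects how much the predicted NLI class score changes when outputs is swapped for ref_output:

```python
from icx360.utils.scalarizers.nli_scalarizer import NLIScalarizer  # assumed path

scalarizer = NLIScalarizer(device="cpu")

score, _ = scalarizer.scalarize_output(
    inputs="Is the store open today?",
    outputs="Yes, it is open all day.",
    ref_output="No, it is closed today.",
)
```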

preference_scalarizer

File containing class PreferenceScalarizer

This class is used to scalarize text using a preference model to measure the change in preference for a contrastive response to the original prompt

Classes:

PreferenceScalarizer
PreferenceScalarizer(model_path='stanfordnlp/SteamSHP-flan-t5-large', device='cuda')

Bases: Scalarizer

PreferenceScalarizer object.

Instances of PreferenceScalarizer can call scalarize_output to produce a scalarized version of input text according to the change in preference for a contrastive response relative to the initial response.

Attributes:

  • _model –

    model for computing preference score

  • _tokenizer –

    tokenizer of preference model

  • _device –

    device on which to perform computations

Initialize preference scalarizer object.

Parameters:

  • model_path (str, default: 'stanfordnlp/SteamSHP-flan-t5-large' ) –

    preference model

  • device (str, default: 'cuda' ) –

    device on which to perform computations

Methods:

  • scalarize_output –

    Convert text input and outputs to numerical score.

Attributes:

model instance-attribute
model = model
scalarize_output
scalarize_output(inputs, outputs, ref_input='', ref_output='', input_label=0, info=False)

Convert text input and outputs to numerical score

Use preference score to scalarize: compute the preference between two different responses, outputs and ref_output, to the prompt inputs.

Parameters:

  • inputs (str) –

    input prompt

  • outputs (str) –

    response to input prompt

  • ref_input (str, default: '' ) –

    placeholder. not used here.

  • ref_output (str, default: '' ) –

    response to contrastive prompt

  • input_label (int, default: 0 ) –

    placeholder. not used here.

  • info (bool, default: False ) –

    placeholder. not used here.

Returns:

  • score ( float ) –

    scalarized output

  • label_contrast ( int ) –

    placeholder. not used here.
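
A minimal sketch, assuming the import path below; the score reflects the preference model's relative preference between the two responses to inputs:

```python
from icx360.utils.scalarizers.preference_scalarizer import PreferenceScalarizer  # assumed path

scalarizer = PreferenceScalarizer(device="cpu")

score, _ = scalarizer.scalarize_output(
    inputs="Explain photosynthesis briefly.",
    outputs="Photosynthesis converts sunlight into chemical energy in plants.",
    ref_output="I have no idea.",
)
```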

prob

Scalarized model that computes the log probability of generating a reference output conditioned on inputs.

This "scalarized model" is a generative model that can also compute the log probability (or a transformation thereof) of generating a given reference output conditioned on inputs.

Classes:

  • ProbScalarizedModel –

    Generative model that also computes the probability of a given reference output conditioned on inputs.

ProbScalarizedModel
ProbScalarizedModel(model)

Bases: Scalarizer

Generative model that also computes the probability of a given reference output conditioned on inputs.

Attributes:

  • model (Model) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object.

Initialize ProbScalarizedModel.

Parameters:

  • model (Model) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object.

Raises:

  • TypeError –

    If the model is not an icx360.utils.model_wrappers.HFModel or an icx360.utils.model_wrappers.VLLMModel.

Methods:

  • scalarize_output –

    Compute probability of reference output conditioned on inputs.

model instance-attribute
model = model
scalarize_output
scalarize_output(inputs=None, outputs=None, ref_input=None, ref_output=None, chat_template=False, system_prompt=None, tokenizer_kwargs={}, transformation='log_prob_mean', **kwargs)

Compute probability of reference output conditioned on inputs.

Parameters:

  • inputs (str or List[str] or List[List[str]], default: None ) –

    Inputs to compute probabilities for: A single input text, a list of input texts, or a list of segmented texts.

  • outputs (str or List[str] or None, default: None ) –

    Outputs to scalarize (corresponding to inputs) - not used.

  • ref_input (str or None, default: None ) –

    Reference input used to scalarize - not used.

  • ref_output (GeneratedOutput, default: None ) –

    Reference output object.

  • chat_template (bool, default: False ) –

    Whether to apply chat template.

  • system_prompt (str or None, default: None ) –

    System prompt to include in chat template.

  • tokenizer_kwargs (dict, default: {} ) –

    Additional keyword arguments for tokenizer.

  • transformation (str, default: 'log_prob_mean' ) –

    Transformation to apply to token probabilities. "log_prob_mean": arithmetic mean of log probabilities (default). "log_prob_sum": sum of log probabilities. "prob_geo_mean": geometric mean of probabilities. "prob_prod": product of probabilities.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for model.

Returns:

  • probs_transformed ( (num_inputs,) torch.Tensor ) –

    Transformed probability of generating the reference output conditioned on each input.
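
A minimal sketch, assuming the import paths and the HFModel constructor signature below:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from icx360.utils.model_wrappers import HFModel  # assumed import path
from icx360.utils.scalarizers.prob import ProbScalarizedModel  # assumed path

tokenizer = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
model = HFModel(lm, tokenizer)  # assumed constructor signature

scalarizer = ProbScalarizedModel(model)

# Reference output from a previous generate() call with text_only=False.
ref = model.generate(["Name a primary color."], text_only=False)

# Mean log-probability of regenerating the reference under each input.
probs = scalarizer.scalarize_output(
    inputs=["Name a primary color.", "Name any color."],
    ref_output=ref,
    transformation="log_prob_mean",
)
```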

text_only

Scalarized model that computes similarity scores between generated texts and a reference output text.

This "scalarized model" is a generative model that can also compute similarity scores between the texts it generates and a reference output text.

Classes:

  • TextScalarizedModel –

    Generative model that also computes similarity scores between its generated texts and a reference text.

TextScalarizedModel
TextScalarizedModel(model=None, sim_scores=['nli_logit', 'bert', 'st', 'summ', 'bart'], model_nli=None, model_bert=None, model_st='all-MiniLM-L6-v2', model_summ=None, model_bart='facebook/bart-large-cnn', device=None)

Bases: Scalarizer

Generative model that also computes similarity scores between its generated texts and a reference text.

Attributes:

  • model (Model) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object.

  • sim_scores (List[str]) –

    List of similarity scores to compute. "nli_logit"/"nli": Logit/probability of entailment label from natural language inference model. "bert": BERTScore. "st": Cosine similarity between SentenceTransformer embeddings. "summ": Generation probability of a summarization model (similar to BARTScore). "bart": BARTScore.

  • model_nli (AutoModelForSequenceClassification) –

    Natural language inference model.

  • tokenizer_nli (AutoTokenizer) –

    Tokenizer for natural language inference model.

  • idx_entail (int) –

    Index corresponding to entailment label.

  • bertscore (EvaluationModule) –

    BERTScore evaluation module.

  • model_bert (str) –

    Name of BERT-like model for computing BERTScore.

  • model_st (SentenceTransformer model) –

    SentenceTransformer embedding model.

  • model_summ (AutoModelForSeq2SeqLM) –

    Summarization model.

  • tokenizer_summ (AutoTokenizer) –

    Tokenizer for summarization model.

  • bart_scorer (BARTScorer) –

    Object for computing BARTScore.

  • device (device or str or None) –

    Device for the above models.

Initialize TextScalarizedModel.

Parameters:

  • model (Model, default: None ) –

    Generative model, wrapped in an icx360.utils.model_wrappers.Model object.

  • sim_scores (List[str], default: ['nli_logit', 'bert', 'st', 'summ', 'bart'] ) –

    List of similarity scores to compute. "nli_logit"/"nli": Logit/probability of entailment label from natural language inference model. "bert": BERTScore. "st": Cosine similarity between SentenceTransformer embeddings. "summ": Generation probability of a summarization model (similar to BARTScore). "bart": BARTScore.

  • model_nli (str, default: None ) –

    Name of natural language inference model.

  • model_bert (str, default: None ) –

    Name of BERT-like model for computing BERTScore.

  • model_st (str, default: 'all-MiniLM-L6-v2' ) –

    Name of SentenceTransformer embedding model.

  • model_summ (str, default: None ) –

    Name of summarization model.

  • model_bart (str, default: 'facebook/bart-large-cnn' ) –

    Name of BART-like model for computing BARTScore.

  • device (device or str or None, default: None ) –

    Device for the above models.

Methods:

  • scalarize_output –

    Compute similarity scores between generated texts and reference text.

bart_scorer instance-attribute
bart_scorer = BARTScorer(device=device, checkpoint=model_bart)
bertscore instance-attribute
bertscore = load('bertscore')
device instance-attribute
device = select_device() if device is None else device
idx_entail instance-attribute
idx_entail = label2id[key]
model instance-attribute
model = model
model_bert instance-attribute
model_bert = model_bert
model_nli instance-attribute
model_nli = to(device)
model_st instance-attribute
model_st = SentenceTransformer(model_st, device=device)
model_summ instance-attribute
model_summ = to(device)
sim_scores instance-attribute
sim_scores = sim_scores
tokenizer_nli instance-attribute
tokenizer_nli = from_pretrained(model_nli)
tokenizer_summ instance-attribute
tokenizer_summ = from_pretrained(model_summ)
scalarize_output
scalarize_output(inputs=None, outputs=None, ref_input=None, ref_output=None, max_new_tokens_factor=1.5, symmetric=True, idf=False, transformation='log_prob_mean', **kwargs)

Compute similarity scores between generated texts and reference text.

Parameters:

  • inputs (str or List[str] or List[List[str]] or None, default: None ) –

    Inputs to compute similarity scores for: A single input text, a list of input texts, or a list of segmented texts.

  • outputs (List[str] or None, default: None ) –

    Generated texts to compute similarity scores for. If None, texts will be generated by calling self.model.generate().

  • ref_input (str or None, default: None ) –

    Reference input used to scalarize - not used.

  • ref_output (GeneratedOutput, default: None ) –

    Reference output object containing reference text (ref_output.output_text).

  • max_new_tokens_factor (float, default: 1.5 ) –

    Multiplicative factor for setting max_new_tokens for generation.

  • symmetric (bool, default: True ) –

    Make NLI entailment score symmetric (geometric mean of reference -> generated and generated -> reference).

  • idf (bool, default: False ) –

    Use idf weighting for BERTScore.

  • transformation (str, default: 'log_prob_mean' ) –

    Transformation to apply to output token probabilities of summarization model. "log_prob_mean": arithmetic mean of log probabilities (default). "log_prob_sum": sum of log probabilities. "prob_geo_mean": geometric mean of probabilities. "prob_prod": product of probabilities.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for model.

Returns:

  • scores ( dict of (num_inputs,) torch.Tensor ) –

    For each label in self.sim_scores, a Tensor of corresponding similarity scores between generated texts and the reference text.
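
A minimal sketch continuing from the ProbScalarizedModel example above (model is an icx360 model wrapper and ref a GeneratedOutput; the import path is assumed):

```python
from icx360.utils.scalarizers.text_only import TextScalarizedModel  # assumed path

scalarizer = TextScalarizedModel(model=model, sim_scores=["bert", "st"])

scores = scalarizer.scalarize_output(
    inputs=["Name a primary color.", "Name any color."],
    ref_output=ref,
)
for name, values in scores.items():
    print(name, values)  # one (num_inputs,) tensor per similarity score
```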

segmenters

Module containing utilities for segmenting input text into units.

Modules:

  • spacy –

    Class and functions for segmenting input text into units using a spaCy model.

  • utils –

    Other utilities for segmenting input text into units.

spacy

Class and functions for segmenting input text into units using a spaCy model.

SpaCySegmenter is the main class. The remaining functions implement an algorithm for segmentation into phrases.

Classes:

  • SpaCySegmenter –

    Class for segmenting input text into units using a spaCy model.

Functions:

  • append_or_segment_children –

    Append syntactic children of a node as phrases or further segment them.

  • append_or_segment_span –

    Append span to list of phrases or further segment span.

  • is_not_punct_space –

    Check whether each token of a span is not punctuation and not a space.

  • merge_nbor_of_singleton_phrase –

    Decide whether to merge neighbor of singleton (single-token) phrase.

  • merge_noun_chunk_phrases –

    Merge phrases that constitute a noun chunk.

  • merge_phrase_spans –

    Merge phrases within specified spans of phrases.

  • merge_singleton_phrases –

    Merge single-token phrases with their neighbors.

  • segment_into_phrases –

    Segment sentence (or span within sentence) into phrases.

  • sort_phrases –

    Sort phrases by their starting token index.

SpaCySegmenter
SpaCySegmenter(spacy_model)

Class for segmenting input text into units using a spaCy model.

Attributes:

  • model (Language) –

    spaCy model.

Initialize SpaCySegmenter object.

Parameters:

  • spacy_model (str) –

    Name of spaCy model.

Methods:

  • segment_units –

    (Further) Segment input text into units.

model instance-attribute
model = load(spacy_model)
segment_units
segment_units(input_text, ind_segment=True, unit_types='s', sent_idxs=None, segment_type='w', max_phrase_length=10)

(Further) Segment input text into units.

Parameters:

  • input_text (str or list[str]) –

    Input text as a single unit (if str) or existing sequence of units (list[str]).

  • ind_segment (bool or list[bool], default: True ) –

    Whether to segment entire input text or each existing unit. If bool, applies to all units. If list[bool], applies to each unit individually.

  • unit_types (str or list[str], default: 's' ) –

    Types of units in input_text: "p" for paragraph, "s" for sentence, "w" for word, "n" for not to be perturbed or segmented (fixed). If str, applies to all units in input_text, otherwise unit-specific.

  • sent_idxs (list[int] or None, default: None ) –

    Index of sentence (or larger unit) that contains each existing unit.

  • segment_type (str, default: 'w' ) –

    Type of units to segment into: "s" for sentences, "w" for words, "ph" for phrases.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • units ( list[str] ) –

    Resulting sequence of units.

  • unit_types ( list[str] ) –

    Types of units.

  • sent_idxs_new ( list[int] ) –

    Index of sentence (or larger unit) that contains each unit.
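
A minimal sketch, assuming the import path below and that the spaCy pipeline en_core_web_sm is installed:

```python
from icx360.utils.segmenters.spacy import SpaCySegmenter  # assumed path

segmenter = SpaCySegmenter("en_core_web_sm")

# First split the raw text into sentences...
units, unit_types, sent_idxs = segmenter.segment_units(
    "I like tea. It is hot.", segment_type="s",
)

# ...then further segment only the first sentence into phrases.
units, unit_types, sent_idxs = segmenter.segment_units(
    units, ind_segment=[True, False], unit_types=unit_types,
    sent_idxs=sent_idxs, segment_type="ph",
)
```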

append_or_segment_children
append_or_segment_children(children, phrases, phrase_types, doc, max_phrase_length=10)

Append syntactic children of a node as phrases or further segment them.

Parameters:

  • children (generator[Token]) –

    Generator of syntactic children.

  • phrases (list[Span]) –

    List of current phrases.

  • phrase_types (list[str]) –

    List of current phrase types.

  • doc (Doc) –

    spaCy Doc containing the sentence.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • phrases ( list[Span] ) –

    Updated list of phrases.

  • phrase_types ( list[str] ) –

    Updated list of phrase types.

  • need_sort ( bool ) –

    Flag to indicate whether phrases need sorting.

append_or_segment_span
append_or_segment_span(span, phrases, phrase_types, doc, max_phrase_length=10)

Append span to list of phrases or further segment span.

Parameters:

  • span (Span) –

    Span to be appended or further segmented.

  • phrases (list[Span]) –

    List of current phrases.

  • phrase_types (list[str]) –

    List of current phrase types.

  • doc (Doc) –

    spaCy Doc containing the sentence.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • phrases ( list[Span] ) –

    Updated list of phrases.

  • phrase_types ( list[str] ) –

    Updated list of phrase types.

is_not_punct_space
is_not_punct_space(span)

Checks whether each token of a span is not punctuation and not a space.

Returns:

  • ( list[bool] ) –

    A list of Booleans where each element is True iff the corresponding token is not punctuation and not a space.

merge_nbor_of_singleton_phrase
merge_nbor_of_singleton_phrase(nbor, singleton, offset, max_nbor_length)

Decide whether to merge neighbor of singleton (single-token) phrase.

Evaluates conditions to determine if a neighboring phrase should be merged with a singleton phrase.

Parameters:

  • nbor (Span) –

    Neighboring phrase.

  • singleton (Span) –

    Singleton phrase.

  • offset (int) –

    Absolute difference between indices of neighboring and singleton phrases.

  • max_nbor_length (int) –

    Maximum neighbor length for merging in terms of spaCy tokens.

Returns:

  • ret ( bool ) –

    Whether to merge neighbor.

merge_noun_chunk_phrases
merge_noun_chunk_phrases(phrases, phrase_types, noun_chunks, doc)

Merge phrases that constitute a noun chunk.

Parameters:

  • phrases (list[Span]) –

    List of phrases.

  • phrase_types (list[str]) –

    List of phrase types.

  • noun_chunks (generator[Span]) –

    Generator of noun chunks.

  • doc (Doc) –

    spaCy Doc containing the sentence.

Returns:

  • phrases_merged ( list[Span] ) –

    List of merged phrases.

  • phrase_types_merged ( list[str] ) –

    Types of merged phrases.

merge_phrase_spans
merge_phrase_spans(phrases, phrase_types, spans_merge, doc)

Merge phrases within specified spans of phrases.

Parameters:

  • phrases (list[Span]) –

    List of phrases.

  • phrase_types (list[str]) –

    List of phrase types.

  • spans_merge (list[tuple]) –

    List of phrase spans, each a 2-element tuple of a starting phrase index and an ending phrase index.

  • doc (Doc) –

    spaCy Doc containing the sentence.

Returns:

  • phrases_merged ( list[Span] ) –

    List of merged phrases.

  • phrase_types_merged ( list[str] ) –

    Types of merged phrases.

merge_singleton_phrases
merge_singleton_phrases(phrases, phrase_types, doc, max_phrase_length=10)

Merge single-token phrases with their neighbors.

Parameters:

  • phrases (list[Span]) –

    List of phrases.

  • phrase_types (list[str]) –

    List of phrase types.

  • doc (Doc) –

    spaCy Doc containing the sentence.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • phrases_merged ( list[Span] ) –

    List of merged phrases.

  • phrase_types_merged ( list[str] ) –

    Types of merged phrases.

segment_into_phrases
segment_into_phrases(sent, doc, max_phrase_length=10)

Segment sentence (or span within sentence) into phrases.

Parameters:

  • sent (Span) –

    Sentence or span to be segmented.

  • doc (Doc) –

    spaCy Doc containing the sentence.

  • max_phrase_length (int, default: 10 ) –

    Maximum phrase length in terms of spaCy tokens.

Returns:

  • phrases ( list[Span] ) –

    List of segmented phrases.

  • phrase_types ( list[str] ) –

    Types of phrases (e.g., "ROOT", "non-leaf", spaCy dependency labels).

sort_phrases
sort_phrases(phrases, phrase_types)

Sort phrases by their starting token index.

Parameters:

  • phrases (list[Span]) –

    List of phrases.

  • phrase_types (list[str]) –

    List of phrase types.

Returns:

  • phrases ( list[Span] ) –

    Sorted list of phrases.

  • phrase_types ( list[str] ) –

    Types of sorted phrases.

utils

Other utilities for segmenting input text into units.

Functions:

  • exclude_non_alphanumeric –

    Exclude units without alphanumeric characters.

exclude_non_alphanumeric
exclude_non_alphanumeric(unit_types, units)

Exclude units without alphanumeric characters.

Modifies the unit_types list by setting the type of units without alphanumeric characters to "n".

Parameters:

  • unit_types (list[str]) –

    Types of units.

  • units (list[str]) –

    Sequence of units.

Returns:

  • unit_types ( list[str] ) –

    Updated types of units.
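
A small illustration of the re-typing, assuming the import path below:

```python
from icx360.utils.segmenters.utils import exclude_non_alphanumeric  # assumed path

units = ["Hello", ",", "world", "!!!"]
unit_types = ["w", "w", "w", "w"]

# Units with no alphanumeric characters are re-typed "n" (fixed, not perturbed).
unit_types = exclude_non_alphanumeric(unit_types, units)
# unit_types is now ["w", "n", "w", "n"]
```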

subset_utils

Utilities that deal with subsets of input units.

These utilities are used by MExGen C-LIME (icx360.algorithms.mexgen.clime) and L-SHAP (icx360.algorithms.mexgen.lshap).

Functions:

  • mask_subsets –

    Mask subsets of units with a fixed replacement string.

  • sample_subsets –

    Sample subsets of input units that can be replaced.

mask_subsets
mask_subsets(units, subsets_replace, replacement_str)

Mask subsets of units with a fixed replacement string.

Parameters:

  • units (List[str]) –

    Original sequence of units.

  • subsets_replace (List[List[int]]) –

    A list of subsets to replace, where each subset is a list of unit indices.

  • replacement_str (str) –

    String to replace units with (default "" for dropping units).

Returns:

  • input_masked ( List[List[str]] ) –

    A list of masked versions of units, where each masked version corresponds to a subset in subsets_replace.
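
A small illustration, assuming the import path below:

```python
from icx360.utils.subset_utils import mask_subsets  # assumed path

units = ["The", "cat", "sat", "there"]

# One masked copy of `units` per subset of indices.
masked = mask_subsets(units, [[1], [2, 3]], "")
# masked == [["The", "", "sat", "there"], ["The", "cat", "", ""]]
```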

sample_subsets
sample_subsets(idx_replace, max_units_replace, oversampling_factor=None, num_return_sequences=None, empty_subset=False, return_weights=False)

Sample subsets of input units that can be replaced.

Parameters:

  • idx_replace ( (num_replace,) np.ndarray ) –

    Indices of units that can be replaced.

  • max_units_replace (int) –

    Maximum number of units to replace at one time.

  • oversampling_factor (float or None, default: None ) –

    Ratio of number of perturbed inputs to be generated to number of units that can be replaced. Default None means no upper bound on this ratio.

  • num_return_sequences (int or None, default: None ) –

    Number of perturbed inputs to generate for each subset of units to replace.

  • empty_subset (bool, default: False ) –

    Whether to include the empty subset.

  • return_weights (bool, default: False ) –

    Whether to return weights associated with subsets.

Returns:

  • subsets ( list[list[int]] ) –

    A list of subsets, where each subset is a list of unit indices.

  • weights ( list[float] ) –

    Weights associated with subsets, only returned if return_weights==True.
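
A sketch of sampling subsets and turning them into perturbed inputs, assuming the import path below:

```python
import numpy as np

from icx360.utils.subset_utils import mask_subsets, sample_subsets  # assumed path

idx_replace = np.array([0, 1, 2, 3])  # indices of replaceable units

subsets, weights = sample_subsets(
    idx_replace,
    max_units_replace=2,
    oversampling_factor=2.0,
    empty_subset=True,
    return_weights=True,
)

# Feed the sampled subsets to mask_subsets to build perturbed inputs.
masked_inputs = mask_subsets(["The", "cat", "sat", "there"], subsets, "")
```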

toma

Model inference utilities that use the toma package to avoid running out of CUDA memory.

Functions:

  • toma_call –

    Call model using the toma package to adapt to CUDA memory constraints.

  • toma_generate –

    Generate outputs using the toma package to adapt to CUDA memory constraints.

  • toma_get_probs –

    Compute log probabilities of tokens in a given reference output using the toma package to adapt to CUDA memory.

toma_call
toma_call(start, end, model, input_dict, logits, output_hidden_states=False, hidden_states=None)

Call model using the toma package to adapt to CUDA memory constraints.

This function passes a batch of inputs to a transformers classification model. It produces logits and, optionally, hidden states, and stores them in pre-allocated Tensors.

Parameters:

  • start (int) –

    Index of the first input in the batch.

  • end (int) –

    Index of the last input in the batch.

  • model (transformers model) –

    Classification model.

  • input_dict (dict-like) –

    Dict-like object produced by a HuggingFace tokenizer, containing input data.

  • logits ( (num_inputs, num_labels) torch.Tensor ) –

    Pre-allocated Tensor to store logits.

  • output_hidden_states (bool, default: False ) –

    Whether to also output model's hidden states/representations.

  • hidden_states (tuple(Tensor) or None, default: None ) –

    If output_hidden_states == True, then for each layer of the model, a pre-allocated (num_inputs, input_length, hidden_dim) Tensor of hidden states/representations, to be populated by toma_call. Otherwise, None.

Returns:

  • None –

    This function modifies the provided logits Tensor in-place with predicted logits and, if requested, the hidden_states tuple with corresponding hidden states.
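
A sketch of one explicit call, assuming the import path below; within the library, the toma package invokes this function repeatedly with adaptively sized (start, end) ranges to fit CUDA memory:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from icx360.utils.toma import toma_call  # assumed path

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
classifier = AutoModelForSequenceClassification.from_pretrained(name)

texts = ["great movie", "terrible movie", "an average movie"]
input_dict = tokenizer(texts, padding=True, return_tensors="pt")

# Pre-allocate the logits Tensor that toma_call fills in-place.
logits = torch.empty(len(texts), classifier.config.num_labels)

toma_call(0, len(texts), classifier, input_dict, logits)
```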

toma_generate
toma_generate(start, end, model, input_dict, output_ids, output_hidden_states=False, hidden_states=None, **kwargs)

Generate outputs using the toma package to adapt to CUDA memory constraints.

This function passes a batch of inputs to a transformers generative model. It generates token IDs and, optionally, hidden states, and stores them in pre-allocated Tensors.

Parameters:

  • start (int) –

    Index of the first input in the batch.

  • end (int) –

    Index of the last input in the batch.

  • model (transformers model) –

    Generative model.

  • input_dict (dict-like) –

    Dict-like object produced by a HuggingFace tokenizer, containing input data.

  • output_ids ( (num_inputs, gen_start + max_new_tokens) torch.Tensor ) –

    Pre-allocated Tensor to store generated token IDs.

  • output_hidden_states (bool, default: False ) –

    Whether to also output model's hidden states/representations.

  • hidden_states (tuple(Tensor) or None, default: None ) –

    If output_hidden_states == True, then for each layer of the encoder, a pre-allocated (num_inputs, input_length, hidden_dim) Tensor of hidden states/representations, to be populated by toma_generate. Otherwise, None.

  • **kwargs (dict, default: {} ) –

    Additional keyword arguments for the HuggingFace model.

Returns:

  • None –

    This function modifies the provided output_ids Tensor in-place with generated token IDs and, if requested, the hidden_states tuple with corresponding hidden states.

toma_get_probs
toma_get_probs(start, end, model, input_dict, ref_output, log_probs, output_hidden_states=False, hidden_states=None)

Compute log probabilities of tokens in a given reference output using the toma package to adapt to CUDA memory.

This function passes a batch of inputs to a transformers generative model. It computes log probabilities of the reference output tokens conditioned on these inputs and, optionally, hidden states, and stores them in pre-allocated Tensors.

Parameters:

  • start (int) –

    Index of the first input in the batch.

  • end (int) –

    Index of the last input in the batch.

  • model (transformers model) –

    Generative model.

  • input_dict (dict-like) –

    Dict-like object produced by a HuggingFace tokenizer, containing input data.

  • ref_output ( (1, num_output_tokens) torch.Tensor ) –

    Token IDs of reference output to compute log probabilities for.

  • log_probs ( (num_inputs, gen_length) torch.Tensor ) –

    Pre-allocated Tensor to store log probabilities.

  • output_hidden_states (bool, default: False ) –

    Whether to also output model's hidden states/representations.

  • hidden_states (tuple(Tensor) or None, default: None ) –

    If output_hidden_states == True, then for each layer of the model, a pre-allocated (num_inputs, input_length, hidden_dim) Tensor of hidden states/representations, to be populated by toma_get_probs. Otherwise, None.

Returns:

  • None –

    This function modifies the provided log_probs Tensor in-place with predicted log probabilities and, if requested, the hidden_states tuple with corresponding hidden states.