Evaluator API¶
UnitxtEvaluator ¶
Bases: BaseEvaluator
Wrapper around Unitxt that enables evaluation of RAG outputs.
Functions¶
evaluate_metrics ¶
Perform evaluation on the given instances with chosen metric types.
Parameters:

- evaluation_data (list[EvaluationData]) – Instances that hold the data needed for the unitxt algorithms to perform evaluation.
- metrics (Sequence[str]) – Values describing which specific evaluation metrics should be used within the evaluation process.

Returns:

- dict – Dictionary of scores for each EvaluationData instance.
Source code in ai4rag/evaluator/unitxt_evaluator.py
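A usage sketch of `evaluate_metrics` is shown below. The `EvaluationData` fields, the stand-in evaluator body, and the exact shape of the returned score dictionary are assumptions for illustration, not taken from `ai4rag`.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationData:
    """Minimal stand-in for ai4rag's EvaluationData (fields are assumed)."""
    question: str
    answer: str
    contexts: list[str] = field(default_factory=list)


def evaluate_metrics(evaluation_data, metrics):
    """Toy stand-in mirroring the documented signature: returns a dict
    with one placeholder score per EvaluationData instance per metric."""
    return {
        metric: [0.0 for _ in evaluation_data]  # placeholder scores
        for metric in metrics
    }


data = [EvaluationData(question="What is RAG?",
                       answer="Retrieval-augmented generation.")]
scores = evaluate_metrics(data, metrics=["faithfulness", "answer_correctness"])
```

In the real evaluator the scores come from unitxt; the sketch only fixes the call pattern and return shape implied by the signature above.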
get_metric_types classmethod ¶
Perform mapping of general metric names to the specific metric names in the unitxt library.
Parameters:

- metric_types (Sequence[str]) – Metrics defined in the MetricType class.

Returns:

- list[str] – Specific versions of the metrics that can be used within the unitxt evaluation process.
Source code in ai4rag/evaluator/unitxt_evaluator.py
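The mapping performed by `get_metric_types` could look like the following sketch. The concrete unitxt metric identifiers in the dictionary are assumptions for illustration, not read from the `ai4rag` source.

```python
# Hypothetical mapping from general metric names to unitxt identifiers;
# the unitxt names below are illustrative assumptions.
GENERAL_TO_UNITXT = {
    "faithfulness": "metrics.rag.faithfulness",
    "answer_correctness": "metrics.rag.answer_correctness",
}


def get_metric_types(metric_types):
    """Map general metric names to their unitxt-specific counterparts."""
    return [GENERAL_TO_UNITXT[name] for name in metric_types]
```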
decode_unitxt_metric classmethod ¶
Decode metrics from the unitxt names to general names.
Parameters:

- unitxt_metrics (list[str]) – Encoded unitxt metric names.

Returns:

- list[str] – Corresponding decoded general metric names.
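`decode_unitxt_metric` performs the reverse translation; a minimal sketch follows. The unitxt identifiers used as keys are hypothetical, chosen only to illustrate the decoding direction.

```python
# Hypothetical mapping from unitxt identifiers back to general names;
# the unitxt names below are illustrative assumptions.
UNITXT_TO_GENERAL = {
    "metrics.rag.faithfulness": "faithfulness",
    "metrics.rag.answer_correctness": "answer_correctness",
}


def decode_unitxt_metric(unitxt_metrics):
    """Map unitxt-specific metric names back to their general names."""
    return [UNITXT_TO_GENERAL[m] for m in unitxt_metrics]
```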