Hit Rate Metric¶
- pydantic model ibm_watsonx_gov.metrics.hit_rate.hit_rate_metric.HitRateMetric¶
Bases:
GenAIMetric
Defines the Hit Rate metric class.
The Hit Rate metric measures whether there is at least one relevant context among the retrieved contexts. The Context Relevance metric is computed as a prerequisite for this metric.
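Conceptually, the metric reduces the per-context relevance scores to a single hit/no-hit value for each record. A minimal sketch of that reduction, assuming a hypothetical `hit_rate` helper and the default lower-limit relevance threshold of 0.7 (this helper is illustrative and not part of the library API):

```python
def hit_rate(relevance_scores, relevance_threshold=0.7):
    """Return 1.0 if any retrieved context's relevance score meets the
    threshold (a "hit"), else 0.0. Illustrative only."""
    return 1.0 if any(s >= relevance_threshold for s in relevance_scores) else 0.0

hit_rate([0.2, 0.85, 0.4])  # one relevant context -> 1.0
hit_rate([0.1, 0.3])        # no relevant context  -> 0.0
```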
Examples
- Create Hit Rate metric with default parameters and compute using metrics evaluator.
    metric = HitRateMetric()
    result = MetricsEvaluator().evaluate(data={"input_text": "...", "context": "..."},
                                         metrics=[metric])
    # A list of contexts can also be passed as shown below
    result = MetricsEvaluator().evaluate(data={"input_text": "...", "context": ["...", "..."]},
                                         metrics=[metric])
- Create Hit Rate metric with a custom threshold.
    threshold = MetricThreshold(type="lower_limit", value=0.5)
    metric = HitRateMetric(thresholds=[threshold])
- Create Hit Rate metric with the llm_as_judge method (the LLM judge is set on the Context Relevance dependency).
    # Define LLM Judge using watsonx.ai
    # To use other frameworks and models as llm_judge, see :module:`ibm_watsonx_gov.entities.foundation_model`
    llm_judge = LLMJudge(model=WxAIFoundationModel(
        model_id="google/flan-ul2",
        project_id="<PROJECT_ID>"
    ))
    cr_metric = ContextRelevanceMetric(llm_judge=llm_judge)
    hr_metric = HitRateMetric()
    result = MetricsEvaluator().evaluate(data={"input_text": "...", "context": ["...", "..."]},
                                         metrics=[cr_metric, hr_metric])
JSON schema:
{ "title": "HitRateMetric", "description": "Defines the Hit Rate metric class.\n\nThe Hit Rate metric whether there is at least one relevant context among the retrieved contexts.\nThe Context Relevance metric is computed as a pre requisite to compute this metric.\n\nExamples:\n 1. Create Hit Rate metric with default parameters and compute using metrics evaluator.\n .. code-block:: python\n\n metric = HitRateMetric()\n result = MetricsEvaluator().evaluate(data={\"input_text\": \"...\", \"context\": \"...\"},\n metrics=[metric])\n # A list of contexts can also be passed as shown below\n result = MetricsEvaluator().evaluate(data={\"input_text\": \"...\", \"context\": [\"...\", \"...\"]},\n metrics=[metric])\n\n 2. Create Hit Rate metric with a custom threshold.\n .. code-block:: python\n\n threshold = MetricThreshold(type=\"lower_limit\", value=0.5)\n metric = HitRateMetric(method=method, threshold=threshold)\n\n 3. Create Hit Rate metric with llm_as_judge method.\n .. code-block:: python\n\n # Define LLM Judge using watsonx.ai\n # To use other frameworks and models as llm_judge, see :module:`ibm_watsonx_gov.entities.foundation_model`\n llm_judge = LLMJudge(model=WxAIFoundationModel(\n model_id=\"google/flan-ul2\",\n project_id=\"<PROJECT_ID>\"\n ))\n cr_metric = ContextRelevanceMetric(llm_judge=llm_judge)\n ap_metric = HitRateMetric()\n result = MetricsEvaluator().evaluate(data={\"input_text\": \"...\", \"context\": [\"...\", \"...\"]},\n metrics=[cr_metric, ap_metric])", "type": "object", "properties": { "name": { "const": "hit_rate", "default": "hit_rate", "description": "The hit rate metric name.", "title": "Name", "type": "string" }, "thresholds": { "default": [ { "type": "lower_limit", "value": 0.7 } ], "description": "The metric thresholds.", "items": { "$ref": "#/$defs/MetricThreshold" }, "title": "Thresholds", "type": "array" }, "tasks": { "default": [ "retrieval_augmented_generation" ], "description": "The list of supported tasks.", "items": { "$ref": 
"#/$defs/TaskType" }, "title": "Tasks", "type": "array" }, "group": { "$ref": "#/$defs/MetricGroup", "default": "retrieval_quality", "description": "The metric group.", "title": "Group" }, "is_reference_free": { "default": true, "description": "Decides whether this metric needs a reference for computation", "title": "Is Reference Free", "type": "boolean" }, "method": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "description": "The method used to compute the metric.", "title": "Method" }, "metric_dependencies": { "default": [ { "name": "context_relevance", "thresholds": [ { "type": "lower_limit", "value": 0.7 } ], "tasks": [ "retrieval_augmented_generation" ], "group": "retrieval_quality", "is_reference_free": true, "method": "token_precision", "metric_dependencies": [], "llm_judge": null, "compute_per_context": false, "id": "context_relevance_token_precision" } ], "description": "The list of metric dependencies", "items": { "$ref": "#/$defs/GenAIMetric" }, "title": "Metric dependencies", "type": "array" } }, "$defs": { "GenAIMetric": { "description": "Defines the Generative AI metric interface", "properties": { "name": { "description": "The name of the metric", "title": "Metric Name", "type": "string" }, "thresholds": { "default": [], "description": "The list of thresholds", "items": { "$ref": "#/$defs/MetricThreshold" }, "title": "Thresholds", "type": "array" }, "tasks": { "description": "The task types this metric is associated with.", "items": { "$ref": "#/$defs/TaskType" }, "title": "Tasks", "type": "array" }, "group": { "anyOf": [ { "$ref": "#/$defs/MetricGroup" }, { "type": "null" } ], "default": null, "description": "The metric group this metric belongs to." 
}, "is_reference_free": { "default": true, "description": "Decides whether this metric needs a reference for computation", "title": "Is Reference Free", "type": "boolean" }, "method": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "description": "The method used to compute the metric.", "title": "Method" }, "metric_dependencies": { "default": [], "description": "Metrics that needs to be evaluated first", "items": { "$ref": "#/$defs/GenAIMetric" }, "title": "Metric Dependencies", "type": "array" } }, "required": [ "name", "tasks" ], "title": "GenAIMetric", "type": "object" }, "MetricGroup": { "enum": [ "retrieval_quality", "answer_quality", "content_safety", "performance", "usage", "tool_call_quality", "readability" ], "title": "MetricGroup", "type": "string" }, "MetricThreshold": { "description": "The class that defines the threshold for a metric.", "properties": { "type": { "description": "Threshold type. One of 'lower_limit', 'upper_limit'", "enum": [ "lower_limit", "upper_limit" ], "title": "Type", "type": "string" }, "value": { "default": 0, "description": "The value of metric threshold", "title": "Threshold value", "type": "number" } }, "required": [ "type" ], "title": "MetricThreshold", "type": "object" }, "TaskType": { "description": "Supported task types for generative AI models", "enum": [ "question_answering", "classification", "summarization", "generation", "extraction", "retrieval_augmented_generation" ], "title": "TaskType", "type": "string" } } }
- Fields:
  - group (MetricGroup)
  - metric_dependencies (list[GenAIMetric])
  - name (Literal['hit_rate'])
  - tasks (list[TaskType])
  - thresholds (list[MetricThreshold])
- Validators:
  - metric_dependencies_validator » metric_dependencies
- field group: MetricGroup = MetricGroup.RETRIEVAL_QUALITY¶
The metric group.
- field metric_dependencies: list[GenAIMetric] = [ContextRelevanceMetric(name='context_relevance', thresholds=[MetricThreshold(type='lower_limit', value=0.7)], tasks=[TaskType.RAG], group=MetricGroup.RETRIEVAL_QUALITY, is_reference_free=True, method='token_precision', metric_dependencies=[], llm_judge=None, compute_per_context=False, id='context_relevance_token_precision')]¶
The list of metric dependencies.
- Validated by:
  - metric_dependencies_validator
- field name: Literal['hit_rate'] = 'hit_rate'¶
The hit rate metric name.
- field tasks: list[TaskType] = [TaskType.RAG]¶
The list of supported tasks.
- field thresholds: list[MetricThreshold] = [MetricThreshold(type='lower_limit', value=0.7)]¶
The metric thresholds.
- evaluate(data: DataFrame, configuration: GenAIConfiguration | AgenticAIConfiguration, metrics_result: list[AggregateMetricResult], **kwargs) → AggregateMetricResult¶
- validator metric_dependencies_validator » metric_dependencies¶
- model_post_init(context: Any, /) → None¶
We need to both initialize private attributes and call the user-defined model_post_init method.
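The default threshold on this metric is MetricThreshold(type='lower_limit', value=0.7), meaning a computed hit rate below 0.7 breaches the threshold. A minimal sketch of how such a lower/upper-limit check can be applied, using a hypothetical `violates_threshold` helper (illustrative only, not part of the library API):

```python
def violates_threshold(value, threshold_type, limit):
    """Return True if the metric value breaches the threshold.
    lower_limit: values below the limit breach it;
    upper_limit: values above the limit breach it."""
    if threshold_type == "lower_limit":
        return value < limit
    if threshold_type == "upper_limit":
        return value > limit
    raise ValueError(f"unknown threshold type: {threshold_type}")

violates_threshold(0.0, "lower_limit", 0.7)  # hit rate 0.0 -> True (breach)
violates_threshold(1.0, "lower_limit", 0.7)  # hit rate 1.0 -> False
```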
- pydantic model ibm_watsonx_gov.metrics.hit_rate.hit_rate_metric.HitRateResult¶
Bases:
RecordMetricResult
JSON schema:
{ "title": "HitRateResult", "type": "object", "properties": { "name": { "default": "hit_rate", "title": "Name", "type": "string" }, "method": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "description": "The method used to compute this metric result.", "examples": [ "token_recall" ], "title": "Method" }, "provider": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "description": "The provider used to compute this metric result.", "title": "Provider" }, "value": { "anyOf": [ { "type": "number" }, { "type": "string" }, { "type": "boolean" }, { "type": "null" } ], "description": "The metric value.", "title": "Value" }, "errors": { "anyOf": [ { "items": { "$ref": "#/$defs/Error" }, "type": "array" }, { "type": "null" } ], "default": null, "description": "The list of error messages", "title": "Errors" }, "additional_info": { "anyOf": [ { "type": "object" }, { "type": "null" } ], "default": null, "description": "The additional information about the metric result.", "title": "Additional Info" }, "group": { "$ref": "#/$defs/MetricGroup", "default": "retrieval_quality" }, "thresholds": { "default": [], "description": "The metric thresholds", "items": { "$ref": "#/$defs/MetricThreshold" }, "title": "Thresholds", "type": "array" }, "record_id": { "description": "The record identifier.", "examples": [ "record1" ], "title": "Record Id", "type": "string" }, "record_timestamp": { "anyOf": [ { "type": "string" }, { "type": "null" } ], "default": null, "description": "The record timestamp.", "examples": [ "2025-01-01T00:00:00.000000Z" ], "title": "Record Timestamp" } }, "$defs": { "Error": { "properties": { "code": { "description": "The error code", "title": "Code", "type": "string" }, "message_en": { "description": "The error message in English.", "title": "Message En", "type": "string" }, "parameters": { "default": [], "description": "The list of parameters to construct the message in a different locale.", "items": {}, "title": 
"Parameters", "type": "array" } }, "required": [ "code", "message_en" ], "title": "Error", "type": "object" }, "MetricGroup": { "enum": [ "retrieval_quality", "answer_quality", "content_safety", "performance", "usage", "tool_call_quality", "readability" ], "title": "MetricGroup", "type": "string" }, "MetricThreshold": { "description": "The class that defines the threshold for a metric.", "properties": { "type": { "description": "Threshold type. One of 'lower_limit', 'upper_limit'", "enum": [ "lower_limit", "upper_limit" ], "title": "Type", "type": "string" }, "value": { "default": 0, "description": "The value of metric threshold", "title": "Threshold value", "type": "number" } }, "required": [ "type" ], "title": "MetricThreshold", "type": "object" } }, "required": [ "value", "record_id" ] }
- Config:
arbitrary_types_allowed: bool = True
use_enum_values: bool = True
- Fields:
  - group (MetricGroup)
  - name (str)
- field group: MetricGroup = MetricGroup.RETRIEVAL_QUALITY¶
- field name: str = 'hit_rate'¶
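Per the HitRateResult schema above, only "value" and "record_id" are required; the other properties have defaults or may be null. A minimal sketch of a conforming result record, checked against those required fields (field names and defaults are taken from the schema; the record itself is illustrative):

```python
record = {
    "name": "hit_rate",
    "value": 1.0,                 # 1.0: at least one relevant context was retrieved
    "record_id": "record1",
    "group": "retrieval_quality",
    "thresholds": [{"type": "lower_limit", "value": 0.7}],
}

# "value" and "record_id" are the only required properties in the schema
required = {"value", "record_id"}
assert required <= record.keys()
```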