Tool Call Syntactic Accuracy Metric

pydantic model ibm_watsonx_gov.metrics.tool_call_syntactic_accuracy.tool_call_syntactic_accuracy_metric.ToolCallSyntacticAccuracyMetric

Bases: GenAIMetric

Deprecated since version 1.2.0: Use ibm_watsonx_gov.metrics.ToolCallAccuracyMetric with the syntactic method instead.

ToolCallSyntacticAccuracyMetric computes the syntactic correctness of tool calls by validating each tool call against the schemas of the available tools.

The ToolCallSyntacticAccuracy metric is computed using the syntactic_check method.
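To make "validating a tool call against the schema" concrete, here is a minimal sketch of what a syntactic check could look like. It is not the library's implementation: the `TOOLS` layout, the `check_tool_call` and `syntactic_accuracy` helpers, the shape of a tool call record, and the choice to score as the fraction of valid calls are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical tool schemas; the layout is illustrative, not the library's format.
TOOLS = {
    "get_weather": {
        "required": {"city": str},
        "optional": {"unit": str},
    },
    "fetch_stock_price": {
        "required": {"ticker": str},
        "optional": {},
    },
}


def check_tool_call(call: dict[str, Any], tools: dict[str, dict]) -> list[str]:
    """Return the syntactic problems found in one tool call (empty if valid)."""
    name = call.get("name")
    if name not in tools:
        return [f"unknown tool: {name!r}"]

    schema = tools[name]
    args = call.get("args", {})
    allowed = {**schema["required"], **schema["optional"]}
    problems = []

    # Every required parameter must be present.
    for param in schema["required"]:
        if param not in args:
            problems.append(f"missing required parameter: {param}")

    # No parameter outside the schema, and each value must have the declared type.
    for param, value in args.items():
        if param not in allowed:
            problems.append(f"unexpected parameter: {param}")
        elif not isinstance(value, allowed[param]):
            problems.append(f"wrong type for parameter: {param}")

    return problems


def syntactic_accuracy(calls: list[dict], tools: dict[str, dict]) -> float:
    """Assumed scoring: fraction of calls with no problems (1.0 for no calls)."""
    if not calls:
        return 1.0
    valid = sum(1 for call in calls if not check_tool_call(call, tools))
    return valid / len(calls)


calls = [
    {"name": "get_weather", "args": {"city": "Paris"}},       # valid
    {"name": "get_weather", "args": {"unit": "celsius"}},     # missing "city"
    {"name": "fetch_stock_price", "args": {"ticker": 42}},    # wrong type
    {"name": "send_email", "args": {}},                       # unknown tool
]
print(syntactic_accuracy(calls, TOOLS))
```

Only the first call passes every check in this sketch, so one of the four calls counts as syntactically valid.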

Examples

Create the ToolCallSyntacticAccuracy metric by passing the basic configuration.

import pandas as pd

# get_weather and fetch_stock_price are the tools available to the application
config = GenAIConfiguration(tools=[get_weather, fetch_stock_price])
evaluator = MetricsEvaluator(configuration=config)
df = pd.read_csv("")
metrics = [ToolCallSyntacticAccuracyMetric()]
result = evaluator.evaluate(data=df, metrics=metrics)

Create the ToolCallSyntacticAccuracy metric by passing a custom tool calls field in the configuration.

# Read the tool calls from the "tools_used" field of the data
config = GenAIConfiguration(tools=[get_weather, fetch_stock_price],
                            tool_calls_field="tools_used")
evaluator = MetricsEvaluator(configuration=config)
df = pd.read_csv("")
metrics = [ToolCallSyntacticAccuracyMetric()]
result = evaluator.evaluate(data=df, metrics=metrics)

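Create the ToolCallSyntacticAccuracy metric with a custom threshold.

from ibm_watsonx_gov.entities.metric_threshold import MetricThreshold
threshold = MetricThreshold(type="upper_limit", value=0.8)
# The model field is named thresholds and holds a list of MetricThreshold
metric = ToolCallSyntacticAccuracyMetric(thresholds=[threshold])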
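The two threshold types can be read as bounds on the metric value. The sketch below shows one plausible interpretation, assuming a score below a lower_limit or above an upper_limit counts as a violation. The `Threshold` dataclass and `is_violated` function are stand-ins written for this example and are not part of ibm_watsonx_gov; the library's own violation logic may differ.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """Stand-in for MetricThreshold; not the library class."""
    type: str            # "lower_limit" or "upper_limit"
    value: float = 0.0   # mirrors the default of 0 in the schema


def is_violated(score: float, threshold: Threshold) -> bool:
    """Assumed semantics: below a lower limit or above an upper limit violates."""
    if threshold.type == "lower_limit":
        return score < threshold.value
    if threshold.type == "upper_limit":
        return score > threshold.value
    raise ValueError(f"unsupported threshold type: {threshold.type!r}")


# The metric's documented default threshold is lower_limit with value 0.7.
default = Threshold(type="lower_limit", value=0.7)
print(is_violated(0.65, default))  # True
print(is_violated(0.90, default))  # False
```

Under this reading, the default lower_limit of 0.7 flags any evaluation whose syntactic accuracy falls below 0.7.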

JSON schema

{
   "title": "ToolCallSyntacticAccuracyMetric",
   "description": ".. deprecated:: 1.2.0\n    Use :class:`ibm_watsonx_gov.metrics.ToolCallAccuracyMetric` with syntactic method instead.\n\nToolCallSyntacticAccuracyMetric compute the tool call syntactic correctness \nby validating tool call against the schema of the list of available tools.\n\nThe ToolCallSyntacticAccuracy metric will be computed by performing the syntactic checks.\n\nExamples:\n    1. Create ToolCallSyntacticAccuracy metric by passing the basic configuration.\n        .. code-block:: python\n\n            config = GenAIConfiguration(tools = [get_weather,fetch_stock_price])\n            evaluator = MetricsEvaluator(configuration=config)\n            df = pd.read_csv(\"\")\n            metrics = [ToolCallSyntacticAccuracyMetric()]\n            result = evaluator.evaluate(data=df, metrics=metrics)\n\n    2. Create ToolCallSyntacticAccuracy metric by passing custom tool calls field in configuration.\n        .. code-block:: python\n\n            config = GenAIConfiguration(tools = [get_weather,fetch_stock_price],\n                                        tool_calls_field=\"tools_used\")\n            evaluator = MetricsEvaluator(configuration=config)\n            df = pd.read_csv(\"\")\n            metrics = [ToolCallSyntacticAccuracyMetric()]\n            result = evaluator.evaluate(data=df, metrics=metrics)\n\n    3. Create ToolCallSyntacticAccuracy metric with a custom threshold.\n        .. code-block:: python\n\n            threshold  = MetricThreshold(type=\"upper_limit\", value=0.8)\n            metric = ToolCallSyntacticAccuracyMetric(threshold=threshold)",
   "type": "object",
   "properties": {
      "name": {
         "const": "tool_call_syntactic_accuracy",
         "default": "tool_call_syntactic_accuracy",
         "description": "The name of metric.",
         "title": "Metric Name",
         "type": "string"
      },
      "display_name": {
         "const": "Tool Call Syntactic Accuracy",
         "default": "Tool Call Syntactic Accuracy",
         "description": "The tool call syntactic accuracy metric display name.",
         "title": "Display Name",
         "type": "string"
      },
      "type_": {
         "default": "ootb",
         "description": "The type of the metric. Indicates whether the metric is ootb or custom.",
         "examples": [
            "ootb",
            "custom"
         ],
         "title": "Metric type",
         "type": "string"
      },
      "value_type": {
         "default": "numeric",
         "description": "The type of the metric value. Indicates whether the metric value is numeric or categorical.",
         "examples": [
            "numeric",
            "categorical"
         ],
         "title": "Metric value type",
         "type": "string"
      },
      "thresholds": {
         "default": [
            {
               "type": "lower_limit",
               "value": 0.7
            }
         ],
         "description": "Value that defines the violation limit for the metric",
         "items": {
            "$ref": "#/$defs/MetricThreshold"
         },
         "title": "Metric threshold",
         "type": "array"
      },
      "tasks": {
         "default": [
            "retrieval_augmented_generation"
         ],
         "description": "The generative task type.",
         "items": {
            "$ref": "#/$defs/TaskType"
         },
         "title": "Task Type",
         "type": "array"
      },
      "group": {
         "$ref": "#/$defs/MetricGroup",
         "default": "tool_call_quality",
         "description": "The metric group.",
         "title": "Group"
      },
      "is_reference_free": {
         "default": true,
         "description": "Decides whether this metric needs a reference for computation",
         "title": "Is Reference Free",
         "type": "boolean"
      },
      "method": {
         "const": "syntactic_check",
         "default": "syntactic_check",
         "description": "The method used to compute the metric.",
         "title": "Computation Method",
         "type": "string"
      },
      "metric_dependencies": {
         "default": [],
         "description": "Metrics that needs to be evaluated first",
         "items": {
            "$ref": "#/$defs/GenAIMetric"
         },
         "title": "Metric Dependencies",
         "type": "array"
      },
      "applies_to": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": "message",
         "description": "The tag to indicate for which the metric is applied to. Used for agentic application metric computation.",
         "examples": [
            "message",
            "conversation",
            "sub_agent"
         ],
         "title": "Applies to"
      },
      "mapping": {
         "anyOf": [
            {
               "$ref": "#/$defs/Mapping"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "The data mapping details for the metric which are used to read the values needed to compute the metric.",
         "examples": {
            "items": [
               {
                  "attribute_name": "traceloop.entity.input",
                  "column_name": null,
                  "json_path": "$.inputs.input_text",
                  "lookup_child_spans": false,
                  "name": "input_text",
                  "span_name": "LangGraph.workflow",
                  "type": "input"
               },
               {
                  "attribute_name": "traceloop.entity.output",
                  "column_name": null,
                  "json_path": "$.outputs.generated_text",
                  "lookup_child_spans": false,
                  "name": "generated_text",
                  "span_name": "LangGraph.workflow",
                  "type": "output"
               }
            ],
            "source": "trace"
         },
         "title": "Mapping"
      }
   },
   "$defs": {
      "GenAIMetric": {
         "description": "Defines the Generative AI metric interface",
         "properties": {
            "name": {
               "description": "The name of the metric.",
               "examples": [
                  "answer_relevance",
                  "context_relevance"
               ],
               "title": "Metric Name",
               "type": "string"
            },
            "display_name": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The display name of the metric.",
               "examples": [
                  "Answer Relevance",
                  "Context Relevance"
               ],
               "title": "Metric display name"
            },
            "type_": {
               "default": "ootb",
               "description": "The type of the metric. Indicates whether the metric is ootb or custom.",
               "examples": [
                  "ootb",
                  "custom"
               ],
               "title": "Metric type",
               "type": "string"
            },
            "value_type": {
               "default": "numeric",
               "description": "The type of the metric value. Indicates whether the metric value is numeric or categorical.",
               "examples": [
                  "numeric",
                  "categorical"
               ],
               "title": "Metric value type",
               "type": "string"
            },
            "thresholds": {
               "default": [],
               "description": "The list of thresholds",
               "items": {
                  "$ref": "#/$defs/MetricThreshold"
               },
               "title": "Thresholds",
               "type": "array"
            },
            "tasks": {
               "default": [],
               "description": "The task types this metric is associated with.",
               "items": {
                  "$ref": "#/$defs/TaskType"
               },
               "title": "Tasks",
               "type": "array"
            },
            "group": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/MetricGroup"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The metric group this metric belongs to."
            },
            "is_reference_free": {
               "default": true,
               "description": "Decides whether this metric needs a reference for computation",
               "title": "Is Reference Free",
               "type": "boolean"
            },
            "method": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The method used to compute the metric.",
               "title": "Method"
            },
            "metric_dependencies": {
               "default": [],
               "description": "Metrics that needs to be evaluated first",
               "items": {
                  "$ref": "#/$defs/GenAIMetric"
               },
               "title": "Metric Dependencies",
               "type": "array"
            },
            "applies_to": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": "message",
               "description": "The tag to indicate for which the metric is applied to. Used for agentic application metric computation.",
               "examples": [
                  "message",
                  "conversation",
                  "sub_agent"
               ],
               "title": "Applies to"
            },
            "mapping": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/Mapping"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The data mapping details for the metric which are used to read the values needed to compute the metric.",
               "examples": {
                  "items": [
                     {
                        "attribute_name": "traceloop.entity.input",
                        "column_name": null,
                        "json_path": "$.inputs.input_text",
                        "lookup_child_spans": false,
                        "name": "input_text",
                        "span_name": "LangGraph.workflow",
                        "type": "input"
                     },
                     {
                        "attribute_name": "traceloop.entity.output",
                        "column_name": null,
                        "json_path": "$.outputs.generated_text",
                        "lookup_child_spans": false,
                        "name": "generated_text",
                        "span_name": "LangGraph.workflow",
                        "type": "output"
                     }
                  ],
                  "source": "trace"
               },
               "title": "Mapping"
            }
         },
         "required": [
            "name"
         ],
         "title": "GenAIMetric",
         "type": "object"
      },
      "Mapping": {
         "description": "Defines the field mapping details to be used for computing a metric.",
         "properties": {
            "source": {
               "default": "trace",
               "description": "The source type of the data. Use trace if the data should be read from span in trace. Use tabular if the data is passed as a dataframe.",
               "enum": [
                  "trace",
                  "tabular"
               ],
               "examples": [
                  "trace",
                  "tabular"
               ],
               "title": "Source",
               "type": "string"
            },
            "items": {
               "description": "The list of mapping items for the field. They are used to read the data from trace or tabular data for computing the metric.",
               "items": {
                  "$ref": "#/$defs/MappingItem"
               },
               "title": "Mapping Items",
               "type": "array"
            }
         },
         "required": [
            "items"
         ],
         "title": "Mapping",
         "type": "object"
      },
      "MappingItem": {
         "description": "The mapping details to be used for reading the values from the data.",
         "properties": {
            "name": {
               "description": "The name of the item.",
               "examples": [
                  "input_text",
                  "generated_text",
                  "context",
                  "ground_truth"
               ],
               "title": "Name",
               "type": "string"
            },
            "type": {
               "description": "The type of the item.",
               "enum": [
                  "input",
                  "output",
                  "reference",
                  "context",
                  "tool_call"
               ],
               "examples": [
                  "input"
               ],
               "title": "Type",
               "type": "string"
            },
            "column_name": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The column name in the tabular data to be used for reading the field value. Applicable for tabular source.",
               "title": "Column Name"
            },
            "span_name": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The span name in the trace data to be used for reading the field value. Applicable for trace source.",
               "title": "Span Name"
            },
            "attribute_name": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The attribute name in the trace to be used for reading the field value. Applicable for trace source.",
               "title": "Attribute Name"
            },
            "json_path": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "description": "The json path to be used for reading the field value from the attribute value. Applicable for trace source. If not provided, the span attribute value is read as the field value.",
               "title": "Json Path"
            },
            "lookup_child_spans": {
               "anyOf": [
                  {
                     "type": "boolean"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": false,
               "description": "The flag to indicate if all the child spans should be searched for the attribute value. Applicable for trace source.",
               "title": "Look up child spans"
            }
         },
         "required": [
            "name",
            "type"
         ],
         "title": "MappingItem",
         "type": "object"
      },
      "MetricGroup": {
         "enum": [
            "retrieval_quality",
            "answer_quality",
            "content_safety",
            "performance",
            "usage",
            "message_completion",
            "tool_call_quality",
            "readability",
            "custom"
         ],
         "title": "MetricGroup",
         "type": "string"
      },
      "MetricThreshold": {
         "description": "The class that defines the threshold for a metric.",
         "properties": {
            "type": {
               "description": "Threshold type. One of 'lower_limit', 'upper_limit'",
               "enum": [
                  "lower_limit",
                  "upper_limit"
               ],
               "title": "Type",
               "type": "string"
            },
            "value": {
               "default": 0,
               "description": "The value of metric threshold",
               "title": "Threshold value",
               "type": "number"
            }
         },
         "required": [
            "type"
         ],
         "title": "MetricThreshold",
         "type": "object"
      },
      "TaskType": {
         "description": "Supported task types for generative AI models",
         "enum": [
            "question_answering",
            "classification",
            "summarization",
            "generation",
            "extraction",
            "retrieval_augmented_generation"
         ],
         "title": "TaskType",
         "type": "string"
      }
   }
}
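The Mapping and MappingItem definitions in the schema above can be exercised with plain dictionaries. The following sketch builds a mapping for a tabular source and checks it against the required keys and enum values listed in the schema. The `validate_mapping` helper and the `question` and `tool_calls` names are hypothetical; in practice the library's pydantic models perform their own validation.

```python
# Allowed values copied from the Mapping and MappingItem definitions in the schema.
SOURCES = {"trace", "tabular"}
ITEM_TYPES = {"input", "output", "reference", "context", "tool_call"}


def validate_mapping(mapping: dict) -> list[str]:
    """Return a list of schema problems for a mapping dict (empty if it conforms)."""
    errors = []

    # "source" is optional and defaults to "trace".
    source = mapping.get("source", "trace")
    if source not in SOURCES:
        errors.append(f"invalid source: {source!r}")

    # "items" is the only required key of Mapping.
    items = mapping.get("items")
    if not isinstance(items, list):
        errors.append("'items' is required and must be a list")
        return errors

    # Each MappingItem requires "name" and "type", and "type" is an enum.
    for index, item in enumerate(items):
        for key in ("name", "type"):
            if key not in item:
                errors.append(f"items[{index}]: missing required key {key!r}")
        if "type" in item and item["type"] not in ITEM_TYPES:
            errors.append(f"items[{index}]: invalid type {item['type']!r}")

    return errors


# A tabular mapping: column_name applies because the source is "tabular".
mapping = {
    "source": "tabular",
    "items": [
        {"name": "input_text", "type": "input", "column_name": "question"},
        {"name": "tool_calls", "type": "tool_call", "column_name": "tools_used"},
    ],
}
print(validate_mapping(mapping))
```

For a trace source, the items would instead set span_name, attribute_name, and optionally json_path, as in the examples embedded in the schema.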

Fields:

display_name (Annotated[Literal['Tool Call Syntactic Accuracy'], FieldInfo(annotation=NoneType, required=False, default='Tool Call Syntactic Accuracy', title='Display Name', description='The tool call syntactic accuracy metric display name.', frozen=True)])
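group (Annotated[ibm_watsonx_gov.entities.enums.MetricGroup, FieldInfo(annotation=NoneType, required=False, default=<MetricGroup.TOOL_CALL_QUALITY: 'tool_call_quality'>, title='Group', description='The metric group.', frozen=True)])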
method (Annotated[Literal['syntactic_check'], FieldInfo(annotation=NoneType, required=False, default='syntactic_check', title='Computation Method', description='The method used to compute the metric.')])
name (Annotated[Literal['tool_call_syntactic_accuracy'], FieldInfo(annotation=NoneType, required=False, default='tool_call_syntactic_accuracy', title='Metric Name', description='The name of metric.')])
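tasks (Annotated[list[ibm_watsonx_gov.entities.enums.TaskType], FieldInfo(annotation=NoneType, required=False, default=[<TaskType.RAG: 'retrieval_augmented_generation'>], title='Task Type', description='The generative task type.')])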
thresholds (Annotated[list[ibm_watsonx_gov.entities.metric_threshold.MetricThreshold], FieldInfo(annotation=NoneType, required=False, default=[MetricThreshold(type='lower_limit', value=0.7)], title='Metric threshold', description='Value that defines the violation limit for the metric')])

Validators:

field display_name: Annotated[Literal['Tool Call Syntactic Accuracy'], FieldInfo(annotation=NoneType, required=False, default='Tool Call Syntactic Accuracy', title='Display Name', description='The tool call syntactic accuracy metric display name.', frozen=True)] = 'Tool Call Syntactic Accuracy'

The tool call syntactic accuracy metric display name.

Validated by:

validate

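field group: Annotated[MetricGroup, FieldInfo(annotation=NoneType, required=False, default=<MetricGroup.TOOL_CALL_QUALITY: 'tool_call_quality'>, title='Group', description='The metric group.', frozen=True)] = MetricGroup.TOOL_CALL_QUALITY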

The metric group.

Validated by:

validate

field method: Annotated[Literal['syntactic_check'], FieldInfo(annotation=NoneType, required=False, default='syntactic_check', title='Computation Method', description='The method used to compute the metric.')] = 'syntactic_check'

The method used to compute the metric.

Validated by:

validate

field name: Annotated[Literal['tool_call_syntactic_accuracy'], FieldInfo(annotation=NoneType, required=False, default='tool_call_syntactic_accuracy', title='Metric Name', description='The name of metric.')] = 'tool_call_syntactic_accuracy'

The name of metric.

Validated by:

validate

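field tasks: Annotated[list[TaskType], FieldInfo(annotation=NoneType, required=False, default=[<TaskType.RAG: 'retrieval_augmented_generation'>], title='Task Type', description='The generative task type.')] = [TaskType.RAG]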

The generative task type.

Validated by:

validate

field thresholds: Annotated[list[MetricThreshold], FieldInfo(annotation=NoneType, required=False, default=[MetricThreshold(type='lower_limit', value=0.7)], title='Metric threshold', description='Value that defines the violation limit for the metric')] = [MetricThreshold(type='lower_limit', value=0.7)]

Value that defines the violation limit for the metric

Validated by:

validate

evaluate(data: DataFrame | dict, configuration: GenAIConfiguration | AgenticAIConfiguration, **kwargs)

async evaluate_async(data: DataFrame | dict, configuration: GenAIConfiguration | AgenticAIConfiguration, **kwargs) → AggregateMetricResult

Evaluate the data for ToolCallSyntacticAccuracyMetric.

Parameters:
data (pd.DataFrame | dict): Data to be evaluated
configuration (GenAIConfiguration | AgenticAIConfiguration): Metrics configuration

Returns: The computed metrics
Return type: AggregateMetricResult

model_post_init(context: Any, /) → None

We need to both initialize private attributes and call the user-defined model_post_init method.