ARES Configuration
This section describes how to configure ARES for different red-teaming scenarios. ARES uses modular YAML-based configuration files to define targets, goals, strategies, and evaluation logic.
This section describes the YAML configuration expected by the ARES CLI when running evaluations.
ARES expects a configuration file with two main nodes:
target: Defines the endpoint or model to be attacked.red-teaming: Specifies the red-teaming intent and attack prompts.
Basic Configuration
The following example uses the default ARES intent, which probes the target with direct request attack and evaluates responses using keyword matching:
target:
huggingface:
red-teaming:
prompts: assets/pii-seeds.csv
To use a predefined OWASP intent (e.g., owasp-llm-02), which aggregates related attack probes:
target:
huggingface:
red-teaming:
intent: owasp-llm-02
prompts: assets/pii-seeds.csv
Custom Intents
To define a custom intent, you must specify configurations for at least one of the three core ARES components:
goalstrategyevaluation
Use the ares show modules command to list all available modules:
ares show modules
Example structure for a custom intent:
target:
huggingface:
red-teaming:
intent: my-intent
prompts: assets/safety_behaviors_text_subset.csv
my-intent:
goal:
<goal configuration here>
strategy:
<strategy configuration here>
evaluation:
<evaluation configuration here>
You don’t need to define all three nodes. If a node is omitted, ARES will automatically use defaults based on the selected intent. Define only the nodes relevant to your use case.
To list available implementations for each module:
ares show connectors
ares show goals
ares show strategies
ares show evals
To view template configurations for a specific module:
ares show strategies -n <strategy_name>
ares show evals -n keyword
Verbose Mode
To inspect the full configuration used during evaluation, use the -v or --verbose flag:
ares evaluate minimal.yaml -v
ares evaluate minimal.yaml --verbose
Example: Custom Strategy
This example defines a custom intent my-intent with a user-defined strategy my_direct_request:
target:
huggingface:
red-teaming:
intent: my-intent
prompts: assets/safety_behaviors_text_subset.csv
my-intent:
strategy:
my_direct_request:
type: ares.strategies.direct_requests.DirectRequests
input_path: assets/attack_goals.json
output_path: assets/attack_attacks.json
More example YAML files can be found in the example_configs/ directory.