ai4rag¶
RAG Templates Optimization Engine¶
ai4rag is an optimization engine for RAG (Retrieval-Augmented Generation) patterns that is agnostic to the LLM and vector database provider. It accepts benchmark data, a search space definition, and an optimizer configuration, then returns a leaderboard of benchmarked RAG Template instances (called RAG Patterns).
Key Features¶
- Provider-agnostic: Works with any LLM and vector database; see the Provider-agnostic section of the User Guide for details
- Hyperparameter Optimization: Uses advanced HPO algorithms (GAM-based optimizer) to find optimal RAG configurations
- Comprehensive Evaluation: Built-in metrics for faithfulness, answer correctness, and context correctness
- Flexible Search Space: Define and constrain any RAG parameter (models, chunk sizes, retrieval methods, etc.)
- Event-Driven Architecture: Track experiment progress with custom event handlers
- Production Ready: Designed for real-world RAG optimization workflows
Quick Example¶
from ai4rag.core.experiment.experiment import AI4RAGExperiment
from ai4rag.core.hpo.gam_opt import GAMOptSettings
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.utils.event_handler import LocalEventHandler
from pathlib import Path
# Define search space
search_space = AI4RAGSearchSpace(params=[...])
# Configure optimizer
optimizer_settings = GAMOptSettings(max_evals=10, n_random_nodes=4)
# Run experiment (assumes documents and benchmark_data were loaded beforehand)
experiment = AI4RAGExperiment(
    documents=documents,
    benchmark_data=benchmark_data,
    search_space=search_space,
    vector_store_type="chroma",
    optimizer_settings=optimizer_settings,
    event_handler=LocalEventHandler(
        output_path=Path(__file__).parent / "ai4rag_results"
    ),
)
best_pattern = experiment.search()
How It Works¶
graph TB
A[Documents]
B[Benchmark Data]
C[Search Space Definition]
D[Experiment Engine]
E[HPO Optimizer]
subgraph X[RAG Pattern]
G[Chunking]
H[Embedding]
I[Vector Store]
J[Retrieval]
K[Generation]
end
M[Evaluation & Metrics Computation]
N[Best RAG Pattern]
O[Results Artifacts]
P[Events Callbacks]
A --> D
B --> D
C --> D
E <--> D
D --> X
G --> H
H --> I
I --> J
J --> K
X --> M
M --> E
D --> N
D --> O
D --> P
- Documents and `benchmark_data.json` are prepared following the desired schema
- Search Space defines the possible parameter combinations (models, chunk sizes, retrieval methods, etc.)
- Optimizer (optimization engine) explores configurations using an objective function with the given RAG Template
- Evaluation of each configuration uses the selected metrics, computed by the `Evaluator` (default `unitxt`)
- Results are returned, containing the optimal RAG Pattern with the best performance
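The flow above can be sketched in plain Python. This is a minimal illustration and does not use the ai4rag API: the parameter names, the `evaluate` stand-in objective, and the `run_search` helper are all hypothetical, and uniform random sampling is swapped in for the GAM-based optimizer.

```python
import random
from itertools import product

# Hypothetical search space: parameter name -> candidate values.
SEARCH_SPACE = {
    "chunk_size": [256, 512, 1024],
    "retrieval_method": ["simple", "window"],
    "top_k": [3, 5],
}


def evaluate(config: dict) -> float:
    """Stand-in objective. A real run would build the RAG pattern
    and score its answers against the benchmark data."""
    score = 0.5
    score += 0.2 if config["chunk_size"] == 512 else 0.0
    score += 0.1 if config["retrieval_method"] == "window" else 0.0
    score += 0.05 if config["top_k"] == 5 else 0.0
    return round(score, 2)


def run_search(space: dict, max_evals: int, seed: int = 0) -> list[tuple[float, dict]]:
    """Sample up to max_evals distinct configurations and
    return a leaderboard sorted best-first."""
    names = list(space)
    candidates = [dict(zip(names, values)) for values in product(*space.values())]
    rng = random.Random(seed)
    sampled = rng.sample(candidates, k=min(max_evals, len(candidates)))
    leaderboard = [(evaluate(cfg), cfg) for cfg in sampled]
    leaderboard.sort(key=lambda item: item[0], reverse=True)
    return leaderboard


leaderboard = run_search(SEARCH_SPACE, max_evals=12)
best_score, best_config = leaderboard[0]
```

With `max_evals` equal to the size of the space, every configuration is evaluated once. A smaller budget evaluates only a sample, which is the situation where a model-based optimizer is meant to choose the next candidates more carefully than random sampling does.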
What's Included¶
Core Components¶
- Experiment Engine: Orchestrates the full optimization lifecycle
- Hyperparameter Optimizer: GAM-based optimization algorithm
- Search Space: Flexible parameter definition with validation rules
- Evaluator: metrics calculation (`faithfulness`, `answer_correctness`, `context_correctness`)
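To illustrate what a search space with validation rules might look like, here is a small sketch in plain Python. It is not the `AI4RAGSearchSpace` API: the `expand_search_space` helper, the parameter names, and the overlap rule are assumptions made for this example.

```python
from itertools import product
from typing import Callable

Config = dict[str, object]
Rule = Callable[[Config], bool]


def expand_search_space(params: dict[str, list], rules: list[Rule]) -> list[Config]:
    """Return every parameter combination that satisfies all rules."""
    names = list(params)
    combos = (dict(zip(names, values)) for values in product(*params.values()))
    return [cfg for cfg in combos if all(rule(cfg) for rule in rules)]


# Hypothetical parameters and rule, for illustration only.
params = {
    "chunk_size": [256, 512],
    "chunk_overlap": [64, 256, 512],
}
# The overlap must be strictly smaller than the chunk size.
rules = [lambda cfg: cfg["chunk_overlap"] < cfg["chunk_size"]]

valid_configs = expand_search_space(params, rules)
```

Filtering invalid combinations before optimization keeps the evaluation budget for configurations that can actually be built.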
RAG Components¶
- Foundation Model: LLM integration via the `BaseFoundationModel` interface
- Embedding Model: embedding model integration via `BaseEmbeddingModel`
- Vector Store: selected from the supported ones (Milvus via Llama Stack, and Chroma) or introduced by the user with the `BaseVectorStore` interface
- Chunking: document splitting into smaller chunks
- Retrieval: simple and window-based retrieval strategies
- Templates: complete RAG implementations defined as a `RAGTemplate`
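The chunking and retrieval steps can be sketched as follows. This is an illustrative sketch, not ai4rag's implementation: the `chunk_text` and `expand_window` helpers are hypothetical, chunking is done on characters for simplicity, and "window-based" is assumed here to mean returning a matched chunk together with its neighbouring chunks.

```python
def chunk_text(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into fixed-size character chunks, with optional overlap."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def expand_window(chunks: list[str], hit_index: int, window_size: int) -> list[str]:
    """Return the matched chunk plus up to window_size neighbours on each side."""
    start = max(0, hit_index - window_size)
    end = min(len(chunks), hit_index + window_size + 1)
    return chunks[start:end]


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=5)
# Suppose retrieval matched the chunk at index 2; widen it by one chunk per side.
context = expand_window(chunks, hit_index=2, window_size=1)
```

Under this assumption, simple retrieval would pass only `chunks[2]` to the generation step, while the windowed variant passes the surrounding chunks as well, trading a longer prompt for more context.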
Requirements¶
Llama Stack Integration
ai4rag works with a Llama Stack server. To run an experiment based on Llama Stack, you will need:
- At least one foundation model (for text generation)
- At least one embedding model (for document embeddings)
- A configured vector database (e.g., Milvus) or a local Chroma instance
Getting Started¶
- Installation: Install ai4rag with pip and set up your environment
- Quick Start: Run your first RAG optimization in minutes
- User Guide: Deep dive into search spaces, optimizers, and evaluation
- API Reference: Complete API documentation for all components
Community and Support¶
- GitHub Repository: IBM/ai4rag
- Issue Tracker: Report bugs or request features
- Contributing: See our contribution guidelines
License¶
ai4rag is released under the Apache License 2.0. See LICENSE for details.
Copyright © 2025-2026 IBM Corp.