ai4rag

RAG Templates Optimization Engine

Supported Python versions: 3.12, 3.13

ai4rag is an optimization engine for RAG (Retrieval-Augmented Generation) patterns that is agnostic to the LLM and vector database provider. It accepts benchmark data, a search space definition, and an optimizer configuration, then returns a leaderboard of benchmarked RAG Template instances (called RAG Patterns).


Key Features

  • Provider-agnostic: Works with any LLM and vector database. For more information, see the Provider-agnostic section in the User Guide
  • Hyperparameter Optimization: Uses a GAM-based HPO algorithm to find optimal RAG configurations
  • Comprehensive Evaluation: Built-in metrics for faithfulness, answer correctness, and context correctness
  • Flexible Search Space: Define and constrain any RAG parameter (models, chunk sizes, retrieval methods, etc.)
  • Event-Driven Architecture: Track experiment progress with custom event handlers
  • Production Ready: Designed for real-world RAG optimization workflows
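
To make the idea of a constrained search space concrete, here is a minimal sketch in plain Python. It does not use the ai4rag API: the parameter names, the values, and the `is_valid` constraint are all hypothetical, chosen only to show how a space expands into candidate configurations and how a rule prunes invalid ones.

```python
from itertools import product

# Hypothetical parameter names and values, for illustration only.
search_space = {
    "chunk_size": [256, 512, 1024],
    "chunk_overlap": [0, 128, 256],
    "retrieval_method": ["simple", "window"],
}


def is_valid(config: dict) -> bool:
    """Example constraint: overlap must be less than half of the chunk size."""
    return config["chunk_overlap"] * 2 < config["chunk_size"]


def enumerate_configs(space: dict, constraint) -> list[dict]:
    """Expand the space into every combination that satisfies the constraint."""
    names = list(space)
    combos = (dict(zip(names, values)) for values in product(*space.values()))
    return [config for config in combos if constraint(config)]


configs = enumerate_configs(search_space, is_valid)
print(f"{len(configs)} valid configurations out of {3 * 3 * 2}")
```

In ai4rag itself the space is described with AI4RAGSearchSpace; the sketch only illustrates why constraints matter, since they shrink the set of configurations the optimizer has to consider.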

Quick Example

from pathlib import Path

from ai4rag.core.experiment.experiment import AI4RAGExperiment
from ai4rag.core.hpo.gam_opt import GAMOptSettings
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.utils.event_handler import LocalEventHandler

# Define search space
search_space = AI4RAGSearchSpace(params=[...])

# Configure optimizer
optimizer_settings = GAMOptSettings(max_evals=10, n_random_nodes=4)

# Run experiment (`documents` and `benchmark_data` must be loaded beforehand; not shown here)
experiment = AI4RAGExperiment(
    documents=documents,
    benchmark_data=benchmark_data,
    search_space=search_space,
    vector_store_type="chroma",
    optimizer_settings=optimizer_settings,
    event_handler=LocalEventHandler(
        output_path=Path(__file__).parent / "ai4rag_results"
    )
)

best_pattern = experiment.search()

How It Works

graph TB
    A[Documents]
    B[Benchmark Data]
    C[Search Space Definition]

    D[Experiment Engine]
    E[HPO Optimizer]

    subgraph X[RAG Pattern]
        G[Chunking]
        H[Embedding]
        I[Vector Store]
        J[Retrieval]
        K[Generation]
    end

    M[Evaluation & Metrics Computation]

    N[Best RAG Pattern]
    O[Results Artifacts]
    P[Events Callbacks]

    A --> D
    B --> D
    C --> D
    E <--> D
    D --> X
    G --> H
    H --> I
    I --> J
    J --> K
    X --> M
    M --> E
    D --> N
    D --> O
    D --> P

  1. Documents and benchmark_data.json are prepared following the required schema
  2. Search Space defines possible parameter combinations (models, chunk sizes, retrieval methods, etc.)
  3. Optimizer (optimization engine) explores configurations by calling an objective function with the given RAG Template
  4. Evaluation scores each configuration with the selected metrics, computed by the Evaluator (unitxt by default)
  5. Results are returned, including the best-performing RAG Pattern
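
The steps above can be sketched as a small loop in plain Python. This is an illustration under stated assumptions, not ai4rag's implementation: random sampling stands in for the GAM-based optimizer, `toy_objective` stands in for building and evaluating a RAG Pattern, and the names `run_search` and `on_event` are hypothetical.

```python
import random


def run_search(configs, objective, max_evals, on_event=None, seed=0):
    """Evaluate up to max_evals configurations and return a leaderboard, best first."""
    rng = random.Random(seed)
    # Stand-in for the optimizer: pick candidates at random without replacement.
    candidates = rng.sample(configs, k=min(max_evals, len(configs)))
    leaderboard = []
    for step, config in enumerate(candidates, start=1):
        score = objective(config)  # in a real run: build the pattern, then evaluate it
        leaderboard.append({"config": config, "score": score})
        if on_event is not None:
            on_event({"step": step, "config": config, "score": score})
    leaderboard.sort(key=lambda row: row["score"], reverse=True)
    return leaderboard


def toy_objective(config):
    """Toy score that peaks at chunk_size == 512 (purely illustrative)."""
    return 1.0 - abs(config["chunk_size"] - 512) / 1024


configs = [{"chunk_size": size} for size in (128, 256, 512, 1024, 2048)]
events = []  # the callback simply records every progress event
leaderboard = run_search(configs, toy_objective, max_evals=5, on_event=events.append)
best = leaderboard[0]
print(best)
```

A model-based optimizer differs from this sketch in one important way: it uses the scores collected so far to decide which configuration to try next, instead of sampling blindly.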

What's Included

Core Components

  • Experiment Engine: Orchestrates the full optimization lifecycle
  • Hyperparameter Optimizer: GAM-based optimization algorithm
  • Search Space: Flexible parameter definition with validation rules
  • Evaluator: Metrics calculation (faithfulness, answer_correctness, context_correctness)
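
To give a feel for what an evaluator computes per benchmark question, here is a deliberately simplified sketch. It uses token-overlap F1 between a generated answer and a reference answer as a stand-in for an answer-correctness-style score. This is an assumption made for illustration: it is not the metric definition used by unitxt or ai4rag, and `token_f1` is a hypothetical helper.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two strings (case-insensitive, whitespace tokens)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = token_f1("The cat sat", "the cat")
print(round(score, 3))
```

Averaging such a per-question score over the whole benchmark yields one number per configuration, which is what an optimizer needs in order to rank candidates.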

RAG Components

  • Foundation Model: LLM integration via BaseFoundationModel interface
  • Embedding Model: Embedding model integration via BaseEmbeddingModel interface
  • Vector Store: Choose a supported store (Milvus via Llama Stack, or Chroma) or supply your own through the BaseVectorStore interface
  • Chunking: Document splitting into smaller chunks
  • Retrieval: Simple and window-based retrieval strategies
  • Templates: Complete RAG implementations defined as a RAGTemplate
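
Two of the components above, chunking and window-based retrieval, can be illustrated with a short sketch. It assumes character-based chunks with a fixed overlap, and it assumes that "window-based" means returning the matched chunk together with its neighbours; both are assumptions for illustration, and `chunk_text` and `expand_window` are hypothetical helpers, not ai4rag functions.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into character chunks of chunk_size that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last chunk is not fully contained in the previous one.
    last_start_bound = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start_bound, step)]


def expand_window(chunks: list[str], hit_index: int, window_size: int) -> list[str]:
    """Return the matched chunk plus up to window_size neighbours on each side."""
    start = max(0, hit_index - window_size)
    end = min(len(chunks), hit_index + window_size + 1)
    return chunks[start:end]


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=0)
print(chunks)
print(expand_window(chunks, hit_index=1, window_size=1))
```

Simple retrieval would pass only the matched chunk to the generator; the window variant trades a longer prompt for more surrounding context, which is why both chunk size and retrieval strategy are natural parameters to tune together.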

Requirements

Llama Stack Integration

ai4rag works with a Llama Stack server. To run an experiment based on Llama Stack, you will need:

  • At least one foundation model (for text generation)
  • At least one embedding model (for document embeddings)
  • A configured vector database (e.g., Milvus) or a local Chroma instance

Getting Started

  • Installation: Install ai4rag with pip and set up your environment (Installation Guide)
  • Quick Start: Run your first RAG optimization in minutes
  • User Guide: Deep dive into search spaces, optimizers, and evaluation
  • API Reference: Complete API documentation for all components


License

ai4rag is released under the Apache License 2.0. See LICENSE for details.

Copyright © 2025-2026 IBM Corp.