Search Space¶
The search space defines all possible RAG configurations that the optimizer can explore. It specifies which parameters to optimize, what values they can take, and the rules that govern valid combinations.
What Is a Search Space?¶
A search space is a collection of parameters that define a RAG configuration:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
search_space = AI4RAGSearchSpace(
params=[
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 2048]),
Parameter(name="chunk_overlap", param_type="C", values=[64, 128, 256]),
Parameter(name="retrieval_method", param_type="C", values=["simple", "window"]),
]
)
This search space has 3 × 3 × 2 = 18 possible combinations (before validation rules).
The optimizer explores this space to find the combination that maximizes your chosen evaluation metric.
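The raw grid above can be sketched in plain Python (independent of ai4rag) to see where the 18 comes from:

```python
from itertools import product

# Plain-Python sketch of the raw grid (before validation rules);
# the value lists mirror the search space defined in this section.
values = {
    "chunk_size": [512, 1024, 2048],
    "chunk_overlap": [64, 128, 256],
    "retrieval_method": ["simple", "window"],
}

combinations = [
    dict(zip(values, combo)) for combo in product(*values.values())
]
print(len(combinations))  # 3 * 3 * 2 = 18
```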
Parameter Types¶
ai4rag supports four parameter types:
| Type | Code | Description | Example |
|---|---|---|---|
| Categorical | "C" | Discrete set of values (strings, objects) | ["recursive", "markdown"] |
| Integer | "I" | Integer range with min/max | v_min=100, v_max=1000 |
| Real | "R" | Continuous range (float) | v_min=0.0, v_max=1.0 |
| Boolean | "B" | True or False | values=[True, False] |
Categorical Parameters ("C")¶
Define a discrete set of possible values.
Common use cases:
- Model selection
- Method choices (chunking method, retrieval method, ranker strategy)
- Discrete numeric values (chunk sizes, number of chunks)
Example 1: String values
Parameter(
name="chunking_method",
param_type="C",
values=["recursive", "markdown", "markdown_header"]
)
Example 2: Numeric values
Parameter(
    name="number_of_chunks",
    param_type="C",
    values=[3, 5, 10]
)
Example 3: Model objects
from ai4rag.rag.foundation_models.llama_stack import LSFoundationModel
Parameter(
name="foundation_model",
param_type="C",
values=[
LSFoundationModel(model_id="ollama/llama3.2:3b", client=client),
LSFoundationModel(model_id="ollama/llama3.1:8b", client=client),
]
)
Categorical for Discrete Numerics
Even for numeric parameters like chunk_size, use Categorical ("C") when you want to test specific values rather than a continuous range. This gives you more control over which values are tested.
Integer Parameters ("I")¶
Define an integer range with minimum and maximum bounds.
Syntax:
Parameter(
name="parameter_name",
param_type="I",
v_min=100, # Minimum value (inclusive)
v_max=1000 # Maximum value (inclusive)
)
Example:
Parameter(
name="chunk_size",
param_type="I",
v_min=200,
v_max=2048
)
# Generates: [200, 201, 202, ..., 2048]
Large Integer Ranges
Be cautious with large ranges. v_min=100, v_max=5000 creates 4,901 possible values, which multiplies the search space size dramatically when combined with other parameters. Consider using Categorical with specific values instead.
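The arithmetic behind that warning, as a quick plain-Python check:

```python
# An inclusive integer range v_min=100, v_max=5000 enumerates 4,901 values.
range_size = 5000 - 100 + 1
print(range_size)  # 4901

# Combined with even modest categorical parameters the grid multiplies fast,
# e.g. 3 chunk_overlap values x 2 retrieval methods:
print(range_size * 3 * 2)  # 29406
```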
Real Parameters ("R")¶
Define a continuous floating-point range.
Syntax:
Parameter(
    name="parameter_name",
    param_type="R",
    v_min=0.0,  # Minimum value (inclusive)
    v_max=1.0   # Maximum value (inclusive)
)
Real Type Not Fully Supported
Currently, Real parameters cannot be enumerated (no .all_values() method). For practical optimization, use Categorical with discrete float values instead:
Parameter(
    name="ranker_alpha",
    param_type="C",
    values=[0.3, 0.5, 0.7]
)
Boolean Parameters ("B")¶
True/False parameters.
Syntax:
Parameter(
    name="parameter_name",
    param_type="B",
    values=[True, False]
)
Required Parameters¶
Two parameters are always required in an AI4RAGSearchSpace:
1. foundation_model¶
The LLM used for text generation.
Example (Llama Stack):
from ai4rag.rag.foundation_models.llama_stack import LSFoundationModel
Parameter(
name="foundation_model",
param_type="C",
values=[
LSFoundationModel(model_id="ollama/llama3.2:3b", client=client)
]
)
Example (OpenAI-compatible):
from ai4rag.rag.foundation_models.openai_model import OpenAIFoundationModel
from openai import OpenAI
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
Parameter(
name="foundation_model",
param_type="C",
values=[
OpenAIFoundationModel(model_id="gpt-4o-mini", client=client, params={})
]
)
2. embedding_model¶
The model used for generating document and query embeddings.
Example (Llama Stack):
from ai4rag.rag.embedding.llama_stack import LSEmbeddingModel
Parameter(
name="embedding_model",
param_type="C",
values=[
LSEmbeddingModel(
model_id="ollama/nomic-embed-text:latest",
client=client,
params={"embedding_dimension": 768, "context_length": 8192}
)
]
)
Example (OpenAI-compatible):
from ai4rag.rag.embedding.openai_model import OpenAIEmbeddingModel
from openai import OpenAI
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
Parameter(
name="embedding_model",
param_type="C",
values=[
OpenAIEmbeddingModel(
model_id="text-embedding-3-small",
client=client,
params={"embedding_dimension": 1536, "context_length": 8191}
)
]
)
Embedding Model params
The params dict should include:
- embedding_dimension: Vector size (e.g., 768, 1536)
- context_length: Maximum tokens the model can process (used for validation)
Default Parameters¶
If you don't specify certain parameters, AI4RAGSearchSpace uses sensible defaults. These defaults differ slightly between ChromaDB and Llama Stack vector stores.
Default Values¶
| Parameter | Default (Llama Stack) | Default (ChromaDB) | Type |
|---|---|---|---|
| chunking_method | ("recursive",) | ("recursive",) | Categorical |
| chunk_size | (1024, 2048) | (1024, 2048) | Categorical |
| chunk_overlap | (128, 256) | (128, 256) | Categorical |
| retrieval_method | ("simple",) | ("simple", "window") | Categorical |
| window_size | (0,) | (0, 1, 3, 5) | Categorical |
| number_of_chunks | (3, 5, 10) | (3, 5, 10) | Categorical |
| search_mode | ("vector", "hybrid") | ("vector",) | Categorical |
| ranker_strategy | ("", "rrf", "weighted") | N/A | Categorical |
| ranker_k | (0, 60) | N/A | Categorical |
| ranker_alpha | (1, 0.5) | N/A | Categorical |
Why Different Defaults?
- ChromaDB doesn't support hybrid search, so search_mode is fixed to "vector" and ranker parameters are excluded
- ChromaDB defaults include window retrieval options since it's an in-memory store (faster experimentation)
- Llama Stack defaults focus on simple retrieval but include hybrid search exploration
Overriding Defaults¶
User-provided parameters always override defaults:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
search_space = AI4RAGSearchSpace(
params=[
# Required parameters
Parameter(name="foundation_model", param_type="C", values=[model]),
Parameter(name="embedding_model", param_type="C", values=[embedding]),
# Override default chunk_size
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 1536]),
# chunk_overlap, retrieval_method, etc. will use defaults
]
)
Validation Rules¶
AI4RAGSearchSpace enforces built-in validation rules to filter out invalid parameter combinations.
Rule 1: Chunk Size Greater Than Chunk Overlap¶
Rule: chunk_size > 2 * chunk_overlap
Why: Text chunkers need enough non-overlapping content to create meaningful chunks.
Example:
# Valid
{"chunk_size": 1024, "chunk_overlap": 256} # 1024 > 2*256 = 512 ✓
# Invalid (filtered out)
{"chunk_size": 512, "chunk_overlap": 300} # 512 > 2*300 = 600 ✗
Exception: chunk_size = 0
When chunk_size is 0 (used for markdown_header structural-only splitting), this rule is skipped.
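The rule, including the chunk_size == 0 exception, can be sketched as a plain predicate (the function name is illustrative, not part of the ai4rag API):

```python
def chunk_sizes_valid(chunk_size: int, chunk_overlap: int) -> bool:
    """Sketch of Rule 1: chunk_size must exceed twice the overlap,
    except when chunk_size == 0 (markdown_header structural splitting)."""
    if chunk_size == 0:
        return True  # rule skipped for structural-only splitting
    return chunk_size > 2 * chunk_overlap

print(chunk_sizes_valid(1024, 256))  # True:  1024 > 512
print(chunk_sizes_valid(512, 300))   # False: 512 <= 600
print(chunk_sizes_valid(0, 128))     # True:  exception applies
```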
Rule 2: Retrieval Method and Window Size Consistency¶
Rule:
- When retrieval_method == "simple", window_size must be 0
- When retrieval_method == "window", window_size must be > 0
Why: Window retrieval requires a non-zero window; simple retrieval doesn't use windows.
Example:
# Valid
{"retrieval_method": "simple", "window_size": 0} ✓
{"retrieval_method": "window", "window_size": 3} ✓
# Invalid (filtered out)
{"retrieval_method": "simple", "window_size": 2} ✗
{"retrieval_method": "window", "window_size": 0} ✗
Rule 3: Chunk Size Within Embedding Context Length¶
Rule: Estimated token count of chunk_size must fit within the embedding model's context_length.
How it works: Uses a conservative ratio of 3.6 characters per token to estimate tokens:
estimated_tokens = chunk_size / 3.6
# Example
chunk_size = 1024
estimated_tokens = 1024 / 3.6 ≈ 284 tokens
# Check: 284 <= context_length (e.g., 8192) ✓
Why: If chunks exceed the embedding model's context window, embedding generation will fail or be truncated.
Example:
# Embedding model with context_length = 512
embedding = LSEmbeddingModel(
model_id="small-embedder",
params={"context_length": 512, "embedding_dimension": 384}
)
# Valid
{"chunk_size": 1024, "embedding_model": embedding} # ~284 tokens ✓
# Invalid (filtered out)
{"chunk_size": 2048, "embedding_model": embedding} # ~569 tokens ✗ (exceeds 512)
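The same check as a standalone sketch (the function name is illustrative, not the ai4rag implementation):

```python
def fits_embedding_context(chunk_size: int, context_length: int) -> bool:
    """Sketch of Rule 3: estimate tokens at 3.6 characters per token
    and require the estimate to fit in the embedding context window."""
    estimated_tokens = chunk_size / 3.6
    return estimated_tokens <= context_length

print(fits_embedding_context(1024, 512))  # True  (~284 tokens)
print(fits_embedding_context(2048, 512))  # False (~569 tokens)
```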
Rule 4: Hybrid Search Ranker Consistency¶
Rule: Ranker parameters must only be set when search_mode == "hybrid".
Sub-rules:
- When search_mode == "vector":
    - ranker_strategy must be "" (empty string)
    - ranker_k must be 0
    - ranker_alpha must be 1 (sentinel for vector-only)
- When search_mode == "hybrid":
    - ranker_strategy must be a non-empty string ("rrf", "weighted", or "normalized")
Example:
# Valid: vector mode
{"search_mode": "vector", "ranker_strategy": "", "ranker_k": 0, "ranker_alpha": 1} ✓
# Valid: hybrid mode
{"search_mode": "hybrid", "ranker_strategy": "rrf", "ranker_k": 60, "ranker_alpha": 1} ✓
# Invalid (filtered out)
{"search_mode": "vector", "ranker_strategy": "rrf", "ranker_k": 60, "ranker_alpha": 1} ✗
Rule 5: ranker_k Only for RRF¶
Rule:
- When ranker_strategy == "rrf", ranker_k must be > 0
- When ranker_strategy != "rrf", ranker_k must be 0 (sentinel)
Example:
# Valid
{"ranker_strategy": "rrf", "ranker_k": 60} ✓
{"ranker_strategy": "weighted", "ranker_k": 0} ✓
# Invalid (filtered out)
{"ranker_strategy": "rrf", "ranker_k": 0} ✗
{"ranker_strategy": "weighted", "ranker_k": 60} ✗
Rule 6: ranker_alpha Only for Weighted¶
Rule:
- When ranker_strategy == "weighted", ranker_alpha must be != 1 (valid range: 0 to <1)
- When ranker_strategy != "weighted", ranker_alpha must be 1 (sentinel for 100% dense / vector-only)
Example:
# Valid
{"ranker_strategy": "weighted", "ranker_alpha": 0.7} ✓
{"ranker_strategy": "rrf", "ranker_alpha": 1} ✓
# Invalid (filtered out)
{"ranker_strategy": "weighted", "ranker_alpha": 1} ✗
{"ranker_strategy": "rrf", "ranker_alpha": 0.5} ✗
Sentinel Values¶
Sentinel values are special placeholder values used to indicate "not applicable" for optional parameters:
| Parameter | Sentinel Value | Meaning |
|---|---|---|
| ranker_strategy | "" (empty string) | No ranker (vector-only search) |
| ranker_k | 0 | Parameter not used |
| ranker_alpha | 1 | 100% dense / vector-only (not applicable) |
When search_mode == "vector", all ranker parameters must be sentinels to indicate they're unused.
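Rules 4-6 together amount to a single consistency check over the ranker parameters. A hedged plain-Python sketch (not the ai4rag implementation):

```python
def ranker_params_consistent(combo: dict) -> bool:
    """Sketch of Rules 4-6: ranker parameters must be sentinels in
    vector mode, and internally consistent in hybrid mode."""
    strategy = combo["ranker_strategy"]
    k, alpha = combo["ranker_k"], combo["ranker_alpha"]

    if combo["search_mode"] == "vector":
        # Rule 4: all ranker parameters must be sentinels
        return strategy == "" and k == 0 and alpha == 1

    # Hybrid mode: a strategy must be set (Rule 4)
    if strategy == "":
        return False
    # Rule 5: ranker_k is > 0 exactly when the strategy is "rrf"
    if (strategy == "rrf") != (k > 0):
        return False
    # Rule 6: ranker_alpha differs from 1 exactly when the strategy is "weighted"
    if (strategy == "weighted") != (alpha != 1):
        return False
    return True

print(ranker_params_consistent(
    {"search_mode": "hybrid", "ranker_strategy": "rrf",
     "ranker_k": 60, "ranker_alpha": 1}))  # True
print(ranker_params_consistent(
    {"search_mode": "vector", "ranker_strategy": "rrf",
     "ranker_k": 60, "ranker_alpha": 1}))  # False
```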
Custom Validation Rules¶
You can add custom rules beyond the built-in ones:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
def my_custom_rule(combination: dict) -> bool:
"""Only allow large chunks with window retrieval."""
if combination["retrieval_method"] == "window":
return combination["chunk_size"] >= 1024
return True
search_space = AI4RAGSearchSpace(
params=[
# ... parameters
],
rules=[my_custom_rule] # Add custom rules here
)
Custom rule requirements:
- Function signature: def rule_name(combination: dict) -> bool
- Return True if the combination is valid, False to filter it out
- combination is a dict with all parameter names as keys
Example use cases:
- Domain-specific constraints (e.g., "small models need smaller chunks")
- Cost constraints (e.g., "don't use expensive models with large retrieval")
- Performance requirements (e.g., "window retrieval only with chunk_size > 512")
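Applying a rule by hand shows what the search space does internally when filtering. A plain-Python sketch using the same rule as above:

```python
from itertools import product

def my_custom_rule(combination: dict) -> bool:
    """Only allow large chunks with window retrieval (same rule as above)."""
    if combination["retrieval_method"] == "window":
        return combination["chunk_size"] >= 1024
    return True

# Enumerate a small grid, then keep only combinations the rule accepts.
grid = [
    {"retrieval_method": m, "chunk_size": s}
    for m, s in product(["simple", "window"], [512, 1024, 2048])
]
valid = [c for c in grid if my_custom_rule(c)]
print(len(grid), len(valid))  # 6 5 (window + 512 is filtered out)
```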
Search Space Size¶
The max_combinations property tells you how many valid configurations exist:
search_space = AI4RAGSearchSpace(
params=[
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 2048]),
Parameter(name="chunk_overlap", param_type="C", values=[64, 128, 256]),
Parameter(name="retrieval_method", param_type="C", values=["simple", "window"]),
# ... other params
]
)
print(f"Search space size: {search_space.max_combinations}")
# Output: Search space size: 42 (after validation rules filter invalid combos)
Estimating Optimizer Settings
Use max_combinations to guide your optimizer settings:
- For spaces with <20 combinations: max_evals=10
- For spaces with 20-100 combinations: max_evals=15-25
- For spaces with 100+ combinations: max_evals=30-50
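That sizing guidance can be encoded as a small helper (illustrative only, not an ai4rag API):

```python
def suggested_max_evals(n_combinations: int) -> int:
    """Sketch of the sizing guidance above: pick max_evals from the
    number of valid combinations reported by max_combinations."""
    if n_combinations < 20:
        return min(10, n_combinations)
    if n_combinations <= 100:
        return 20   # midpoint of the suggested 15-25 band
    return 40       # midpoint of the suggested 30-50 band

print(suggested_max_evals(18))   # 10
print(suggested_max_evals(42))   # 20
print(suggested_max_evals(500))  # 40
```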
Code Examples¶
Example 1: Minimal Search Space¶
Optimize chunking and retrieval with fixed models:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
from ai4rag.rag.foundation_models.llama_stack import LSFoundationModel
from ai4rag.rag.embedding.llama_stack import LSEmbeddingModel
search_space = AI4RAGSearchSpace(
params=[
# Required: models (not optimized, just fixed)
Parameter(
name="foundation_model",
param_type="C",
values=[LSFoundationModel(model_id="ollama/llama3.2:3b", client=client)]
),
Parameter(
name="embedding_model",
param_type="C",
values=[
LSEmbeddingModel(
model_id="ollama/nomic-embed-text:latest",
client=client,
params={"embedding_dimension": 768, "context_length": 8192}
)
]
),
# Optimize chunking
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 2048]),
Parameter(name="chunk_overlap", param_type="C", values=[64, 128, 256]),
# Optimize retrieval
Parameter(name="number_of_chunks", param_type="C", values=[3, 5, 7, 10]),
]
)
# Other parameters (retrieval_method, window_size, etc.) will use defaults
Example 2: Comprehensive Search Space¶
Explore chunking methods, retrieval strategies, and hybrid search:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
search_space = AI4RAGSearchSpace(
params=[
# Models
Parameter(name="foundation_model", param_type="C", values=[model]),
Parameter(name="embedding_model", param_type="C", values=[embedding]),
# Chunking
Parameter(name="chunking_method", param_type="C", values=["recursive", "markdown"]),
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 1536]),
Parameter(name="chunk_overlap", param_type="C", values=[64, 128, 200]),
# Retrieval
Parameter(name="retrieval_method", param_type="C", values=["simple", "window"]),
Parameter(name="window_size", param_type="C", values=[0, 1, 3, 5]),
Parameter(name="number_of_chunks", param_type="C", values=[5, 7, 10]),
# Hybrid search
Parameter(name="search_mode", param_type="C", values=["vector", "hybrid"]),
Parameter(name="ranker_strategy", param_type="C", values=["", "rrf", "weighted"]),
Parameter(name="ranker_k", param_type="C", values=[0, 30, 60, 100]),
Parameter(name="ranker_alpha", param_type="C", values=[1, 0.3, 0.5, 0.7]),
],
vector_store_type="ls_milvus" # Required for hybrid search
)
Example 3: Model Comparison¶
Compare different foundation models with fixed parameters:
search_space = AI4RAGSearchSpace(
params=[
# Optimize model selection
Parameter(
name="foundation_model",
param_type="C",
values=[
LSFoundationModel(model_id="ollama/llama3.2:3b", client=client),
LSFoundationModel(model_id="ollama/llama3.1:8b", client=client),
LSFoundationModel(model_id="ollama/mistral:7b", client=client),
]
),
# Fixed embedding and parameters
Parameter(name="embedding_model", param_type="C", values=[embedding]),
Parameter(name="chunk_size", param_type="C", values=[1024]),
Parameter(name="chunk_overlap", param_type="C", values=[128]),
Parameter(name="number_of_chunks", param_type="C", values=[5]),
]
)
Example 4: Custom Rule for Budget Constraints¶
Prevent expensive model + large retrieval combinations:
def budget_constraint(combination: dict) -> bool:
"""Don't use large models with high retrieval counts."""
model_id = str(combination["foundation_model"])
num_chunks = combination["number_of_chunks"]
# Large models (8b+) should use fewer chunks
if "8b" in model_id or "13b" in model_id:
return num_chunks <= 5
return True
search_space = AI4RAGSearchSpace(
params=[
Parameter(
name="foundation_model",
param_type="C",
values=[
LSFoundationModel(model_id="ollama/llama3.2:3b", client=client),
LSFoundationModel(model_id="ollama/llama3.1:8b", client=client),
]
),
Parameter(name="embedding_model", param_type="C", values=[embedding]),
Parameter(name="number_of_chunks", param_type="C", values=[3, 5, 7, 10]),
],
rules=[budget_constraint]
)
Best Practices¶
1. Start Small, Expand Gradually¶
Begin with a narrow search space to quickly find a baseline:
# Iteration 1: Minimal space
search_space = AI4RAGSearchSpace(
params=[
# ... models
Parameter(name="chunk_size", param_type="C", values=[1024]), # Fixed
Parameter(name="number_of_chunks", param_type="C", values=[5, 10]), # Optimize
]
)
Once you understand which parameters matter, expand:
# Iteration 2: Expanded space
search_space = AI4RAGSearchSpace(
params=[
# ... models
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 2048]), # Now optimize
Parameter(name="chunk_overlap", param_type="C", values=[64, 128]), # Add overlap
Parameter(name="number_of_chunks", param_type="C", values=[5, 7, 10]), # Expand
]
)
2. Use Categorical for Most Parameters¶
Even for numeric values, prefer Categorical over Integer ranges:
# Preferred
Parameter(name="chunk_size", param_type="C", values=[200, 400, 800, 1600])
# Avoid (creates 1401 values!)
Parameter(name="chunk_size", param_type="I", v_min=200, v_max=1600)
3. Respect Validation Rules¶
Check your search space size to ensure rules aren't over-filtering:
search_space = AI4RAGSearchSpace(params=[...])
print(f"Valid combinations: {search_space.max_combinations}")
# If this is 0 or suspiciously low, check your parameter values
4. Avoid Overly Large Spaces¶
Search space grows multiplicatively. With 5 parameters, each with 4 values, you get 4^5 = 1,024 raw combinations before validation rules are applied.
Keep individual parameter value counts reasonable (3-5 values per parameter).
5. Document Your Search Space¶
Add comments explaining your choices:
search_space = AI4RAGSearchSpace(
params=[
# Using 3b model for speed; 8b is too slow for our budget
Parameter(name="foundation_model", param_type="C", values=[model_3b]),
# Chunk sizes optimized for technical documentation (shorter paragraphs)
Parameter(name="chunk_size", param_type="C", values=[256, 512, 1024]),
# Higher retrieval counts since documents are highly fragmented
Parameter(name="number_of_chunks", param_type="C", values=[7, 10, 15]),
]
)
Troubleshooting¶
Error: "Missing required parameters"¶
Cause: You didn't provide foundation_model or embedding_model.
Solution: Always include both required parameters:
search_space = AI4RAGSearchSpace(
params=[
Parameter(name="foundation_model", param_type="C", values=[foundation_model]),
Parameter(name="embedding_model", param_type="C", values=[embedding_model]),
# ... other params
]
)
Error: "Not supported parameters were given"¶
Cause: You provided a parameter name that's not in AI4RAGParamNames.
Solution: Check the parameter name spelling and consult ai4rag.utils.constants.AI4RAGParamNames for valid names.
Search Space Size Is Zero¶
Cause: Validation rules are filtering out all combinations.
Solution:
- Check parameter value combinations manually
- Verify chunk_size > 2 * chunk_overlap for all combinations
- Ensure retrieval_method and window_size are consistent
- For hybrid search, verify ranker parameter sentinels
Optimizer Says "max_evals exceeded combinations"¶
Cause: You set max_evals higher than your search space size.
Solution: Reduce max_evals or expand your search space:
search_space = AI4RAGSearchSpace(params=[...])
print(f"Max combinations: {search_space.max_combinations}")
# Set max_evals <= max_combinations
optimizer_settings = GAMOptSettings(max_evals=15)
Related Topics¶
- Optimizers: How optimizers explore the search space
- Evaluation: Metrics used to score configurations
- Hybrid Search: Configuring hybrid search parameters
- Quick Start: Complete search space examples
Summary¶
Search spaces in ai4rag:
- Define parameters: Specify what to optimize and possible values
- Four types: Categorical ("C"), Integer ("I"), Real ("R"), Boolean ("B")
- Required: foundation_model and embedding_model must always be provided
- Defaults: Sensible defaults for chunking, retrieval, and search parameters
- Validation rules: Built-in rules filter invalid combinations automatically
- Sentinel values: "", 0, and 1 indicate unused parameters
- Custom rules: Add domain-specific constraints with custom validation functions
Start with a minimal search space, optimize the most impactful parameters first (chunking and retrieval), then expand to explore advanced features like hybrid search and model comparison.