Search Space¶
The search space defines all possible RAG configurations that the optimizer can explore. It specifies which parameters to optimize, what values they can take, and the rules that govern valid combinations.
What Is a Search Space?¶
A search space is a collection of parameters that define a RAG configuration:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
search_space = AI4RAGSearchSpace(
params=[
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 2048]),
Parameter(name="chunk_overlap", param_type="C", values=[64, 128, 256]),
Parameter(name="retrieval_method", param_type="C", values=["simple", "window"]),
]
)
This search space has 3 × 3 × 2 = 18 possible combinations (before validation rules).
The optimizer explores this space to find the combination that maximizes your chosen evaluation metric.
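The raw grid above can be sketched in plain Python (independent of ai4rag) to see where the 18 comes from:

```python
from itertools import product

# Plain-Python sketch of the raw grid (before validation rules);
# the value lists mirror the search space defined in this section.
values = {
    "chunk_size": [512, 1024, 2048],
    "chunk_overlap": [64, 128, 256],
    "retrieval_method": ["simple", "window"],
}

combinations = [
    dict(zip(values, combo)) for combo in product(*values.values())
]
print(len(combinations))  # 3 * 3 * 2 = 18
```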
Parameter Types¶
ai4rag supports four parameter types:
| Type | Code | Description | Example |
|---|---|---|---|
| Categorical | "C" | Discrete set of values (strings, objects) | ["recursive", "markdown"] |
| Integer | "I" | Integer range with min/max | v_min=100, v_max=1000 |
| Real | "R" | Continuous range (float) | v_min=0.0, v_max=1.0 |
| Boolean | "B" | True or False | values=[True, False] |
Categorical Parameters ("C")¶
Define a discrete set of possible values.
Common use cases:
- Model selection
- Method choices (chunking method, retrieval method, ranker strategy)
- Discrete numeric values (chunk sizes, number of chunks)
Example 1: String values
Parameter(
name="chunking_method",
param_type="C",
values=["recursive", "markdown", "markdown_header"]
)
Example 2: Numeric values
Parameter(
    name="number_of_chunks",
    param_type="C",
    values=[3, 5, 10]
)
Example 3: Model objects
from ai4rag.rag.foundation_models.llama_stack import LSFoundationModel
Parameter(
name="foundation_model",
param_type="C",
values=[
LSFoundationModel(model_id="ollama/llama3.2:3b", client=client),
LSFoundationModel(model_id="ollama/llama3.1:8b", client=client),
]
)
Categorical for Discrete Numerics
Even for numeric parameters like chunk_size, use Categorical ("C") when you want to test specific values rather than a continuous range. This gives you more control over which values are tested.
Integer Parameters ("I")¶
Define an integer range with minimum and maximum bounds.
Syntax:
Parameter(
name="parameter_name",
param_type="I",
v_min=100, # Minimum value (inclusive)
v_max=1000 # Maximum value (inclusive)
)
Example:
Parameter(
name="chunk_size",
param_type="I",
v_min=200,
v_max=2048
)
# Generates: [200, 201, 202, ..., 2048]
Large Integer Ranges
Be cautious with large ranges. v_min=100, v_max=5000 creates 4,901 possible values, which multiplies the search space size dramatically when combined with other parameters. Consider using Categorical with specific values instead.
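The arithmetic behind that warning, as a quick plain-Python check:

```python
# An inclusive integer range v_min=100, v_max=5000 enumerates 4,901 values.
range_size = 5000 - 100 + 1
print(range_size)  # 4901

# Combined with even modest categorical parameters the grid multiplies fast,
# e.g. 3 chunk_overlap values x 2 retrieval methods:
print(range_size * 3 * 2)  # 29406
```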
Real Parameters ("R")¶
Define a continuous floating-point range.
Syntax:
Parameter(
    name="parameter_name",
    param_type="R",
    v_min=0.0,  # Minimum value (inclusive)
    v_max=1.0   # Maximum value (inclusive)
)
Real Type Not Fully Supported
Currently, Real parameters cannot be enumerated (no .all_values() method). For practical optimization, use Categorical with discrete float values instead:
Parameter(
    name="ranker_alpha",
    param_type="C",
    values=[0.3, 0.5, 0.7]
)
Boolean Parameters ("B")¶
True/False parameters.
Syntax:
Parameter(
    name="parameter_name",
    param_type="B",
    values=[True, False]
)
Required Parameters¶
Two parameters are always required in an AI4RAGSearchSpace:
1. foundation_model¶
The LLM used for text generation.
Example (Llama Stack):
from ai4rag.rag.foundation_models.llama_stack import LSFoundationModel
Parameter(
name="foundation_model",
param_type="C",
values=[
LSFoundationModel(model_id="ollama/llama3.2:3b", client=client)
]
)
Example (OpenAI-compatible):
from ai4rag.rag.foundation_models.openai_model import OpenAIFoundationModel
from openai import OpenAI
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
Parameter(
name="foundation_model",
param_type="C",
values=[
OpenAIFoundationModel(model_id="gpt-4o-mini", client=client, params={})
]
)
2. embedding_model¶
The model used for generating document and query embeddings.
Example (Llama Stack):
from ai4rag.rag.embedding.llama_stack import LSEmbeddingModel
Parameter(
name="embedding_model",
param_type="C",
values=[
LSEmbeddingModel(
model_id="ollama/nomic-embed-text:latest",
client=client,
params={"embedding_dimension": 768, "context_length": 8192}
)
]
)
Example (OpenAI-compatible):
from ai4rag.rag.embedding.openai_model import OpenAIEmbeddingModel
from openai import OpenAI
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
Parameter(
name="embedding_model",
param_type="C",
values=[
OpenAIEmbeddingModel(
model_id="text-embedding-3-small",
client=client,
params={"embedding_dimension": 1536, "context_length": 8191}
)
]
)
Embedding Model params
The params dict should include:
- embedding_dimension: Vector size (e.g., 768, 1536)
- context_length: Maximum tokens the model can process (used for validation)
Default Parameters¶
If you don't specify certain parameters, AI4RAGSearchSpace uses sensible defaults. These defaults differ slightly between ChromaDB and Llama Stack vector stores.
Default Values¶
| Parameter | Default (Llama Stack) | Default (ChromaDB) | Type |
|---|---|---|---|
| chunking_method | ("recursive",) | ("recursive",) | Categorical |
| chunk_size | (1024, 2048) | (1024, 2048) | Categorical |
| chunk_overlap | (128, 256) | (128, 256) | Categorical |
| retrieval_method | ("simple",) | ("simple", "window") | Categorical |
| window_size | (0,) | (0, 1, 3, 5) | Categorical |
| number_of_chunks | (3, 5, 10) | (3, 5, 10) | Categorical |
| search_mode | ("vector", "hybrid") | ("vector",) | Categorical |
| ranker_strategy | ("", "rrf", "weighted") | N/A | Categorical |
| ranker_k | (0, 60) | N/A | Categorical |
| ranker_alpha | (1, 0.5) | N/A | Categorical |
Why Different Defaults?
- ChromaDB doesn't support hybrid search, so search_mode is fixed to "vector" and ranker parameters are excluded
- ChromaDB defaults include window retrieval options since it's an in-memory store (faster experimentation)
- Llama Stack defaults focus on simple retrieval but include hybrid search exploration
Overriding Defaults¶
User-provided parameters always override defaults:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
search_space = AI4RAGSearchSpace(
params=[
# Required parameters
Parameter(name="foundation_model", param_type="C", values=[model]),
Parameter(name="embedding_model", param_type="C", values=[embedding]),
# Override default chunk_size
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 1536]),
# chunk_overlap, retrieval_method, etc. will use defaults
]
)
Validation Rules¶
AI4RAGSearchSpace enforces built-in validation rules to filter out invalid parameter combinations.
Rule 1: Chunk Size Greater Than Chunk Overlap¶
Rule: chunk_size > 2 * chunk_overlap
Why: Text chunkers need enough non-overlapping content to create meaningful chunks.
Example:
# Valid
{"chunk_size": 1024, "chunk_overlap": 256} # 1024 > 2*256 = 512 ✓
# Invalid (filtered out)
{"chunk_size": 512, "chunk_overlap": 300} # 512 > 2*300 = 600 ✗
Exception: chunk_size = 0
When chunk_size is 0 (used for markdown_header structural-only splitting), this rule is skipped.
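The rule, including the chunk_size == 0 exception, can be sketched as a plain predicate (the function name is illustrative, not part of the ai4rag API):

```python
def chunk_sizes_valid(chunk_size: int, chunk_overlap: int) -> bool:
    """Sketch of Rule 1: chunk_size must exceed twice the overlap,
    except when chunk_size == 0 (markdown_header structural splitting)."""
    if chunk_size == 0:
        return True  # rule skipped for structural-only splitting
    return chunk_size > 2 * chunk_overlap

print(chunk_sizes_valid(1024, 256))  # True:  1024 > 512
print(chunk_sizes_valid(512, 300))   # False: 512 <= 600
print(chunk_sizes_valid(0, 128))     # True:  exception applies
```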
Rule 2: Retrieval Method and Window Size Consistency¶
Rule:
- When retrieval_method == "simple", window_size must be 0
- When retrieval_method == "window", window_size must be > 0
Why: Window retrieval requires a non-zero window; simple retrieval doesn't use windows.
Example:
# Valid
{"retrieval_method": "simple", "window_size": 0} ✓
{"retrieval_method": "window", "window_size": 3} ✓
# Invalid (filtered out)
{"retrieval_method": "simple", "window_size": 2} ✗
{"retrieval_method": "window", "window_size": 0} ✗
Rule 3: Chunk Size Within Embedding Context Length¶
Rule: Estimated token count of chunk_size must fit within the embedding model's context_length.
How it works: Uses a conservative ratio of 3.6 characters per token to estimate tokens:
estimated_tokens = chunk_size / 3.6
# Example
chunk_size = 1024
estimated_tokens = 1024 / 3.6 ≈ 284 tokens
# Check: 284 <= context_length (e.g., 8192) ✓
Why: If chunks exceed the embedding model's context window, embedding generation will fail or be truncated.
Example:
# Embedding model with context_length = 512
embedding = LSEmbeddingModel(
model_id="small-embedder",
params={"context_length": 512, "embedding_dimension": 384}
)
# Valid
{"chunk_size": 1024, "embedding_model": embedding} # ~284 tokens ✓
# Invalid (filtered out)
{"chunk_size": 2048, "embedding_model": embedding} # ~569 tokens ✗ (exceeds 512)
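The same check as a standalone sketch (the function name is illustrative, not the ai4rag implementation):

```python
def fits_embedding_context(chunk_size: int, context_length: int) -> bool:
    """Sketch of Rule 3: estimate tokens at 3.6 characters per token
    and require the estimate to fit in the embedding context window."""
    estimated_tokens = chunk_size / 3.6
    return estimated_tokens <= context_length

print(fits_embedding_context(1024, 512))  # True  (~284 tokens)
print(fits_embedding_context(2048, 512))  # False (~569 tokens)
```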
Rule 4: Hybrid Search Ranker Consistency¶
Rule: Ranker parameters must only be set when search_mode == "hybrid".
Sub-rules:
- When search_mode == "vector":
    - ranker_strategy must be "" (empty string)
    - ranker_k must be 0
    - ranker_alpha must be 1 (sentinel for vector-only)
- When search_mode == "hybrid":
    - ranker_strategy must be a non-empty string ("rrf", "weighted", or "normalized")
Example:
# Valid: vector mode
{"search_mode": "vector", "ranker_strategy": "", "ranker_k": 0, "ranker_alpha": 1} ✓
# Valid: hybrid mode
{"search_mode": "hybrid", "ranker_strategy": "rrf", "ranker_k": 60, "ranker_alpha": 1} ✓
# Invalid (filtered out)
{"search_mode": "vector", "ranker_strategy": "rrf", "ranker_k": 60, "ranker_alpha": 1} ✗
Rule 5: ranker_k Only for RRF¶
Rule:
- When ranker_strategy == "rrf", ranker_k must be > 0
- When ranker_strategy != "rrf", ranker_k must be 0 (sentinel)
Example:
# Valid
{"ranker_strategy": "rrf", "ranker_k": 60} ✓
{"ranker_strategy": "weighted", "ranker_k": 0} ✓
# Invalid (filtered out)
{"ranker_strategy": "rrf", "ranker_k": 0} ✗
{"ranker_strategy": "weighted", "ranker_k": 60} ✗
Rule 6: ranker_alpha Only for Weighted¶
Rule:
- When ranker_strategy == "weighted", ranker_alpha must be != 1 (valid range: 0 to <1)
- When ranker_strategy != "weighted", ranker_alpha must be 1 (sentinel for 100% dense / vector-only)
Example:
# Valid
{"ranker_strategy": "weighted", "ranker_alpha": 0.7} ✓
{"ranker_strategy": "rrf", "ranker_alpha": 1} ✓
# Invalid (filtered out)
{"ranker_strategy": "weighted", "ranker_alpha": 1} ✗
{"ranker_strategy": "rrf", "ranker_alpha": 0.5} ✗
Sentinel Values¶
Sentinel values are special placeholder values used to indicate "not applicable" for optional parameters:
| Parameter | Sentinel Value | Meaning |
|---|---|---|
| ranker_strategy | "" (empty string) | No ranker (vector-only search) |
| ranker_k | 0 | Parameter not used |
| ranker_alpha | 1 | 100% dense / vector-only (not applicable) |
When search_mode == "vector", all ranker parameters must be sentinels to indicate they're unused.
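Rules 4-6 together amount to a single consistency check over the ranker parameters. A hedged plain-Python sketch (not the ai4rag implementation):

```python
def ranker_params_consistent(combo: dict) -> bool:
    """Sketch of Rules 4-6: ranker parameters must be sentinels in
    vector mode, and internally consistent in hybrid mode."""
    strategy = combo["ranker_strategy"]
    k, alpha = combo["ranker_k"], combo["ranker_alpha"]

    if combo["search_mode"] == "vector":
        # Rule 4: all ranker parameters must be sentinels
        return strategy == "" and k == 0 and alpha == 1

    # Hybrid mode: a strategy must be set (Rule 4)
    if strategy == "":
        return False
    # Rule 5: ranker_k is > 0 exactly when the strategy is "rrf"
    if (strategy == "rrf") != (k > 0):
        return False
    # Rule 6: ranker_alpha differs from 1 exactly when the strategy is "weighted"
    if (strategy == "weighted") != (alpha != 1):
        return False
    return True

print(ranker_params_consistent(
    {"search_mode": "hybrid", "ranker_strategy": "rrf",
     "ranker_k": 60, "ranker_alpha": 1}))  # True
print(ranker_params_consistent(
    {"search_mode": "vector", "ranker_strategy": "rrf",
     "ranker_k": 60, "ranker_alpha": 1}))  # False
```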
Custom Validation Rules¶
You can add custom rules beyond the built-in ones:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
def my_custom_rule(combination: dict) -> bool:
"""Only allow large chunks with window retrieval."""
if combination["retrieval_method"] == "window":
return combination["chunk_size"] >= 1024
return True
search_space = AI4RAGSearchSpace(
params=[
# ... parameters
],
rules=[my_custom_rule] # Add custom rules here
)
Custom rule requirements:
- Function signature: def rule_name(combination: dict) -> bool
- Return True if the combination is valid, False to filter it out
- combination is a dict with all parameter names as keys
Example use cases:
- Domain-specific constraints (e.g., "small models need smaller chunks")
- Cost constraints (e.g., "don't use expensive models with large retrieval")
- Performance requirements (e.g., "window retrieval only with chunk_size > 512")
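Applying a rule by hand shows what the search space does internally when filtering. A plain-Python sketch using the same rule as above:

```python
from itertools import product

def my_custom_rule(combination: dict) -> bool:
    """Only allow large chunks with window retrieval (same rule as above)."""
    if combination["retrieval_method"] == "window":
        return combination["chunk_size"] >= 1024
    return True

# Enumerate a small grid, then keep only combinations the rule accepts.
grid = [
    {"retrieval_method": m, "chunk_size": s}
    for m, s in product(["simple", "window"], [512, 1024, 2048])
]
valid = [c for c in grid if my_custom_rule(c)]
print(len(grid), len(valid))  # 6 5 (window + 512 is filtered out)
```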
Search Space Size¶
The max_combinations property tells you how many valid configurations exist:
search_space = AI4RAGSearchSpace(
params=[
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 2048]),
Parameter(name="chunk_overlap", param_type="C", values=[64, 128, 256]),
Parameter(name="retrieval_method", param_type="C", values=["simple", "window"]),
# ... other params
]
)
print(f"Search space size: {search_space.max_combinations}")
# Output: Search space size: 42 (after validation rules filter invalid combos)
Estimating Optimizer Settings
Use max_combinations to guide your optimizer settings:
- For spaces with <20 combinations: max_evals=10
- For spaces with 20-100 combinations: max_evals=15-25
- For spaces with 100+ combinations: max_evals=30-50
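That sizing guidance can be encoded as a small helper (illustrative only, not an ai4rag API):

```python
def suggested_max_evals(n_combinations: int) -> int:
    """Sketch of the sizing guidance above: pick max_evals from the
    number of valid combinations reported by max_combinations."""
    if n_combinations < 20:
        return min(10, n_combinations)
    if n_combinations <= 100:
        return 20   # midpoint of the suggested 15-25 band
    return 40       # midpoint of the suggested 30-50 band

print(suggested_max_evals(18))   # 10
print(suggested_max_evals(42))   # 20
print(suggested_max_evals(500))  # 40
```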
Code Examples¶
Example 1: Minimal Search Space¶
Optimize chunking and retrieval with fixed models:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
from ai4rag.rag.foundation_models.llama_stack import LSFoundationModel
from ai4rag.rag.embedding.llama_stack import LSEmbeddingModel
search_space = AI4RAGSearchSpace(
params=[
# Required: models (not optimized, just fixed)
Parameter(
name="foundation_model",
param_type="C",
values=[LSFoundationModel(model_id="ollama/llama3.2:3b", client=client)]
),
Parameter(
name="embedding_model",
param_type="C",
values=[
LSEmbeddingModel(
model_id="ollama/nomic-embed-text:latest",
client=client,
params={"embedding_dimension": 768, "context_length": 8192}
)
]
),
# Optimize chunking
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 2048]),
Parameter(name="chunk_overlap", param_type="C", values=[64, 128, 256]),
# Optimize retrieval
Parameter(name="number_of_chunks", param_type="C", values=[3, 5, 7, 10]),
]
)
# Other parameters (retrieval_method, window_size, etc.) will use defaults
Example 2: Comprehensive Search Space¶
Explore chunking methods, retrieval strategies, and hybrid search:
from ai4rag.search_space.src.search_space import AI4RAGSearchSpace
from ai4rag.search_space.src.parameter import Parameter
search_space = AI4RAGSearchSpace(
params=[
# Models
Parameter(name="foundation_model", param_type="C", values=[model]),
Parameter(name="embedding_model", param_type="C", values=[embedding]),
# Chunking
Parameter(name="chunking_method", param_type="C", values=["recursive", "markdown"]),
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 1536]),
Parameter(name="chunk_overlap", param_type="C", values=[64, 128, 200]),
# Retrieval
Parameter(name="retrieval_method", param_type="C", values=["simple", "window"]),
Parameter(name="window_size", param_type="C", values=[0, 1, 3, 5]),
Parameter(name="number_of_chunks", param_type="C", values=[5, 7, 10]),
# Hybrid search
Parameter(name="search_mode", param_type="C", values=["vector", "hybrid"]),
Parameter(name="ranker_strategy", param_type="C", values=["", "rrf", "weighted"]),
Parameter(name="ranker_k", param_type="C", values=[0, 30, 60, 100]),
Parameter(name="ranker_alpha", param_type="C", values=[1, 0.3, 0.5, 0.7]),
],
vector_store_type="ls_milvus" # Required for hybrid search
)
Example 3: Model Comparison¶
Compare different foundation models with fixed parameters:
search_space = AI4RAGSearchSpace(
params=[
# Optimize model selection
Parameter(
name="foundation_model",
param_type="C",
values=[
LSFoundationModel(model_id="ollama/llama3.2:3b", client=client),
LSFoundationModel(model_id="ollama/llama3.1:8b", client=client),
LSFoundationModel(model_id="ollama/mistral:7b", client=client),
]
),
# Fixed embedding and parameters
Parameter(name="embedding_model", param_type="C", values=[embedding]),
Parameter(name="chunk_size", param_type="C", values=[1024]),
Parameter(name="chunk_overlap", param_type="C", values=[128]),
Parameter(name="number_of_chunks", param_type="C", values=[5]),
]
)
Example 4: Custom Rule for Budget Constraints¶
Prevent expensive model + large retrieval combinations:
def budget_constraint(combination: dict) -> bool:
"""Don't use large models with high retrieval counts."""
model_id = str(combination["foundation_model"])
num_chunks = combination["number_of_chunks"]
# Large models (8b+) should use fewer chunks
if "8b" in model_id or "13b" in model_id:
return num_chunks <= 5
return True
search_space = AI4RAGSearchSpace(
params=[
Parameter(
name="foundation_model",
param_type="C",
values=[
LSFoundationModel(model_id="ollama/llama3.2:3b", client=client),
LSFoundationModel(model_id="ollama/llama3.1:8b", client=client),
]
),
Parameter(name="embedding_model", param_type="C", values=[embedding]),
Parameter(name="number_of_chunks", param_type="C", values=[3, 5, 7, 10]),
],
rules=[budget_constraint]
)
Best Practices¶
1. Start Small, Expand Gradually¶
Begin with a narrow search space to quickly find a baseline:
# Iteration 1: Minimal space
search_space = AI4RAGSearchSpace(
params=[
# ... models
Parameter(name="chunk_size", param_type="C", values=[1024]), # Fixed
Parameter(name="number_of_chunks", param_type="C", values=[5, 10]), # Optimize
]
)
Once you understand which parameters matter, expand:
# Iteration 2: Expanded space
search_space = AI4RAGSearchSpace(
params=[
# ... models
Parameter(name="chunk_size", param_type="C", values=[512, 1024, 2048]), # Now optimize
Parameter(name="chunk_overlap", param_type="C", values=[64, 128]), # Add overlap
Parameter(name="number_of_chunks", param_type="C", values=[5, 7, 10]), # Expand
]
)
2. Use Categorical for Most Parameters¶
Even for numeric values, prefer Categorical over Integer ranges:
# Preferred
Parameter(name="chunk_size", param_type="C", values=[200, 400, 800, 1600])
# Avoid (creates 1401 values!)
Parameter(name="chunk_size", param_type="I", v_min=200, v_max=1600)
3. Respect Validation Rules¶
Check your search space size to ensure rules aren't over-filtering:
search_space = AI4RAGSearchSpace(params=[...])
print(f"Valid combinations: {search_space.max_combinations}")
# If this is 0 or suspiciously low, check your parameter values
4. Avoid Overly Large Spaces¶
Search space grows multiplicatively. With 5 parameters, each with 4 values, you get 4^5 = 1,024 raw combinations before validation rules are applied.
Keep individual parameter value counts reasonable (3-5 values per parameter).
5. Document Your Search Space¶
Add comments explaining your choices:
search_space = AI4RAGSearchSpace(
params=[
# Using 3b model for speed; 8b is too slow for our budget
Parameter(name="foundation_model", param_type="C", values=[model_3b]),
# Chunk sizes optimized for technical documentation (shorter paragraphs)
Parameter(name="chunk_size", param_type="C", values=[256, 512, 1024]),
# Higher retrieval counts since documents are highly fragmented
Parameter(name="number_of_chunks", param_type="C", values=[7, 10, 15]),
]
)
Troubleshooting¶
Error: "Missing required parameters"¶
Cause: You didn't provide foundation_model or embedding_model.
Solution: Always include both required parameters:
search_space = AI4RAGSearchSpace(
params=[
Parameter(name="foundation_model", param_type="C", values=[foundation_model]),
Parameter(name="embedding_model", param_type="C", values=[embedding_model]),
# ... other params
]
)
Error: "Not supported parameters were given"¶
Cause: You provided a parameter name that's not in AI4RAGParamNames.
Solution: Check the parameter name spelling and consult ai4rag.utils.constants.AI4RAGParamNames for valid names.
Search Space Size Is Zero¶
Cause: Validation rules are filtering out all combinations.
Solution:
- Check parameter value combinations manually
- Verify chunk_size > 2 * chunk_overlap for all combinations
- Ensure retrieval_method and window_size are consistent
- For hybrid search, verify ranker parameter sentinels
Optimizer Says "max_evals exceeded combinations"¶
Cause: You set max_evals higher than your search space size.
Solution: Reduce max_evals or expand your search space:
search_space = AI4RAGSearchSpace(params=[...])
print(f"Max combinations: {search_space.max_combinations}")
# Set max_evals <= max_combinations
optimizer_settings = GAMOptSettings(max_evals=15)
Related Topics¶
- Optimizers: How optimizers explore the search space
- Evaluation: Metrics used to score configurations
- Hybrid Search: Configuring hybrid search parameters
- Quick Start: Complete search space examples
Summary¶
Search spaces in ai4rag:
- Define parameters: Specify what to optimize and possible values
- Four types: Categorical ("C"), Integer ("I"), Real ("R"), Boolean ("B")
- Required: foundation_model and embedding_model must always be provided
- Defaults: Sensible defaults for chunking, retrieval, and search parameters
- Validation rules: Built-in rules filter invalid combinations automatically
- Sentinel values: "", 0, and 1 indicate unused parameters
- Custom rules: Add domain-specific constraints with custom validation functions
Start with a minimal search space, optimize the most impactful parameters first (chunking and retrieval), then expand to explore advanced features like hybrid search and model comparison.