The Optimizer class finds Pareto-optimal RAG configurations using Multi-Objective Bayesian Optimization, balancing cost, latency, and quality.
Quick Start
from rag_opt.optimizer import Optimizer
from rag_opt.dataset import TrainDataset
# Load dataset and run optimization
dataset = TrainDataset.from_json("./rag_dataset.json")
optimizer = Optimizer(train_dataset=dataset, config_path="./rag_config.yaml")
# Get best configuration
best_config = optimizer.optimize(n_trials=3, best_one=True)
print(f"LLM: {best_config.llm.model}")
print(f"Embeddings: {best_config.embedding.model}")
print(f"Chunk size: {best_config.chunk_size}")
How It Works
- Setup: Loads search space and initializes components
- Bootstrap: Generates initial training data (10 samples)
- Optimize: Runs a Bayesian Optimization loop, proposing and evaluating configurations (sketched below)
- Return: Best configurations balancing multiple objectives
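The sketch below is illustrative only: it swaps the real surrogate-model acquisition step for random sampling, and sample_config and evaluate are hypothetical stand-ins rather than rag_opt APIs. It shows the shape of the loop and how Pareto dominance over (cost, latency, quality) selects the returned configurations.
import random
# Toy version of the optimize loop. Random proposals stand in for the
# Bayesian acquisition step; sample_config and evaluate are hypothetical.
def sample_config():
    # Stand-in for drawing a candidate from the YAML-defined search space
    return {"chunk_size": random.choice([256, 512, 1024]),
            "top_k": random.choice([3, 5, 10])}
def evaluate(cfg):
    # Stand-in for running the RAG pipeline and scoring it; returns
    # (cost, latency, quality): cost and latency minimized, quality maximized
    return (cfg["chunk_size"] / 1024 + 0.1 * cfg["top_k"],
            0.01 * cfg["chunk_size"] ** 0.5,
            random.random())
def dominates(a, b):
    # a dominates b if it is no worse on every objective and strictly
    # better on at least one (lower cost, lower latency, higher quality)
    no_worse = a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[0] < b[0] or a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better
def pareto_front(results):
    # Keep every configuration whose score no other score dominates
    return [(cfg, score) for cfg, score in results
            if not any(dominates(other, score)
                       for _, other in results if other != score)]
# Bootstrap: evaluate an initial batch of configurations
results = [(cfg, evaluate(cfg)) for cfg in (sample_config() for _ in range(10))]
# Optimize: propose and evaluate new candidates (randomly here, not via BO)
for _ in range(20):
    cfg = sample_config()
    results.append((cfg, evaluate(cfg)))
# Return: the Pareto-optimal set
for cfg, score in pareto_front(results):
    print(cfg, score)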
Configuration
Basic Usage
optimizer = Optimizer(
train_dataset=dataset,
config_path="./rag_config.yaml",
verbose=True
)
# Run optimization (more trials = better exploration)
best_configs = optimizer.optimize(n_trials=3)
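Because best_one=True is not passed here, optimize presumably returns the full Pareto set rather than a single configuration; that reading is inferred from the plural best_configs and the best_one flag in Quick Start, so verify it against your rag_opt version. A sketch of inspecting the results:
# Assumes each returned item exposes the same attributes as best_config
# in Quick Start (llm.model, embedding.model, chunk_size)
for i, cfg in enumerate(best_configs):
    print(f"Config {i}: llm={cfg.llm.model}, "
          f"embeddings={cfg.embedding.model}, chunk_size={cfg.chunk_size}")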
Advanced Options
from rag_opt import init_chat_model
from rag_opt.search_space import RAGSearchSpace
# Custom components
search_space = RAGSearchSpace.from_yaml("./custom_config.yaml")
eval_llm = init_chat_model(model="gpt-4", model_provider="openai")
optimizer = Optimizer(
train_dataset=dataset,
config_path="./rag_config.yaml",
search_space=search_space,
evaluator_llm=eval_llm,
verbose=True
)
Tips
- Start small: Test with n_trials=5, then scale to 50-100 for production
- Eager loading: For small search spaces, use eager_load=True in RAGPipelineManager
- Parallel evaluation: Adjust max_workers in RAGPipelineManager for faster optimization (see the sketch below)
- Hugging Face: Using Hugging Face models will slow down the optimization process
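The following sketch wires those two knobs together; the import path and constructor signature for RAGPipelineManager are assumptions based on the parameter names above, so check them against the library before use.
# Hypothetical sketch: eager_load and max_workers are the parameters named
# in the tips above; the import path and other details are assumptions.
from rag_opt.pipeline import RAGPipelineManager  # import path is an assumption
manager = RAGPipelineManager(
    eager_load=True,  # pre-build all pipelines up front (small search spaces)
    max_workers=4,    # evaluate candidate configurations in parallel
)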