The Optimizer class finds Pareto-optimal RAG configurations using Multi-Objective Bayesian Optimization, balancing cost, latency, and quality.
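To make "Pareto-optimal" concrete: a configuration is on the Pareto front if no other configuration is at least as good on every objective and strictly better on one. The sketch below (plain Python, not part of rag_opt) illustrates this for (cost, latency, quality) triples, where lower cost and latency and higher quality are better.

```python
def dominates(a, b):
    """True if config `a` dominates config `b` on (cost, latency, quality)."""
    cost_a, lat_a, qual_a = a
    cost_b, lat_b, qual_b = b
    no_worse = cost_a <= cost_b and lat_a <= lat_b and qual_a >= qual_b
    strictly_better = cost_a < cost_b or lat_a < lat_b or qual_a > qual_b
    return no_worse and strictly_better

def pareto_front(configs):
    """Keep only configurations not dominated by any other."""
    return [c for c in configs
            if not any(dominates(other, c) for other in configs if other != c)]

configs = [
    (0.002, 1.2, 0.81),  # cheap, slower, decent quality
    (0.010, 0.4, 0.86),  # pricier, fast, best quality
    (0.012, 1.5, 0.70),  # dominated: worse than the second on every axis
]
print(pareto_front(configs))  # the dominated third config is filtered out
```

Neither of the first two configurations dominates the other (one is cheaper, the other faster and higher quality), so both survive; the optimizer returns such a front rather than a single "best" point.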

Quick Start

from rag_opt.optimizer import Optimizer
from rag_opt.dataset import TrainDataset

# Load dataset and run optimization
dataset = TrainDataset.from_json("./rag_dataset.json")
optimizer = Optimizer(train_dataset=dataset, config_path="./rag_config.yaml")

# Get best configuration
best_config = optimizer.optimize(n_trials=3, best_one=True)
print(f"LLM: {best_config.llm.model}")
print(f"Embeddings: {best_config.embedding.model}")
print(f"Chunk size: {best_config.chunk_size}")

How It Works

  1. Setup: Loads search space and initializes components
  2. Bootstrap: Generates initial training data (10 samples)
  3. Optimize: Runs Bayesian Optimization loop proposing and evaluating configurations
  4. Return: Best configurations balancing multiple objectives
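The four steps above can be sketched in plain Python. This is a heavily simplified stand-in: random sampling replaces the Bayesian surrogate model, the search space and the scalarized objective are invented for illustration, and a real evaluator would run the RAG pipeline against the training dataset.

```python
import random

# Illustrative search space (rag_opt loads its space from YAML)
SEARCH_SPACE = {"chunk_size": [256, 512, 1024], "top_k": [2, 4, 8]}

def sample_config(rng):
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(config):
    # Toy objective: larger chunks raise quality but also cost.
    cost = config["chunk_size"] * config["top_k"] / 1e4
    quality = 0.5 + 0.0004 * config["chunk_size"]
    return quality - cost  # scalarized score, for simplicity

rng = random.Random(0)
# Bootstrap: generate initial observations
history = [(c, evaluate(c)) for c in (sample_config(rng) for _ in range(10))]
# Optimize: propose and evaluate candidates (random here, Bayesian in rag_opt)
for _ in range(20):
    candidate = sample_config(rng)
    history.append((candidate, evaluate(candidate)))
# Return: best observed configuration
best = max(history, key=lambda pair: pair[1])[0]
print(best)
```

The real optimizer replaces the random proposal step with a surrogate model that directs trials toward promising regions, which is why more trials buy better exploration.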

Configuration

Basic Usage

optimizer = Optimizer(
    train_dataset=dataset,
    config_path="./rag_config.yaml",
    verbose=True
)

# Run optimization (more trials = better exploration)
best_configs = optimizer.optimize(n_trials=3)

Advanced Options

from rag_opt import init_chat_model
from rag_opt.search_space import RAGSearchSpace

# Custom components
search_space = RAGSearchSpace.from_yaml("./custom_config.yaml")
eval_llm = init_chat_model(model="gpt-4", model_provider="openai")

optimizer = Optimizer(
    train_dataset=dataset,
    config_path="./rag_config.yaml",
    search_space=search_space,
    evaluator_llm=eval_llm,
    verbose=True
)

Performance Tips

  1. Start small: Test with n_trials=5, then scale to 50-100 for production
  2. Eager loading: For small search spaces, use eager_load=True in RAGPipelineManager
  3. Parallel evaluation: Adjust max_workers in RAGPipelineManager for faster optimization
  4. Hugging Face: Using local Hugging Face models will slow down the optimization process.
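To see why max_workers matters (tip 3): configuration evaluations are largely I/O-bound (LLM and embedding API calls), so running them concurrently cuts wall-clock time roughly by the worker count. The standard-library sketch below shows the pattern; RAGPipelineManager's internals may differ, and evaluate_config here is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_config(config):
    # Placeholder for a real pipeline evaluation (retrieval + generation)
    return {"config": config, "score": sum(config) / len(config)}

configs = [(256, 2), (512, 4), (1024, 8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    # map preserves input order while evaluating concurrently
    results = list(pool.map(evaluate_config, configs))
print([r["score"] for r in results])  # → [129.0, 258.0, 516.0]
```

For CPU-bound local models (tip 4), threads help far less, which is one reason local Hugging Face models slow the optimization loop.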