Note: All generation metrics are based on the paper "Faster, Cheaper, Better:
Multi-Objective Hyperparameter Optimization for LLM and RAG
Systems."
Quick Reference
| Metric | Measures | Key Question | Score Range |
|---|---|---|---|
| SafetyMetric | Factual grounding | Is it true? | 0-1 |
| AlignmentMetric | Usefulness & clarity | Is it helpful? | 0-1 |
| ResponseRelevancy | Query-answer match | Does it answer the question? | 0-1 |
SafetyMetric (Faithfulness)
Measures: Whether responses are grounded in the retrieved contexts, which helps catch hallucinations.
Basic Usage
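To make "grounded in retrieved contexts" concrete, here is a toy, library-independent sketch. It is not SafetyMetric's API or its scoring method: `toy_faithfulness`, the word-overlap heuristic, and the 0.8 threshold are all assumptions chosen for illustration. It only shows how a 0-1 grounding score could be read as the fraction of response sentences supported by the contexts.

```python
import re


def toy_faithfulness(response: str, contexts: list[str], threshold: float = 0.8) -> float:
    """Fraction of response sentences whose words mostly appear in the contexts.

    Hypothetical helper for illustration only; not SafetyMetric's implementation.
    """
    # Vocabulary of every word that appears anywhere in the retrieved contexts.
    context_words = set(re.findall(r"[a-z0-9]+", " ".join(contexts).lower()))

    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    if not sentences:
        return 0.0

    supported = 0
    for sentence in sentences:
        words = re.findall(r"[a-z0-9]+", sentence.lower())
        if not words:
            continue  # a sentence with no words counts as unsupported
        overlap = sum(w in context_words for w in words) / len(words)
        if overlap >= threshold:
            supported += 1

    return supported / len(sentences)


contexts = ["The Eiffel Tower is in Paris. It was completed in 1889."]
response = "The Eiffel Tower is in Paris. It was built by aliens."
score = toy_faithfulness(response, contexts)  # one of two sentences is supported
```

A score near 1 means most sentences can be traced back to the contexts; a low score means the response contains material the contexts do not support.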
Configuration
AlignmentMetric (Helpfulness)
Measures: Whether responses are useful, detailed, and clear.
Basic Usage
ResponseRelevancy
Measures: Whether the response actually answers the question asked.
Basic Usage
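As a rough, library-independent illustration of a query-answer match score in the 0-1 range, the sketch below computes bag-of-words cosine similarity between the question and the response. This is a swapped-in toy technique, not how ResponseRelevancy is implemented; `toy_relevancy` is a hypothetical name.

```python
import math
import re
from collections import Counter


def toy_relevancy(question: str, response: str) -> float:
    """Bag-of-words cosine similarity between question and response.

    Hypothetical helper for illustration only; not ResponseRelevancy's method.
    """
    q = Counter(re.findall(r"[a-z0-9]+", question.lower()))
    r = Counter(re.findall(r"[a-z0-9]+", response.lower()))

    # Dot product over shared words; Counter returns 0 for missing keys.
    dot = sum(count * r[word] for word, count in q.items())
    norm = math.sqrt(sum(c * c for c in q.values())) * math.sqrt(
        sum(c * c for c in r.values())
    )
    return dot / norm if norm else 0.0


question = "What is the capital of France?"
on_topic = toy_relevancy(question, "The capital of France is Paris.")
off_topic = toy_relevancy(question, "Bananas are yellow.")
```

Word overlap is a crude proxy: a response can reuse the question's words without answering it, which is why a production relevancy metric would need something stronger than this.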
When to Use Each Metric
1. SafetyMetric
   - Factual accuracy is critical
   - Compliance/legal requirements apply
   - Working in high-stakes domains
2. AlignmentMetric
   - User experience is the priority
   - Optimizing for helpfulness
   - Balancing detail vs brevity
3. ResponseRelevancy
   - Ensuring on-topic responses
   - Building Q&A systems
   - Quality assurance for chatbots
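The guidance above can be encoded as a small lookup, sketched here under stated assumptions. The assignment of use cases to AlignmentMetric and ResponseRelevancy is inferred from each metric's description, and `USE_CASES` and `metrics_for` are hypothetical names, not part of any library.

```python
# Use cases per metric, taken from the list above (grouping partly inferred).
USE_CASES = {
    "SafetyMetric": {"factual accuracy", "compliance", "high-stakes domain"},
    "AlignmentMetric": {"user experience", "helpfulness", "detail vs brevity"},
    "ResponseRelevancy": {"on-topic responses", "q&a system", "chatbot qa"},
}


def metrics_for(needs: list[str]) -> list[str]:
    """Return the metric names whose use cases overlap the stated needs."""
    wanted = {need.lower() for need in needs}
    # Dicts preserve insertion order, so results follow the order above.
    return [name for name, cases in USE_CASES.items() if cases & wanted]


selected = metrics_for(["compliance", "helpfulness"])
```

The metrics are not mutually exclusive, so a project with several priorities would typically track more than one score.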