Hybrid

Overview

Version: v0.3

hybrid is a composite selection algorithm that combines multiple ranking signals (Elo ratings, RouterDC embedding similarity, AutoMix POMDP values, and cost) into one weighted score.

It corresponds to the configuration in config/algorithm/selection/hybrid.yaml.

Paper: Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

Key Advantages

  • Blends multiple selectors instead of committing to only one.
  • Makes weighting explicit and easy to audit.
  • Supports gradual migration between ranking policies (e.g., transition from static to Elo).
  • Scores cost explicitly to balance quality against operational expense.

Algorithm Principle

Hybrid computes a weighted composite score for each candidate model:

$$
S(m) = w_{\text{elo}} \cdot \hat{R}_{\text{elo}}(m) + w_{\text{rdc}} \cdot \hat{R}_{\text{rdc}}(m) + w_{\text{amix}} \cdot \hat{R}_{\text{amix}}(m) - w_{\text{cost}} \cdot \hat{C}(m)
$$

When normalize_scores is enabled (the default), each component is min-max normalized to [0, 1] before combination, so the weights act on comparable scales even though raw Elo ratings, similarity scores, and costs differ widely in magnitude.
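The scoring rule can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the model names, and all input numbers are hypothetical, and mapping a constant component to 0.0 is an assumed convention for the degenerate case where every model has the same raw value.

```python
from typing import Dict

Scores = Dict[str, float]


def min_max(scores: Scores) -> Scores:
    """Rescale raw per-model values to [0, 1].

    Assumption: if all values are equal, every model maps to 0.0.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {m: 0.0 for m in scores}
    return {m: (v - lo) / (hi - lo) for m, v in scores.items()}


def hybrid_scores(elo: Scores, rdc: Scores, amix: Scores, cost: Scores,
                  weights: Dict[str, float], normalize: bool = True) -> Scores:
    """Weighted composite: quality terms are added, cost is subtracted."""
    if normalize:
        elo, rdc, amix, cost = (min_max(s) for s in (elo, rdc, amix, cost))
    return {
        m: weights["elo"] * elo[m]
        + weights["rdc"] * rdc[m]
        + weights["amix"] * amix[m]
        - weights["cost"] * cost[m]
        for m in elo
    }


# Made-up component values for three hypothetical candidate models.
elo = {"a": 1000.0, "b": 1100.0, "c": 1200.0}
rdc = {"a": 0.9, "b": 0.5, "c": 0.7}
amix = {"a": 0.2, "b": 0.4, "c": 0.6}
cost = {"a": 1.0, "b": 4.0, "c": 10.0}

# Default weights from the configuration section.
weights = {"elo": 0.3, "rdc": 0.3, "amix": 0.2, "cost": 0.2}

scores = hybrid_scores(elo, rdc, amix, cost, weights)
best = max(scores, key=scores.get)  # candidate with the highest composite score
```

With these inputs, model c is the most expensive candidate but still scores highest, because its Elo and AutoMix terms outweigh the cost penalty at the default weights. Raising the cost weight shifts the pick toward cheaper models.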

quality_gap_threshold is retained in the selector contract for data-driven upgrade policy experiments, while the current online selector uses the weighted composite score directly for the final pick.

Select Flow

Component Selectors

The Hybrid selector internally instantiates three sub-selectors:

| Component | Source | What it provides |
|---|---|---|
| EloSelector | Feedback history | Historical pairwise comparison scores |
| RouterDCSelector | Model descriptions | Semantic query-model similarity |
| AutoMixSelector | POMDP solver | Cost-quality optimal value estimate |

Each component shares the same SelectionContext and runs independently.
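One way to picture this arrangement is the Python sketch below. The SelectionContext fields, the SubSelector protocol, and the stand-in ConstantSelector are all assumptions made for illustration; the real component interfaces may differ.

```python
from dataclasses import dataclass
from typing import Dict, Protocol, Tuple


@dataclass(frozen=True)
class SelectionContext:
    """Hypothetical context: the fields here are assumed, not the real contract."""
    query: str
    candidates: Tuple[str, ...]


class SubSelector(Protocol):
    def score(self, ctx: SelectionContext) -> Dict[str, float]:
        """Return one raw score per candidate model."""
        ...


class ConstantSelector:
    """Stand-in for a real component; looks scores up in a fixed table."""

    def __init__(self, table: Dict[str, float]) -> None:
        self.table = table

    def score(self, ctx: SelectionContext) -> Dict[str, float]:
        return {m: self.table.get(m, 0.0) for m in ctx.candidates}


def collect_component_scores(
    ctx: SelectionContext, components: Dict[str, SubSelector]
) -> Dict[str, Dict[str, float]]:
    """Give every component the same context; no component sees another's output."""
    return {name: sel.score(ctx) for name, sel in components.items()}


ctx = SelectionContext(query="summarize this report", candidates=("small", "large"))
components = {
    "elo": ConstantSelector({"small": 1000.0, "large": 1200.0}),
    "router_dc": ConstantSelector({"small": 0.4, "large": 0.8}),
    "automix": ConstantSelector({"small": 0.3, "large": 0.5}),
}
raw = collect_component_scores(ctx, components)
```

Because the context is immutable in this sketch and each component returns its own score table, the components could run in any order, or concurrently, without affecting each other.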

What Problem Does It Solve?

No single ranking signal is reliable for every workload: pure cost, pure similarity, or pure feedback each misses part of the routing picture. hybrid combines multiple selectors into one auditable score so routes can balance semantic fit, historical quality, and operational cost.

When to Use

  • One route should combine several ranking signals.
  • You want a weighted transition between older and newer selectors.
  • No single selector captures all relevant information.
  • The final choice should reflect both quality and operational cost.

Known Limitations

  • Higher computational cost than any single selector, because it runs three sub-selectors per request.
  • Weight tuning requires domain knowledge; poorly chosen weights can degrade routing quality.
  • quality_gap_threshold is exposed for compatibility with lookup-table and upgrade-policy work, but it does not currently run an MLP escalation pass.

Configuration

algorithm:
  type: hybrid
  hybrid:
    elo_weight: 0.3             # Weight for Elo rating
    router_dc_weight: 0.3       # Weight for embedding similarity
    automix_weight: 0.2         # Weight for POMDP value
    cost_weight: 0.2            # Weight for cost consideration
    quality_gap_threshold: 0.1  # Reserved quality-gap threshold
    normalize_scores: true      # Normalize component scores to [0,1]

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| elo_weight | float | 0.3 | Weight for Elo rating contribution (0–1) |
| router_dc_weight | float | 0.3 | Weight for RouterDC embedding similarity (0–1) |
| automix_weight | float | 0.2 | Weight for AutoMix POMDP value (0–1) |
| cost_weight | float | 0.2 | Weight for cost consideration (0–1) |
| quality_gap_threshold | float | 0.1 | Reserved quality-gap threshold for upgrade-policy experiments |
| normalize_scores | bool | true | Normalize component scores before combination |
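The parameters and their documented ranges could be mirrored in client-side tooling roughly as follows. This is a hypothetical Python sketch: the HybridConfig class and its validate method are not part of the project. The defaults come from the table above. Whether the weights must sum to 1 is not stated here, so the sketch checks only the 0–1 range.

```python
from dataclasses import dataclass

WEIGHT_FIELDS = ("elo_weight", "router_dc_weight", "automix_weight", "cost_weight")


@dataclass
class HybridConfig:
    """Hypothetical mirror of the hybrid block; defaults taken from the table."""
    elo_weight: float = 0.3
    router_dc_weight: float = 0.3
    automix_weight: float = 0.2
    cost_weight: float = 0.2
    quality_gap_threshold: float = 0.1  # reserved; not used by this sketch
    normalize_scores: bool = True

    def validate(self) -> None:
        """Reject any weight outside the documented 0-1 range."""
        for name in WEIGHT_FIELDS:
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


default_config = HybridConfig()
default_config.validate()  # the documented defaults pass the range check
```

A check like this catches typos such as a weight of 3 instead of 0.3 before the configuration reaches the router.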

Feedback

Hybrid forwards UpdateFeedback() to all three component selectors (Elo, RouterDC, AutoMix) so each can learn independently from the same feedback signal.
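The fan-out can be sketched in Python as below. The class names, the snake_case update_feedback method, and the shape of the feedback record are assumptions for illustration; only the forwarding pattern itself comes from the description above.

```python
from typing import Any, Dict, List


class RecordingSelector:
    """Stand-in component that stores each feedback record it receives."""

    def __init__(self) -> None:
        self.seen: List[Dict[str, Any]] = []

    def update_feedback(self, feedback: Dict[str, Any]) -> None:
        self.seen.append(feedback)


class HybridFeedbackFanout:
    """Forwards one feedback record, unchanged, to every component selector."""

    def __init__(self, elo: RecordingSelector, router_dc: RecordingSelector,
                 automix: RecordingSelector) -> None:
        self.components = (elo, router_dc, automix)

    def update_feedback(self, feedback: Dict[str, Any]) -> None:
        for component in self.components:
            component.update_feedback(feedback)


elo, router_dc, automix = RecordingSelector(), RecordingSelector(), RecordingSelector()
hybrid = HybridFeedbackFanout(elo, router_dc, automix)

# Hypothetical feedback record; the real payload fields are not specified here.
hybrid.update_feedback({"query": "q1", "preferred_model": "large"})
```

Each component receives the same record and decides for itself how, or whether, to update its internal state.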