RouterDC Selection
RouterDC uses semantic embeddings to match user queries with the most suitable model. It computes similarity between query embeddings and model representations to select the best match.
Reference: RouterDC: Query-Based Router by Dual Contrastive Learning (Guo et al., NeurIPS 2024), which reports accuracy improvements of +2.76% on in-distribution and +1.90% on out-of-distribution tasks.
The paper trains a query encoder using dual contrastive losses (Sample-LLM loss + Sample-Sample loss) with jointly learned LLM embeddings. Our implementation provides a simplified approach using pre-computed embeddings of model descriptions rather than jointly trained LLM-specific embeddings.
Algorithm Flow
Mathematical Foundation
Cosine Similarity
RouterDC uses cosine similarity to compare query and model embeddings:
sim(q, m) = (q · m) / (||q|| × ||m||)
          = Σ(q_i × m_i) / (√(Σ q_i²) × √(Σ m_i²))
Where:
- q = query embedding vector (e.g., 768 dimensions)
- m = model description embedding vector
- The result is in the range [-1, 1]; higher means more similar
Contrastive Learning
The embedding space is trained using dual contrastive losses:
- Sample-LLM Loss: Pulls query embeddings toward well-performing models and away from poor-performing ones
- Sample-Sample Loss: Groups similar queries together to ensure consistent routing
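Schematically, the Sample-LLM loss has an InfoNCE-style contrastive form. The sketch below uses our own notation, not the paper's exact formulation: E(x) is the query encoder, k_j are the jointly learned LLM embeddings, τ is a temperature, and P(x) is the set of top-performing LLMs for query x.

```latex
\mathcal{L}_{\text{sample-LLM}}(x) =
  -\sum_{j \in P(x)} \log
  \frac{\exp\big(\operatorname{sim}(E(x), k_j) / \tau\big)}
       {\sum_{l} \exp\big(\operatorname{sim}(E(x), k_l) / \tau\big)}
```

The Sample-Sample loss applies the same contrastive form between embeddings of queries, treating similar queries as positives so they land near each other and route consistently.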
Core Algorithm (Go)
// Select routes a query by embedding it and comparing it against
// precomputed model-description embeddings.
func (s *RouterDCSelector) Select(ctx context.Context, selCtx *SelectionContext) (*SelectionResult, error) {
	queryEmbedding, err := s.embedFunc(selCtx.Query)
	if err != nil {
		return nil, err
	}

	var bestModel string
	bestSim := -1.0
	for _, candidate := range selCtx.CandidateModels {
		modelEmbedding, ok := s.modelEmbeddings[candidate.Model]
		if !ok {
			continue // skip candidates without a precomputed embedding
		}
		sim := cosineSimilarity(queryEmbedding, modelEmbedding)
		if sim > bestSim {
			bestSim = sim
			bestModel = candidate.Model
		}
	}

	// Fall back when no candidate is similar enough to the query.
	if bestSim < s.config.SimilarityThreshold {
		return s.fallbackToDefault(selCtx)
	}

	return &SelectionResult{
		SelectedModel: bestModel,
		Score:         bestSim,
		Method:        MethodRouterDC,
	}, nil
}
How It Works
- Each model has a description and optional capabilities list
- Incoming queries are embedded into a vector representation
- Query embeddings are compared against model description embeddings
- The model with highest similarity score is selected
Configuration
decision:
  algorithm:
    type: router_dc
    router_dc:
      require_descriptions: true   # Fail if models lack descriptions
      use_capabilities: true       # Include capabilities in matching
      similarity_threshold: 0.3    # Minimum similarity to consider

models:
  - name: gpt-4
    backend: openai
    description: "Advanced reasoning, complex analysis, mathematical proofs, and detailed explanations"
    capabilities:
      - reasoning
      - mathematics
      - code-review
      - analysis
  - name: gpt-3.5-turbo
    backend: openai
    description: "Fast responses for simple questions, casual conversation, and quick tasks"
    capabilities:
      - general
      - chat
      - summarization
  - name: code-llama
    backend: local
    description: "Code generation, debugging, refactoring, and programming assistance"
    capabilities:
      - code-generation
      - debugging
      - refactoring
Writing Effective Descriptions
Good descriptions are specific and differentiate models:
Good:
description: "Mathematical reasoning, theorem proving, step-by-step problem solving"
Bad:
description: "A good AI model" # Too vague
Description Tips
- Be specific: Mention concrete tasks the model excels at
- Use keywords: Include terms users might use in queries
- Differentiate: Highlight what makes this model unique
- Keep concise: 1-2 sentences, focused on strengths
Capabilities List
Capabilities provide structured metadata for matching:
capabilities:
  - code-generation   # Primary strength
  - python            # Language specialization
  - debugging         # Related task
When use_capabilities: true, capabilities are combined with the description for richer matching.
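One simple way this combination could work is to append the capability tags to the description before embedding. The sketch below is a hypothetical illustration; the `matchingText` name and signature are assumptions, not the router's actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// matchingText assembles the text to embed for a model: with
// useCapabilities enabled, capability tags are appended to the
// description so they contribute to the similarity match.
func matchingText(description string, capabilities []string, useCapabilities bool) string {
	if !useCapabilities || len(capabilities) == 0 {
		return description
	}
	return description + " " + strings.Join(capabilities, " ")
}

func main() {
	fmt.Println(matchingText(
		"Code generation and debugging",
		[]string{"code-generation", "python", "debugging"},
		true,
	))
	// Code generation and debugging code-generation python debugging
}
```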
Validation
Enable strict validation to catch configuration issues:
router_dc:
  require_descriptions: true
With this enabled, the router will fail to start if any model lacks a description.
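The startup check amounts to a loop like the following. This is an illustrative sketch, assuming a map of model names to descriptions; the function name and signature are not the router's actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// validateDescriptions returns an error naming the first model found
// with an empty or whitespace-only description, mirroring the
// require_descriptions startup check.
func validateDescriptions(descriptions map[string]string) error {
	for name, desc := range descriptions {
		if strings.TrimSpace(desc) == "" {
			return fmt.Errorf("model %q has no description (require_descriptions is true)", name)
		}
	}
	return nil
}

func main() {
	err := validateDescriptions(map[string]string{
		"gpt-4":      "Advanced reasoning and analysis",
		"code-llama": "",
	})
	fmt.Println(err != nil) // true: code-llama lacks a description
}
```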
Best Practices
- Invest in descriptions: Quality descriptions dramatically improve routing
- Test with real queries: Verify routing matches expectations
- Update descriptions: Refine based on observed misroutes
- Use capabilities sparingly: 3-5 focused capabilities per model
- Enable require_descriptions: Catch missing descriptions at startup