RouterDC Selection
RouterDC uses semantic embeddings to match user queries with the most suitable model. It computes similarity between query embeddings and model representations to select the best match.
Reference: RouterDC: Query-Based Router by Dual Contrastive Learning (Guo et al., NeurIPS 2024). The paper reports accuracy improvements of +2.76% on in-distribution tasks and +1.90% on out-of-distribution tasks.
The paper trains a query encoder using dual contrastive losses (Sample-LLM loss + Sample-Sample loss) with jointly learned LLM embeddings. Our implementation provides a simplified approach using pre-computed embeddings of model descriptions rather than jointly trained LLM-specific embeddings.
Algorithm Flow
Mathematical Foundation
Cosine Similarity
RouterDC uses cosine similarity to compare query and model embeddings:
sim(q, m) = (q · m) / (||q|| × ||m||)
= Σ(q_i × m_i) / (√Σq_i² × √Σm_i²)
Where:
- q = Query embedding vector (e.g., 768 dimensions)
- m = Model description embedding vector
- Result is in range [-1, 1], higher = more similar
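The formula above can be sketched in Go as follows. This is a minimal illustration, not the project's actual helper: the `[]float32` element type and the choice to return 0 for mismatched lengths or zero-norm vectors are assumptions made here.

```go
package main

import "math"

// cosineSimilarity returns (a · b) / (||a|| × ||b||).
// Assumptions for this sketch: embeddings are []float32, and
// mismatched lengths or a zero-norm vector yield 0 instead of NaN.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y   // Σ(q_i × m_i)
		normA += x * x // Σq_i²
		normB += y * y // Σm_i²
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}
```

As a worked example, for q = (1, 2, 2) and m = (2, 1, 2) the dot product is 8 and both norms are 3, so sim(q, m) = 8/9 ≈ 0.889.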
Contrastive Learning
In the original paper, the embedding space is trained using dual contrastive losses:
- Sample-LLM Loss: Pulls query embeddings toward well-performing models and away from poor-performing ones
- Sample-Sample Loss: Groups similar queries together to ensure consistent routing
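For intuition, both losses can be written in a generic InfoNCE-style form. This is an illustrative sketch only and may differ from the paper's exact objectives. The symbols are assumptions introduced here: M⁺ and M⁻ are the well-performing and poor-performing models for query q, q⁺ is a query from the same group as q, Q⁻ is a set of queries from other groups, and τ is a temperature.

```latex
\mathcal{L}_{\text{sample-LLM}}(q)
  = -\log \frac{\sum_{m \in M^{+}} \exp\big(\mathrm{sim}(q, m)/\tau\big)}
               {\sum_{m \in M^{+}} \exp\big(\mathrm{sim}(q, m)/\tau\big)
                + \sum_{m \in M^{-}} \exp\big(\mathrm{sim}(q, m)/\tau\big)}

\mathcal{L}_{\text{sample-sample}}(q)
  = -\log \frac{\exp\big(\mathrm{sim}(q, q^{+})/\tau\big)}
               {\exp\big(\mathrm{sim}(q, q^{+})/\tau\big)
                + \sum_{q^{-} \in Q^{-}} \exp\big(\mathrm{sim}(q, q^{-})/\tau\big)}
```

Minimizing the first term raises sim(q, m) for models in M⁺ relative to M⁻. Minimizing the second raises similarity between queries of the same group.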
Core Algorithm (Go)
// Select picks the candidate whose description embedding is most similar to the query.
func (s *RouterDCSelector) Select(ctx context.Context, selCtx *SelectionContext) (*SelectionResult, error) {
	// Embed the incoming query.
	queryEmbedding, err := s.embedFunc(selCtx.Query)
	if err != nil {
		return nil, err
	}

	var bestModel string
	var bestSim float64 = -1 // lower bound of cosine similarity

	for _, candidate := range selCtx.CandidateModels {
		modelEmbedding, ok := s.modelEmbeddings[candidate.Model]
		if !ok {
			continue // no pre-computed embedding for this candidate
		}
		sim := cosineSimilarity(queryEmbedding, modelEmbedding)
		if sim > bestSim {
			bestSim = sim
			bestModel = candidate.Model
		}
	}

	// No candidate is similar enough: fall back to the default model.
	if bestSim < s.config.SimilarityThreshold {
		return s.fallbackToDefault(selCtx)
	}

	return &SelectionResult{
		SelectedModel: bestModel,
		Score:         bestSim,
		Method:        MethodRouterDC,
	}, nil
}
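The argmax-plus-threshold step of `Select` can be looked at in isolation. The sketch below works on precomputed similarity scores. The `scored` type, the `pickBest` function, and the example scores are hypothetical and exist only for illustration.

```go
package main

// scored pairs a candidate model name with a precomputed similarity.
// Hypothetical type for this sketch.
type scored struct {
	Model string
	Sim   float64
}

// pickBest returns the highest-similarity candidate, or the fallback
// model when no candidate reaches the threshold. matched reports
// whether a candidate (rather than the fallback) was chosen.
func pickBest(cands []scored, threshold float64, fallback string) (model string, sim float64, matched bool) {
	sim = -1 // lower bound of cosine similarity
	for _, c := range cands {
		if c.Sim > sim {
			model, sim = c.Model, c.Sim
		}
	}
	if model == "" || sim < threshold {
		return fallback, sim, false
	}
	return model, sim, true
}
```

With illustrative scores of 0.42, 0.18, and 0.71 for gpt-4, gpt-3.5-turbo, and code-llama and a threshold of 0.3, `pickBest` returns code-llama. If every score is below 0.3, it returns the fallback model instead.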
How It Works
- Each model has a description and optional capabilities list
- Incoming queries are embedded into a vector representation
- Query embeddings are compared against model description embeddings
- The model with the highest similarity score is selected; if that score is below the similarity threshold, the selector falls back to the default model
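One plausible way to fold a capabilities list into the text that gets embedded is sketched below. How the implementation actually combines description and capabilities is not specified here, so `modelText` and its output format are assumptions for illustration only.

```go
package main

import "strings"

// modelText builds the string to embed for one model.
// Hypothetical helper: when capabilities are enabled and present,
// they are appended to the description as a comma-separated list.
func modelText(description string, capabilities []string, useCapabilities bool) string {
	if !useCapabilities || len(capabilities) == 0 {
		return description
	}
	return description + ". Capabilities: " + strings.Join(capabilities, ", ")
}
```

The resulting string would be embedded once per model and cached, so only the query needs to be embedded at request time.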
Configuration
decision:
  algorithm:
    type: router_dc
    router_dc:
      require_descriptions: true   # Fail if models lack descriptions
      use_capabilities: true       # Include capabilities in matching
      similarity_threshold: 0.3    # Minimum similarity to consider

models:
  - name: gpt-4
    backend: openai
    description: "Advanced reasoning, complex analysis, mathematical proofs, and detailed explanations"
    capabilities:
      - reasoning
      - mathematics
      - code-review
      - analysis
  - name: gpt-3.5-turbo
    backend: openai
    description: "Fast responses for simple questions, casual conversation, and quick tasks"
    capabilities:
      - general
      - chat
      - summarization
  - name: code-llama
    backend: local
    description: "Code generation, debugging, refactoring, and programming assistance"
    capabilities:
      - code-generation
      - debugging
      - refactoring
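The options above imply a few checks at load time. The sketch below shows what such validation might look like. The struct and function names (`RouterDCConfig`, `ModelEntry`, `validateRouterDC`) are hypothetical, and the range check on the threshold is an assumption based on cosine similarity lying in [-1, 1].

```go
package main

import "fmt"

// RouterDCConfig mirrors the router_dc options shown above.
// Hypothetical type for this sketch.
type RouterDCConfig struct {
	RequireDescriptions bool
	UseCapabilities     bool
	SimilarityThreshold float64
}

// ModelEntry mirrors one item under models. Hypothetical type.
type ModelEntry struct {
	Name         string
	Backend      string
	Description  string
	Capabilities []string
}

// validateRouterDC rejects thresholds outside the cosine range and,
// when RequireDescriptions is set, models with an empty description.
func validateRouterDC(cfg RouterDCConfig, models []ModelEntry) error {
	if cfg.SimilarityThreshold < -1 || cfg.SimilarityThreshold > 1 {
		return fmt.Errorf("similarity_threshold %v is outside [-1, 1]", cfg.SimilarityThreshold)
	}
	if !cfg.RequireDescriptions {
		return nil
	}
	for _, m := range models {
		if m.Description == "" {
			return fmt.Errorf("model %q has no description", m.Name)
		}
	}
	return nil
}
```

Failing early on a missing description avoids silently embedding an empty string, which would make that model's similarity scores meaningless.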