Version: v0.1

RouterDC Selection

RouterDC uses semantic embeddings to match user queries with the most suitable model. It computes similarity between query embeddings and model representations to select the best match.

Reference: RouterDC: Query-Based Router by Dual Contrastive Learning (Guo et al., NeurIPS 2024), which reports accuracy improvements of +2.76% on in-distribution and +1.90% on out-of-distribution tasks.

The paper trains a query encoder using dual contrastive losses (Sample-LLM loss + Sample-Sample loss) with jointly learned LLM embeddings. Our implementation provides a simplified approach using pre-computed embeddings of model descriptions rather than jointly trained LLM-specific embeddings.

Algorithm Flow

Mathematical Foundation

Cosine Similarity

RouterDC uses cosine similarity to compare query and model embeddings:

sim(q, m) = (q · m) / (||q|| × ||m||)
          = Σ(q_i × m_i) / (√(Σq_i²) × √(Σm_i²))

Where:

  • q = Query embedding vector (e.g., 768 dimensions)
  • m = Model description embedding vector
  • Result is in range [-1, 1], higher = more similar

Contrastive Learning

The embedding space is trained using dual contrastive losses:

  • Sample-LLM Loss: Pulls query embeddings toward well-performing models and away from poor-performing ones
  • Sample-Sample Loss: Groups similar queries together to ensure consistent routing
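
Schematically, the Sample-LLM loss follows the standard InfoNCE contrastive form (notation ours; see the paper for the exact formulation):

L_sample-LLM(q) = -Σ_{m⁺ ∈ Pos(q)} log [ exp(sim(e_q, k_{m⁺}) / τ) / Σ_{m ∈ Pos(q) ∪ Neg(q)} exp(sim(e_q, k_m) / τ) ]

Where:

  • e_q = query embedding, k_m = learned embedding of LLM m
  • Pos(q) / Neg(q) = top-performing / worst-performing LLMs on query q
  • τ = temperature controlling the sharpness of the softmax

The Sample-Sample loss applies the same contrastive form over pairs of queries, treating similar queries as positives so that they land near each other in the embedding space.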

Core Algorithm (Go)

// Select using embedding similarity
func (s *RouterDCSelector) Select(ctx context.Context, selCtx *SelectionContext) (*SelectionResult, error) {
    queryEmbedding, err := s.embedFunc(selCtx.Query)
    if err != nil {
        return nil, err
    }

    var bestModel string
    bestSim := -1.0

    for _, candidate := range selCtx.CandidateModels {
        modelEmbedding := s.modelEmbeddings[candidate.Model]
        sim := cosineSimilarity(queryEmbedding, modelEmbedding)

        if sim > bestSim {
            bestSim = sim
            bestModel = candidate.Model
        }
    }

    // Fall back when even the best match is below the configured threshold.
    if bestSim < s.config.SimilarityThreshold {
        return s.fallbackToDefault(selCtx)
    }

    return &SelectionResult{
        SelectedModel: bestModel,
        Score:         bestSim,
        Method:        MethodRouterDC,
    }, nil
}

How It Works

  1. Each model has a description and optional capabilities list
  2. Incoming queries are embedded into a vector representation
  3. Query embeddings are compared against model description embeddings
  4. The model with highest similarity score is selected
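
The four steps can be exercised end to end with a toy embedding function. Here a keyword-count embedder stands in for a real sentence encoder, and all names and descriptions are illustrative:

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// vocab and embed form a toy keyword-count embedder — a stand-in for a
// real sentence-embedding model, used only to make the flow runnable.
var vocab = []string{"code", "math", "chat"}

func embed(text string) []float64 {
	v := make([]float64, len(vocab))
	lower := strings.ToLower(text)
	for i, w := range vocab {
		v[i] = float64(strings.Count(lower, w))
	}
	return v
}

func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	// Step 1: each model has a description, embedded once up front.
	modelEmbeddings := map[string][]float64{
		"code-llama": embed("code generation and debugging code"),
		"gpt-4":      embed("math reasoning and math proofs"),
	}

	// Steps 2-4: embed the query, compare, pick the highest similarity.
	query := embed("help me debug this code")
	best, bestSim := "", -1.0
	for name, me := range modelEmbeddings {
		if sim := cosine(query, me); sim > bestSim {
			best, bestSim = name, sim
		}
	}
	fmt.Println(best) // code-llama
}
```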

Configuration

decision:
  algorithm:
    type: router_dc
    router_dc:
      require_descriptions: true    # Fail if models lack descriptions
      use_capabilities: true        # Include capabilities in matching
      similarity_threshold: 0.3     # Minimum similarity to consider

models:
  - name: gpt-4
    backend: openai
    description: "Advanced reasoning, complex analysis, mathematical proofs, and detailed explanations"
    capabilities:
      - reasoning
      - mathematics
      - code-review
      - analysis

  - name: gpt-3.5-turbo
    backend: openai
    description: "Fast responses for simple questions, casual conversation, and quick tasks"
    capabilities:
      - general
      - chat
      - summarization

  - name: code-llama
    backend: local
    description: "Code generation, debugging, refactoring, and programming assistance"
    capabilities:
      - code-generation
      - debugging
      - refactoring

Writing Effective Descriptions

Good descriptions are specific and differentiate models:

Good:

description: "Mathematical reasoning, theorem proving, step-by-step problem solving"

Bad:

description: "A good AI model"  # Too vague

Description Tips

  1. Be specific: Mention concrete tasks the model excels at
  2. Use keywords: Include terms users might use in queries
  3. Differentiate: Highlight what makes this model unique
  4. Keep concise: 1-2 sentences, focused on strengths

Capabilities List

Capabilities provide structured metadata for matching:

capabilities:
  - code-generation   # Primary strength
  - python            # Language specialization
  - debugging         # Related task

When use_capabilities: true, capabilities are combined with the description for richer matching.
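
One plausible way to combine them is simple concatenation before embedding. buildEmbeddingText and the exact join format below are assumptions for illustration, not the actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// buildEmbeddingText joins a model's description with its capabilities so a
// single embedding can cover both. The concatenation format is an assumption.
func buildEmbeddingText(description string, capabilities []string) string {
	if len(capabilities) == 0 {
		return description
	}
	return description + " Capabilities: " + strings.Join(capabilities, ", ")
}

func main() {
	fmt.Println(buildEmbeddingText(
		"Code generation, debugging, refactoring, and programming assistance",
		[]string{"code-generation", "debugging", "refactoring"},
	))
}
```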

Validation

Enable strict validation to catch configuration issues:

router_dc:
  require_descriptions: true

With this enabled, the router will fail to start if any model lacks a description.
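
The startup check might look like the following sketch. ModelConfig, validateDescriptions, and the error message are illustrative stand-ins for the real configuration types:

```go
package main

import (
	"fmt"
	"strings"
)

// ModelConfig is a minimal stand-in for the real model configuration type.
type ModelConfig struct {
	Name        string
	Description string
}

// validateDescriptions fails fast when require_descriptions is enabled and a
// model has no description to embed.
func validateDescriptions(models []ModelConfig) error {
	for _, m := range models {
		if strings.TrimSpace(m.Description) == "" {
			return fmt.Errorf("router_dc: model %q has no description", m.Name)
		}
	}
	return nil
}

func main() {
	models := []ModelConfig{
		{Name: "gpt-4", Description: "Advanced reasoning"},
		{Name: "mystery-model"}, // missing description → startup error
	}
	if err := validateDescriptions(models); err != nil {
		fmt.Println(err)
	}
}
```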

Best Practices

  1. Invest in descriptions: Quality descriptions dramatically improve routing
  2. Test with real queries: Verify routing matches expectations
  3. Update descriptions: Refine based on observed misroutes
  4. Use capabilities sparingly: 3-5 focused capabilities per model
  5. Enable require_descriptions: Catch missing descriptions at startup