Version: v0.1

Troubleshooting & FAQ

Common issues and frequently asked questions about model selection.

Frequently Asked Questions

Which algorithm should I start with?

Start with Static selection if you're new to model selection. It's deterministic and easy to debug. Once you understand your traffic patterns, migrate to adaptive algorithms.

Do I need to configure all algorithms?

No. Configure only the algorithm you're using. Each algorithm has sensible defaults, so you only need to specify fields you want to customize.

Can I switch algorithms without downtime?

Yes. Algorithm changes take effect on configuration reload. In-flight requests complete with the previous algorithm.

Common Issues

Elo Selection

Issue: Ratings not changing

Possible causes:

Feedback not being submitted - verify POST requests to /api/v1/feedback return 200
K-factor too low - increase k_factor (for example, from 32 to 64) for faster adaptation
Not enough traffic - Elo needs consistent feedback volume

# Verify feedback endpoint is working
curl -X POST http://localhost:8080/api/v1/feedback \
  -H "Content-Type: application/json" \
  -d '{"request_id": "test", "model": "gpt-4", "rating": 1}'
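To illustrate why a larger K-factor makes ratings move faster, here is a minimal sketch of the textbook Elo update. It assumes the router follows the standard formula with a 400-point scale; the function names `expected_score` and `elo_update` are illustrative and not part of the router's API.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float,
               k_factor: float = 32.0) -> tuple[float, float]:
    """Return both new ratings after one comparison.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    """
    delta = k_factor * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated models; the first one receives positive feedback.
for k in (32, 64):
    winner, loser = elo_update(1500.0, 1500.0, score_a=1.0, k_factor=k)
    print(f"k_factor={k}: winner {winner:.0f}, loser {loser:.0f}")
# k_factor=32: winner 1516, loser 1484
# k_factor=64: winner 1532, loser 1468
```

Doubling the K-factor doubles the size of each rating step, which speeds adaptation but also makes ratings noisier when feedback is sparse.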

Issue: One model always selected

This is expected when one model has a significantly higher Elo rating than the others. Options:

Reset ratings by deleting the file at storage_path
Increase k_factor to allow faster rating changes
Use decay_factor to reduce the weight of old comparisons

RouterDC Selection

Issue: Wrong model selected for queries

Check that model descriptions are specific enough:

# Bad - too generic
description: "A good AI model"

# Good - specific capabilities
description: "Mathematical reasoning, theorem proving, step-by-step solutions"

Verify embeddings are being computed:

# Check metrics for embedding latency
curl http://localhost:8080/metrics | grep embedding
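The effect of description quality can be illustrated with a toy similarity match. This sketch stands in for the real thing: it uses bag-of-words cosine similarity rather than learned embeddings, and the model names `generic-model` and `math-model` are hypothetical. It only shows why a generic description gives the matcher little to work with.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical models using the two descriptions shown above.
descriptions = {
    "generic-model": "A good AI model",
    "math-model": "Mathematical reasoning, theorem proving, step-by-step solutions",
}

query = "Give a step-by-step proof of this theorem"
scores = {
    name: cosine(bag_of_words(query), bag_of_words(text))
    for name, text in descriptions.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

The specific description shares several terms with the query, so `math-model` scores well above `generic-model`.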

Issue: Startup failure with "missing descriptions"

If require_descriptions is set to true, every model must have a description:

models:
  - name: gpt-4
    description: "Required when require_descriptions is true"
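To find the offending entries before restarting, a pre-flight check along these lines may help. It is a sketch that assumes the configuration has already been loaded into a Python list of dicts mirroring the `models` section above; the helper name `missing_descriptions` and the second model name are hypothetical.

```python
def missing_descriptions(models: list[dict]) -> list[str]:
    """Return the names of models whose description is absent or blank."""
    return [
        model.get("name", "<unnamed>")
        for model in models
        if not str(model.get("description") or "").strip()
    ]


# Example data standing in for the parsed `models` section.
models = [
    {"name": "gpt-4", "description": "Required when require_descriptions is true"},
    {"name": "example-model"},  # hypothetical entry with no description
]

problems = missing_descriptions(models)
if problems:
    print("Models missing descriptions:", ", ".join(problems))
```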

AutoMix Selection

Issue: Always selecting expensive models

Your cost_quality_tradeoff is too low (favoring quality). Increase it:

automix:
  cost_quality_tradeoff: 0.5  # Balance cost and quality

Issue: Always selecting cheap models

Your cost_quality_tradeoff is too high (favoring cost). Decrease it:

automix:
  cost_quality_tradeoff: 0.2  # Favor quality
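The direction of the setting can be sketched with a simple blended score. This is an assumed scoring rule for illustration only, not AutoMix's actual formula; the `quality` and `cost` fields and the model names are hypothetical. What matters is the direction: a low value favors quality, a high value favors cost.

```python
def pick_model(candidates: list[dict], cost_quality_tradeoff: float) -> str:
    """Pick the candidate with the best blended score.

    Assumed rule: score = (1 - t) * quality - t * normalized_cost,
    where t is cost_quality_tradeoff in [0, 1].
    """
    t = cost_quality_tradeoff
    max_cost = max(c["cost"] for c in candidates) or 1.0

    def score(candidate: dict) -> float:
        normalized_cost = candidate["cost"] / max_cost
        return (1 - t) * candidate["quality"] - t * normalized_cost

    return max(candidates, key=score)["name"]


# Hypothetical candidates: quality in [0, 1], cost per 1k output tokens.
candidates = [
    {"name": "large-model", "quality": 0.95, "cost": 0.06},
    {"name": "small-model", "quality": 0.60, "cost": 0.002},
]

for t in (0.2, 0.8):
    print(f"cost_quality_tradeoff={t}: {pick_model(candidates, t)}")
# cost_quality_tradeoff=0.2: large-model
# cost_quality_tradeoff=0.8: small-model
```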

Issue: Missing pricing data

AutoMix requires pricing information:

models:
  - name: gpt-4
    pricing:
      input_cost_per_1k: 0.03
      output_cost_per_1k: 0.06
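As a quick sanity check on the pricing fields, the cost of a single request follows from the per-1k rates. The sketch below uses the values above; the helper name `request_cost` and the token counts are illustrative.

```python
def request_cost(pricing: dict, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request, in the same currency unit as the pricing entries."""
    return (
        input_tokens / 1000 * pricing["input_cost_per_1k"]
        + output_tokens / 1000 * pricing["output_cost_per_1k"]
    )


gpt4_pricing = {"input_cost_per_1k": 0.03, "output_cost_per_1k": 0.06}

# 1,500 prompt tokens and 500 completion tokens.
cost = request_cost(gpt4_pricing, input_tokens=1500, output_tokens=500)
print(f"{cost:.4f}")  # 0.0750
```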

Hybrid Selection

Issue: Weights validation error

Weights must sum to 1.0 (±0.01 tolerance):

hybrid:
  elo_weight: 0.3
  router_dc_weight: 0.3
  automix_weight: 0.2
  cost_weight: 0.2
  # Total: 1.0 ✓
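The same rule can be checked offline before reloading the configuration. This is a minimal sketch of the sum check with the stated ±0.01 tolerance; the function name `validate_weights` is hypothetical and the router's own validation may differ in its details.

```python
import math


def validate_weights(weights: dict[str, float], tolerance: float = 0.01) -> float:
    """Return the total if it is within tolerance of 1.0, else raise ValueError."""
    total = math.fsum(weights.values())
    if abs(total - 1.0) > tolerance:
        raise ValueError(f"weights sum to {total:.3f}, expected 1.0 ±{tolerance}")
    return total


hybrid = {
    "elo_weight": 0.3,
    "router_dc_weight": 0.3,
    "automix_weight": 0.2,
    "cost_weight": 0.2,
}
total = validate_weights(hybrid)
print(f"weights OK, total={total:.2f}")
```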

Issue: Component not contributing

Ensure the component has the data it requires:

Elo: needs feedback history
RouterDC: needs model descriptions
AutoMix: needs pricing data

Debugging Tips

Enable verbose logging

logging:
  level: debug

Check selection metrics

curl http://localhost:8080/metrics | grep selection

Key metrics:

model_selection_duration_seconds - selection latency
model_selection_total - selection counts by algorithm
model_elo_rating - current Elo ratings (if using Elo)
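For scripted checks, the metrics output can be filtered programmatically instead of with grep. The sketch below parses a sample in the Prometheus text exposition format; the label names (`algorithm`, `model`) and the sample values are assumptions for illustration, and the parser assumes lines carry no trailing timestamp.

```python
SAMPLE = """\
# HELP model_selection_total Selection counts by algorithm
# TYPE model_selection_total counter
model_selection_total{algorithm="elo"} 42
model_elo_rating{model="gpt-4"} 1532
go_goroutines 12
"""


def selection_metrics(exposition_text: str) -> dict[str, float]:
    """Collect model_selection_* and model_elo_* samples, keyed by series."""
    samples = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        if series.startswith(("model_selection_", "model_elo_")):
            samples[series] = float(value)
    return samples


metrics = selection_metrics(SAMPLE)
for series, value in metrics.items():
    print(series, value)
```

In practice the input would be the body returned by the /metrics endpoint rather than the inline sample.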

Trace individual requests

Add an X-Request-ID header to the request:

curl -H "X-Request-ID: debug-123" http://localhost:8080/v1/chat/completions ...

Then search logs:

vllm-sr logs router | grep debug-123

Getting Help

If you're still stuck:

Check GitHub Issues for similar problems
Enable debug logging and capture relevant output
Open a new issue with:
- Configuration (redact secrets)
- Steps to reproduce
- Expected vs actual behavior
- Relevant log output
