# Install with Docker Compose

This guide provides step-by-step instructions for deploying the vLLM Semantic Router with Envoy AI Gateway using Docker Compose.
## Common Prerequisites

- Docker Engine: see more in Docker Engine Installation
- Clone the repo:

  ```bash
  git clone https://github.com/vllm-project/semantic-router.git
  cd semantic-router
  ```

- Download the classification models (~1.5GB, first run only):

  ```bash
  make download-models
  ```
This downloads the classification models used by the router:
- Category classifier (ModernBERT-base)
- PII classifier (ModernBERT-base)
- Jailbreak classifier (ModernBERT-base)
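To confirm the download completed, you can check the local model directory. This is a sketch: the `models/` path is an assumption about where the Makefile stores the downloads, so adjust it to match your checkout.

```shell
# Sanity check after "make download-models".
# NOTE: "models/" is an assumed default location, not confirmed by this guide.
if [ -d models ]; then
  du -sh models   # expect roughly 1.5GB total
else
  echo "models/ not found - run 'make download-models' first"
fi
```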
## Requirements

- Docker Compose v2 (the `docker compose` command, not the legacy `docker-compose`). Install the Docker Compose plugin if missing; see more in Docker Compose Plugin Installation:

  ```bash
  # For Debian / Ubuntu
  sudo apt-get update
  sudo apt-get install -y docker-compose-plugin

  # For RHEL / CentOS / Fedora
  sudo yum update -y
  sudo yum install -y docker-compose-plugin

  # Verify
  docker compose version
  ```

- Ensure ports 8801, 50051, 19000, 3000, and 9090 are free.
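The port check above can be done in one pass. A minimal sketch using `ss` (available on most Linux hosts; substitute `lsof -i :$port` on macOS):

```shell
# Report whether each port required by the stack is already in use.
for port in 8801 50051 19000 3000 9090; do
  if ss -ltn 2>/dev/null | grep -q ":$port "; then
    echo "port $port: IN USE"
  else
    echo "port $port: free"
  fi
done
```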
## Start Services

```bash
# Core (router + envoy)
docker compose up --build

# Detached (recommended once OK)
docker compose up -d --build

# Include mock vLLM + testing profile (points router to mock endpoint)
CONFIG_FILE=/app/config/config.testing.yaml \
docker compose --profile testing up --build
```
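Once the stack is up, you can send a test request through Envoy on port 8801. This is a hedged sketch: the `/v1/chat/completions` path and the `"model": "auto"` value are assumptions about the router's OpenAI-compatible surface; check your `config.yaml` for the actual route and model names.

```shell
# Send an OpenAI-style chat completion through Envoy (assumed route).
curl -s http://localhost:8801/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "What is 2 + 2?"}]
  }'
```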
## Verify

- gRPC: `localhost:50051`
- Envoy HTTP: `http://localhost:8801`
- Envoy Admin: `http://localhost:19000`
- Prometheus: `http://localhost:9090`
- Grafana: `http://localhost:3000` (admin/admin for first login)
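The HTTP endpoints above can be probed from the shell. The health paths used here are the standard ones for each component (Envoy admin `/ready`, Prometheus `/-/healthy`, Grafana `/api/health`); the ports match the list above.

```shell
# Probe each service's health endpoint and print the HTTP status code.
# A 200 means the service is up; 000 means no connection.
for url in \
  http://localhost:19000/ready \
  http://localhost:9090/-/healthy \
  http://localhost:3000/api/health; do
  code=$(curl -s -o /dev/null -w "%{http_code}" "$url" || true)
  echo "$url -> $code"
done
```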
## Common Operations

```bash
# View service status
docker compose ps

# Follow logs for the router service
docker compose logs -f semantic-router

# Exec into the router container
docker compose exec semantic-router bash

# Recreate after config change
docker compose up -d --build

# Stop and clean up containers
docker compose down
```