Journal

Release notes, field reports, and research commentary from the vLLM Semantic Router project.

Giving AgentGateway a Semantic Brain with vLLM Semantic Router

10 min read
Aayush Saini
SDE, Data and AI @ Red Hat
Anup Sharma
AI & Distributed System @ Nutanix

vLLM Agent Architecture Workflow: Custom Semantic Routing with AgentGateway and Semantic Router

Agent systems that span multiple models — a local endpoint for coding, a frontier cloud model for deep reasoning, and a fast general-purpose model for everyday tasks — all face the same routing question: how should each request be directed to the right backend?

Many deployments start with a lightweight Python proxy or keyword matcher in front of the gateway. That approach works at small scale, but misroutes grow quickly as traffic, languages, and task types diversify. This post shows how vLLM Semantic Router, running as an Envoy ExtProc sidecar inside AgentGateway, replaces that pattern with semantic, config-driven routing.
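To make the failure mode concrete, here is a minimal sketch of the kind of keyword matcher described above. It is an illustration only: the backend names, the keyword table, and the `route_by_keyword` helper are hypothetical and do not come from AgentGateway or vLLM Semantic Router.

```python
# Hypothetical keyword-based router. Backend names and keywords are
# illustrative assumptions, not part of any real gateway configuration.
KEYWORD_ROUTES = {
    "code": "local-coding-model",
    "function": "local-coding-model",
    "prove": "frontier-reasoning-model",
    "analyze": "frontier-reasoning-model",
}
DEFAULT_BACKEND = "general-fast-model"


def route_by_keyword(prompt: str) -> str:
    """Return the backend mapped to the first keyword found in the prompt."""
    lowered = prompt.lower()
    for keyword, backend in KEYWORD_ROUTES.items():
        if keyword in lowered:
            return backend
    return DEFAULT_BACKEND


if __name__ == "__main__":
    samples = [
        # Intended coding task, routed as expected.
        "Write a function that reverses a list",
        # Not a coding task, but the substring "code" sends it to the coding model.
        "What is the dress code for the office party?",
        # A coding task in Spanish: no English keyword matches, so it falls
        # through to the default backend.
        "Escribe una función que invierta una lista",
    ]
    for prompt in samples:
        print(f"{prompt!r} -> {route_by_keyword(prompt)}")
```

The second and third prompts show the two ways substring matching tends to break: a keyword that appears in an unrelated sense, and a relevant request phrased without any listed keyword. Both become more common as languages and task types diversify.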