Built on Encoder Models
Encoder-Based Intelligence
Purpose-built encoder models extract meaning from every request: understanding intent, ranking relevance, and classifying content across modalities in real time.
Multi-Modality
Detect and route text, image and audio inputs to the right modality-capable model.
Bi-Encoder Embeddings
Independently encode queries and candidates into dense vectors for similarity search and semantic caching.
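As a rough illustration of how independently encoded vectors can back a semantic cache, here is a minimal sketch. The `SemanticCache` class, the 0.9 threshold, and the toy three-dimensional vectors are assumptions made for this example; they are not the project's actual API, and real embeddings would come from a bi-encoder model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache: returns a stored response when a query embedding
    is close enough to a previously cached embedding."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, embedding, response):
        self.entries.append((embedding, response))

    def get(self, query_embedding):
        best_score, best_response = -1.0, None
        for embedding, response in self.entries:
            score = cosine(query_embedding, embedding)
            if score > best_score:
                best_score, best_response = score, response
        # Only treat it as a hit when similarity clears the threshold.
        return best_response if best_score >= self.threshold else None


# Toy vectors stand in for bi-encoder outputs (assumption).
cache = SemanticCache(threshold=0.9)
cache.put([0.9, 0.1, 0.0], "cached answer about refunds")
hit = cache.get([0.88, 0.12, 0.01])   # near-duplicate query
miss = cache.get([0.0, 0.1, 0.95])    # unrelated query
```

Because queries and candidates are encoded separately, candidate vectors can be computed once and reused across requests; only the query needs encoding at lookup time.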
Cross-Encoder Learning
Joint cross-attention scoring of query-candidate pairs for high-precision reranking.
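The reranking step can be sketched as scoring each query-candidate pair together and keeping the best few. In this sketch `overlap_score` is a deliberately simple, hypothetical stand-in for a cross-encoder forward pass (a real model would attend jointly over both texts), and `rerank` is an illustrative helper, not a function from the project.

```python
def overlap_score(query, candidate):
    """Hypothetical stand-in for a cross-encoder: Jaccard word overlap.
    A real cross-encoder would read query and candidate jointly."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def rerank(query, candidates, score_pair=overlap_score, top_k=2):
    """Score every (query, candidate) pair and return the top_k candidates."""
    scored = [(score_pair(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]


candidates = [
    "reset your password from the login page",
    "pricing for the enterprise plan",
    "how to reset a forgotten password",
]
top = rerank("how do I reset my password", candidates, top_k=2)
```

Pairwise scoring costs one model call per candidate, which is why it is typically applied to a short list produced by a cheaper retrieval stage.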
Classification
Domain classification across 14 MMLU categories, plus jailbreak, PII and fact-check detection, via ModernBERT with LoRA.
Full Attention
Bidirectional attention across tokens and sentences: full context in both directions, not causal masking.
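The difference between the two attention patterns is easiest to see in the masks themselves. The helper names below are illustrative only; a 1 at row i, column j means token i may attend to token j.

```python
def causal_mask(n):
    """Decoder-style mask: token i sees only tokens 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Encoder-style mask: every token sees every other token."""
    return [[1] * n for _ in range(n)]


causal = causal_mask(3)
full = bidirectional_mask(3)
```

With the causal mask the first token never sees what follows it; with the bidirectional mask every position is informed by the whole input.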
2DMSE
Adjust embedding layers and dimensions at inference time to trade compute for accuracy on the fly.
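A minimal sketch of the two knobs, assuming the encoder exposes one pooled sentence vector per layer. The `embed_2d` function and the toy `layer_outputs` values are hypothetical and exist only to show the layer and dimension choices side by side.

```python
import math


def embed_2d(layer_outputs, layer, dim):
    """Pick the pooled vector from a given layer (1-indexed), keep its first
    `dim` components, and L2-normalize the result."""
    if not 1 <= layer <= len(layer_outputs):
        raise ValueError("layer out of range")
    vec = layer_outputs[layer - 1]
    if not 1 <= dim <= len(vec):
        raise ValueError("dim out of range")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else list(head)


# Toy per-layer pooled outputs for a 3-layer encoder (assumption).
layer_outputs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [3.0, 4.0, 1.0, 2.0],
]
fast = embed_2d(layer_outputs, layer=1, dim=2)      # shallow and narrow
accurate = embed_2d(layer_outputs, layer=3, dim=4)  # deep and wide
```

Stopping at an earlier layer saves forward-pass compute, while a smaller dimension saves storage and similarity-search cost; the two can be chosen independently.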
MRL
Truncate embedding vectors to any dimension without retraining, balancing accuracy and speed per request.
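Truncation itself is a small operation: keep a prefix of the vector and renormalize it so cosine similarity still behaves. The `truncate_embedding` helper below is an illustrative sketch, not the project's API, and it assumes the underlying model was trained so that vector prefixes remain meaningful.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)


full = [3.0, 4.0, 12.0]
short = truncate_embedding(full, 2)
```

A per-request policy could then pick a small `dim` for latency-sensitive traffic and the full vector where accuracy matters most.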
Architecture

Our Goals
Building system-level intelligence for Mixture-of-Models (MoM) and bringing collective intelligence into LLM systems

Where it lives
It lives between the real world and models

Meet Our Team
The amazing people behind vLLM Semantic Router
Maintainer: Huamin Chen
Distinguished Engineer @Red Hat
Maintainer: Chen Wang
Senior Staff Research Scientist @IBM
Maintainer: Yue Zhu
Staff Research Scientist @IBM
Maintainer: Xunzhuo Liu
Intelligent Routing @vLLM
Committer: Senan Zedan
R&D Manager @Red Hat
Committer: samzong
AI Infrastructure / Cloud-Native PM @DaoCloud
Liav Weiss
Software Engineer @Red Hat
Asaad Balum
Senior Software Engineer @Red Hat
Yehudit
Software Engineer @Red Hat
Noa Limoy
Software Engineer @Red Hat
Committer: JaredforReal
Software Engineer @Z.ai
Srinivas A
Software Engineer @Yokogawa
carlory
Open Source Engineer @DaoCloud
Committer: Yossi Ovadia
Senior Principal Engineer @Red Hat
Committer: Jintao Zhang
Senior Software Engineer @Kong
Committer: yuluo-yx
Individual Contributor
Committer: cryo-zd
Individual Contributor
Committer: OneZero-Y
Individual Contributor
Committer: aeft
Individual Contributor
Acknowledgements
vLLM Semantic Router is born in open source and built on open source ❤️







