Agentic Routing on AMD ROCm

2026年6月18日 · 阅读需 14 分钟

Xunzhuo Liu

Intelligent Routing @vLLM

Haichen Zhang

Sr. AI Engineer @AMD

Andy Luo

Sr. Director @AMD

Most agent systems start with a simple idea: call model: auto and let the inference layer pick the right model. That is useful, but it is not enough for long-running agents.

A coding agent can begin with architecture work, call tools, receive short tool outputs, continue with "fix that", then ask a privacy-sensitive question in the same user session. The latest message may look simple, but the route cannot be chosen from the latest message alone. The router also has to know whether this is a safe moment to switch models.

This guide shows how to deploy that pattern on AMD ROCm with vLLM Semantic Router. You will start one ROCm vLLM backend, serve the agentic routing recipe, open the dashboard, validate the OpenAI-compatible API, and use Inferoa to experience route decisions and Router Learning behavior from an agent client.

Agent session routed through router memory to model paths
Agentic routing is not only choosing a model. It is choosing when to keep one.

Journal

1 篇博文含有标签「agentic」

Agentic Routing on AMD ROCm