Redis Cluster Storage for Response API

This guide covers deploying Redis Cluster as the storage backend for the Response API. A cluster provides high availability, automatic failover, and data sharding across multiple nodes.

Note: For a simple standalone Redis setup, see the Redis Storage Guide.

What is Redis Cluster?

Redis Cluster provides:

  • Data sharding: Automatically distributes data across multiple master nodes (16384 hash slots)
  • High availability: Automatic failover if a master fails (requires replicas)
  • Horizontal scaling: Add more nodes to increase capacity
  • No single point of failure: Each master's data is copied to at least one replica

Cluster vs. standalone Redis:

  • Standalone = 1 node, simple, good for dev/small deployments
  • Cluster = 6+ nodes (3 masters + 3 replicas), production-ready
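
To make the hash-slot sharding concrete: Redis Cluster maps every key to one of the 16384 slots by taking the CRC16 of the key modulo 16384. If the key contains a non-empty {...} hash tag, only the text inside the first tag is hashed. The sketch below reimplements that rule in plain Python purely for illustration. The function names are ours, not part of the router or of any Redis client; real clients perform this calculation internally.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 variant used by Redis Cluster (polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0-16383) that a key maps to."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384
```

For example, key_slot("foo") returns 12182, the same slot that redis-cli cluster keyslot foo reports on a running cluster.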

Setup and Deployment

1. Start Redis Cluster

Step 1: Create Docker Network

docker network create redis-cluster-net

Step 2: Start 6 Redis Nodes

for port in 7001 7002 7003 7004 7005 7006; do
  docker run -d \
    --name redis-node-$port \
    --network redis-cluster-net \
    -p $port:6379 \
    redis:7-alpine \
    redis-server --cluster-enabled yes \
      --cluster-config-file nodes.conf \
      --cluster-node-timeout 5000 \
      --appendonly yes \
      --port 6379
done

What this does:

  • Starts 6 independent Redis servers
  • Enables cluster mode on each
  • Publishes each node's port 6379 on localhost ports 7001-7006

Step 3: Create the Cluster

docker run --rm --network redis-cluster-net redis:7-alpine \
  redis-cli --cluster create \
    redis-node-7001:6379 \
    redis-node-7002:6379 \
    redis-node-7003:6379 \
    redis-node-7004:6379 \
    redis-node-7005:6379 \
    redis-node-7006:6379 \
    --cluster-replicas 1 --cluster-yes

What this does:

  • Connects the 6 nodes into a cluster
  • Creates 3 masters (7001, 7002, 7003)
  • Creates 3 replicas (7004, 7005, 7006)
  • Distributes hash slots: 0-5460, 5461-10922, 10923-16383
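
The slot ranges listed above follow from dividing the 16384 slots as evenly as possible among the masters. The sketch below shows one way to compute such a split. It reproduces the three ranges for a 3-master cluster, but treat it as an illustration of the arithmetic rather than redis-cli's exact allocation code; split_slots is a hypothetical helper name.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(masters: int) -> list[tuple[int, int]]:
    """Divide slots 0..16383 into contiguous, near-equal ranges, one per master."""
    if not 1 <= masters <= TOTAL_SLOTS:
        raise ValueError("masters must be between 1 and 16384")
    per_master = TOTAL_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        # Round half up to the nearest slot boundary.
        last = math.floor(cursor + per_master - 1 + 0.5)
        # The final master always takes whatever remains.
        if i == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        last = max(last, first)
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges
```

Calling split_slots(3) yields (0, 5460), (5461, 10922), and (10923, 16383), matching the distribution created in Step 3.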

Step 4: Verify Cluster is Running

docker exec redis-node-7001 redis-cli cluster info
docker exec redis-node-7001 redis-cli cluster nodes

2. Configure Semantic Router

Option 1: Inline Configuration

Edit config/config.yaml:

response_api:
  enabled: true
  store_backend: "redis"
  ttl_seconds: 86400
  redis:
    cluster_mode: true
    cluster_addresses:
      - "127.0.0.1:7001"
      - "127.0.0.1:7002"
      - "127.0.0.1:7003"
      - "127.0.0.1:7004"
      - "127.0.0.1:7005"
      - "127.0.0.1:7006"
    db: 0 # MUST be 0 for cluster
    key_prefix: "sr:"
    pool_size: 20 # Higher for cluster
    max_retries: 5 # More retries for redirects
    dial_timeout: 10 # Longer for cluster
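
Two constraints in this configuration are easy to get wrong: Redis Cluster only supports database 0, and every entry in cluster_addresses must be a host:port pair. The following sketch checks both on an already-parsed redis section. It is a hypothetical pre-flight helper, not the router's own validation logic, and it assumes the YAML has been loaded into a plain dict by whatever parser you use.

```python
def validate_cluster_config(redis_cfg: dict) -> list[str]:
    """Return a list of problems found in the `redis` config section."""
    errors = []
    if not redis_cfg.get("cluster_mode"):
        return errors  # standalone mode: nothing cluster-specific to check

    addresses = redis_cfg.get("cluster_addresses") or []
    if not addresses:
        errors.append("cluster_addresses must list at least one node")
    for addr in addresses:
        host, sep, port = str(addr).rpartition(":")
        valid_port = port.isdigit() and 0 < int(port) < 65536
        if not sep or not host or not valid_port:
            errors.append(f"invalid address {addr!r}, expected host:port")

    # Redis Cluster has a single keyspace, so only database 0 exists.
    if redis_cfg.get("db", 0) != 0:
        errors.append("db must be 0 in cluster mode")
    return errors
```

An empty list means the section passed both checks; otherwise each string describes one problem to fix before starting the router.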

Option 2: External Config File

Edit config/config.yaml:

response_api:
  enabled: true
  store_backend: "redis"
  ttl_seconds: 86400
  redis:
    config_path: "config/response-api/redis-cluster.yaml"

Then set the cluster addresses in config/response-api/redis-cluster.yaml.

3. Run Semantic Router

make build-router
make run-router

4. Run Envoy Proxy

# Start Envoy proxy
make run-envoy

5. Verify Cluster Initialization

Check logs for cluster initialization:

tail -f /tmp/router.log | grep -i "cluster\|redis"

Expected:

RedisStore: creating cluster client (nodes=6, pool_size=20)
RedisStore: initialized successfully (cluster_mode=true, key_prefix=sr:, ttl=24h0m0s)
Response API enabled with redis backend

6. Test Response API

Note: The examples below use llm-katan (Qwen3-0.6B) as the LLM backend. Adjust the model name to match your vLLM configuration.

Test 1: Create Response

curl -X POST http://localhost:8801/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "input": "What is Redis Cluster?",
    "instructions": "You are a database expert.",
    "store": true
  }'

Response:

{
  "id": "resp_bb63817af32280b4a3a8fb7f",
  "object": "response",
  "status": "completed",
  "model": "qwen3",
  ...
}

Test 2: Verify Data Distribution

Check which node stores the data:

for port in 7001 7002 7003 7004 7005 7006; do
  echo "=== Node $port ==="
  # KEYS scans the whole keyspace: fine on a test cluster, avoid it in production
  docker exec redis-node-$port redis-cli KEYS "sr:*"
done

Example output:

=== Node 7001 ===
sr:response:resp_bb63817af32280b4a3a8fb7f
=== Node 7002 ===

=== Node 7003 ===

=== Node 7004 ===

=== Node 7005 ===
sr:response:resp_bb63817af32280b4a3a8fb7f # Replica of 7001
=== Node 7006 ===

This shows:

  • Master 7001 holds the key (its hash slot is assigned to 7001)
  • Replica 7005 holds a copy (it replicates 7001)
  • The other nodes are empty (they serve different hash slots)

Test 3: Retrieve Response

curl -X GET http://localhost:8801/v1/responses/resp_bb63817af32280b4a3a8fb7f

The client automatically:

  • Calculates hash slot for the key
  • Routes request to correct node (7001)
  • Handles MOVED redirects if needed
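
When a client sends a command to a node that does not own the key's slot, the node answers with an error of the form MOVED <slot> <host>:<port> (or ASK during a slot migration), and the client retries against the named node. The sketch below shows how such a reply could be parsed and used to update a local slot table. It is a simplified illustration; Redirect, parse_redirect, and apply_redirect are hypothetical names, and production clients handle this for you.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str   # "MOVED" or "ASK"
    slot: int
    host: str
    port: int


def parse_redirect(error_message: str) -> Optional[Redirect]:
    """Parse 'MOVED 3999 127.0.0.1:6381' style errors; return None otherwise."""
    parts = error_message.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, sep, port = parts[2].rpartition(":")
    if not sep or not parts[1].isdigit() or not port.isdigit():
        return None
    return Redirect(parts[0], int(parts[1]), host, int(port))


def apply_redirect(slot_owner: dict, redirect: Redirect) -> None:
    """Remember the new owner for MOVED; ASK is a one-off and is not cached."""
    if redirect.kind == "MOVED":
        slot_owner[redirect.slot] = (redirect.host, redirect.port)
```

MOVED signals a permanent ownership change, so the slot table is updated; ASK applies only to the next request, which is why the sketch leaves the table untouched in that case.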

Test 4: Conversation Chaining

curl -X POST http://localhost:8801/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "input": "Tell me more about sharding",
    "previous_response_id": "resp_bb63817af32280b4a3a8fb7f",
    "store": true
  }'

Response:

{
  "id": "resp_a4ae205a80ae7bf10edecaa3",
  "previous_response_id": "resp_bb63817af32280b4a3a8fb7f",
  "status": "completed",
  ...
}
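
Chaining works by following previous_response_id links from the newest response back to the first one. Because each response is stored under its own key, successive lookups may land on different cluster nodes. The sketch below illustrates the traversal with an in-memory dict standing in for the Redis store; conversation_chain is a hypothetical helper and not the router's implementation.

```python
def conversation_chain(responses: dict, response_id: str) -> list:
    """Return response IDs from oldest to newest, ending at response_id."""
    chain = []
    seen = set()
    current = response_id
    # Stop at the first response, at a missing (expired or deleted) link,
    # or if a link points back to an ID already visited.
    while current is not None and current in responses and current not in seen:
        seen.add(current)
        chain.append(current)
        current = responses[current].get("previous_response_id")
    chain.reverse()
    return chain
```

With the two responses from Tests 1 and 4 in the dict, conversation_chain(store, "resp_a4ae205a80ae7bf10edecaa3") returns the first response ID followed by the second.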

Test 5: Delete Response

curl -X DELETE http://localhost:8801/v1/responses/resp_bb63817af32280b4a3a8fb7f

Deletion works across the cluster:

  • Client finds correct node
  • Deletes from master
  • Replica syncs automatically

Cluster Monitoring

Check Cluster Health

docker exec redis-node-7001 redis-cli cluster info

Key metrics:

  • cluster_state:ok - Cluster is healthy
  • cluster_slots_assigned:16384 - All slots assigned
  • cluster_known_nodes:6 - All nodes discovered
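
These three fields can be checked automatically, for example in a readiness script. The sketch below parses the key:value text printed by cluster info and applies the criteria listed above. The function names and the default of 6 expected nodes are assumptions tied to this guide's setup.

```python
def parse_cluster_info(text: str) -> dict[str, str]:
    """Turn the 'key:value' lines of `cluster info` into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        info[key] = value
    return info


def is_cluster_healthy(info: dict[str, str], expected_nodes: int = 6) -> bool:
    """True when the state is ok, all 16384 slots are assigned, and all nodes are known."""
    return (
        info.get("cluster_state") == "ok"
        and info.get("cluster_slots_assigned") == "16384"
        and info.get("cluster_known_nodes") == str(expected_nodes)
    )
```

To use it, capture the output of docker exec redis-node-7001 redis-cli cluster info (for instance with subprocess.run and capture_output=True, text=True) and pass the resulting string to parse_cluster_info.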

View Node Roles

docker exec redis-node-7001 redis-cli cluster nodes

Output shows:

  • Master nodes with hash slot ranges
  • Replica nodes and which master each one backs up
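
Each line of cluster nodes output has space-separated fields: node ID, address, flags, master ID (or - for a master), several timing and state fields, and then any slot ranges. The sketch below extracts roles and the master-to-replica mapping from that text. The helper names are hypothetical, and only the fields needed here are parsed.

```python
def parse_cluster_nodes(text: str) -> list[dict]:
    """Parse `cluster nodes` output into one dict per node."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # skip blank or malformed lines
        flags = parts[2].split(",")
        nodes.append({
            "id": parts[0],
            "address": parts[1].split("@")[0],  # drop the cluster bus port
            "role": "master" if "master" in flags else "replica",
            "master_id": None if parts[3] == "-" else parts[3],
            "slots": parts[8:],                 # e.g. ["0-5460"]; empty for replicas
        })
    return nodes


def replicas_by_master(nodes: list[dict]) -> dict[str, list[str]]:
    """Map each master's address to the addresses of its replicas."""
    by_id = {node["id"]: node for node in nodes}
    result = {node["address"]: [] for node in nodes if node["role"] == "master"}
    for node in nodes:
        master = by_id.get(node["master_id"])
        if node["role"] == "replica" and master is not None:
            result.setdefault(master["address"], []).append(node["address"])
    return result
```

A master with an empty replica list in the result has no backup, so a failure of that node would leave its slots unavailable.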

Monitor Keys per Node

for port in 7001 7002 7003; do
  count=$(docker exec redis-node-$port redis-cli DBSIZE)
  echo "Master $port: $count keys"
done

Cleanup

Stop and Remove All Nodes

for port in 7001 7002 7003 7004 7005 7006; do
  docker stop redis-node-$port
  docker rm redis-node-$port
done

docker network rm redis-cluster-net

Reference

  • Redis Storage (Standalone) - Simple standalone setup
  • Configuration: config/response-api/redis-cluster.yaml
  • Integration tests: pkg/responsestore/redis_store_integration_test.go