Skip to main content
Documentation

Semantic Cache

Overview

Version: Latest

Semantic Cache

Overview

semantic-cache is a route-local plugin for reusing semantically similar prior responses.

It aligns to config/plugin/semantic-cache/high-recall.yaml and config/plugin/semantic-cache/memory.yaml.

Key Advantages

  • Reuses prior responses only on routes that benefit from cache hits.
  • Keeps route-local thresholds separate from global store setup.
  • Supports different cache policies for different routes.

What Problem Does It Solve?

Some routes benefit strongly from reuse, while others need fresh generation every time. semantic-cache keeps the reuse policy local to the route instead of making cache behavior global by default.

When to Use

  • one route should prefer cached responses when queries are very similar
  • different routes need different similarity thresholds or TTLs
  • the route should use a shared semantic cache backend configured in global.stores.semantic_cache

Configuration

Use this fragment under routing.decisions[].plugins:

plugin:
type: semantic-cache
configuration:
enabled: true
similarity_threshold: 0.92
ttl_seconds: 86400