Octo Router logoOctoRouter

Caching & Redis

Configure Redis for high-performance caching and state management.

Octo Router leverages Redis to provide a consistent, shared state across multiple instances, ensuring that budgets and rate limits are enforced globally.

Role of Redis

Redis currently serves as the State Management engine for the router. In a distributed environment, it ensures that:

  • Budgets are tracked across all instances, preventing overspending.
  • Rate Limits (Requests per minute/day) are synchronized cluster-wide.

[!TIP] If you are running a single instance of Octo Router, Redis is optional. The router will fall back to in-memory tracking automatically.

Persistence Configuration

To enable Redis, configure the redis section in your config.yaml:

redis:
  addr: "localhost:6379"  # Or "redis:6379" if using Docker Compose
  password: ""
  db: 0

State Sharing

Once configured, all Octo Router instances connected to the same Redis database will sync their usage metrics in real-time. This is handled at the Bouncer Layer (the pipeline), where filters query Redis before allowing a request to proceed.

Response Caching (Planned)

We are currently working on a high-performance response caching layer. Once active, it will support:

  • Exact Match Caching: Hashing requests to store and retrieve identical prompts.
  • Semantic Caching: Using vector similarity to serve cached responses for prompts with the same "meaning".
  • Advanced Rules: Fine-grained control over what gets cached based on model, user, or response size.

Current Configuration (Skeleton)

The following fields are reserved for future caching updates:

cache:
  enabled: false  # Not yet integrated into completion logic
  ttl: 3600

On this page