Caching & Redis
Configure Redis for high-performance caching and state management.
Octo Router leverages Redis to provide a consistent, shared state across multiple instances, ensuring that budgets and rate limits are enforced globally.
Role of Redis
Redis currently serves as the State Management engine for the router. In a distributed environment, it ensures that:
- Budgets are tracked across all instances, preventing overspending.
- Rate Limits (Requests per minute/day) are synchronized cluster-wide.
[!TIP] If you are running a single instance of Octo Router, Redis is optional. The router will fall back to in-memory tracking automatically.
Persistence Configuration
To enable Redis, configure the redis section in your config.yaml:
redis:
addr: "localhost:6379" # Or "redis:6379" if using Docker Compose
password: ""
db: 0State Sharing
Once configured, all Octo Router instances connected to the same Redis database will sync their usage metrics in real-time. This is handled at the Bouncer Layer (the pipeline), where filters query Redis before allowing a request to proceed.
Response Caching (Planned)
We are currently working on a high-performance response caching layer. Once active, it will support:
- Exact Match Caching: Hashing requests to store and retrieve identical prompts.
- Semantic Caching: Using vector similarity to serve cached responses for prompts with the same "meaning".
- Advanced Rules: Fine-grained control over what gets cached based on model, user, or response size.
Current Configuration (Skeleton)
The following fields are reserved for future caching updates:
cache:
enabled: false # Not yet integrated into completion logic
ttl: 3600