Cost Management
Learn how to control costs and manage budgets with Octo Router.
Octo Router provides a transparent budgeting and usage tracking system to prevent unexpected costs and ensure predictable LLM spending.
Provider Budgets
Budgets are currently enforced at the provider level. When a provider's cumulative cost exceeds its configured budget, Octo Router's "Bouncer" layer (the pipeline) skips that provider and moves on to the next candidate in the fallback chain.
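The skip-and-fall-back behavior described above can be pictured with a small model. This is only an illustrative sketch, not Octo Router's implementation: the `BudgetGate` class, its method names, and the choice to treat providers without a budget as unlimited are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BudgetGate:
    """Hypothetical model of per-provider budget checks (not Octo Router's code)."""

    budgets: dict[str, float]  # provider name -> budget in USD
    spent: dict[str, float] = field(default_factory=dict)

    def record(self, provider: str, cost_usd: float) -> None:
        # Accumulate cost; the running total only returns to zero on a reset.
        self.spent[provider] = self.spent.get(provider, 0.0) + cost_usd

    def over_budget(self, provider: str) -> bool:
        limit = self.budgets.get(provider)
        if limit is None:
            return False  # assumption: providers without a budget are unlimited
        return self.spent.get(provider, 0.0) > limit

    def pick(self, fallback_chain: list[str]) -> str | None:
        # Return the first provider in the chain that is still within budget.
        for provider in fallback_chain:
            if not self.over_budget(provider):
                return provider
        return None  # every candidate has exhausted its budget


gate = BudgetGate(budgets={"openai": 10.00, "anthropic": 25.00})
chain = ["openai", "anthropic"]
gate.record("openai", 10.50)  # OpenAI is now past its $10 budget
print(gate.pick(chain))  # anthropic
```

In this model an exhausted provider is simply passed over, so a request fails only when every provider in the chain is over budget.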
Configuration
Define budgets in the limits.providers section of your config.yaml:
limits:
  providers:
    openai:
      budget: 10.00 # Stop OpenAI requests after $10 cumulative cost
    anthropic:
      budget: 25.00
    gemini:
      budget: 50.00
[!NOTE] Budgets are cumulative since the last reset. By default, usage is tracked in memory, so the counters start from zero whenever the process restarts. With Redis, usage persists across restarts.
Storage Backends
Octo Router supports two ways to track and store usage data:
| Backend | Scope | Persistence | Recommended for |
|---|---|---|---|
| In-Memory | Single Instance | Lost on restart | Local development, single-binary usage. |
| Redis | Multi-Instance | Persistent | Production, high-availability deployments. |
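The "Scope" column matters once more than one router instance is running. The toy example below is a sketch under assumptions, not Octo Router's storage code: `InMemoryUsage` is a hypothetical counter, and a single shared object stands in for the role Redis plays. It shows why per-instance counters can undercount total spend.

```python
class InMemoryUsage:
    """Hypothetical per-process counter, standing in for the In-Memory backend."""

    def __init__(self) -> None:
        self._spent: dict[str, float] = {}

    def add(self, provider: str, cost_usd: float) -> float:
        # Add the cost of one request and return the new running total.
        self._spent[provider] = self._spent.get(provider, 0.0) + cost_usd
        return self._spent[provider]

    def total(self, provider: str) -> float:
        return self._spent.get(provider, 0.0)


# Two router instances, each with its own in-memory counter.
instance_a, instance_b = InMemoryUsage(), InMemoryUsage()
instance_a.add("openai", 6.0)
instance_b.add("openai", 6.0)
# Each instance sees only its own 6.0, although 12.0 was spent in total.
separate_view = (instance_a.total("openai"), instance_b.total("openai"))

# One shared counter (the role Redis plays) sees the combined spend.
shared = InMemoryUsage()
shared.add("openai", 6.0)  # request served by instance A
shared.add("openai", 6.0)  # request served by instance B
shared_view = shared.total("openai")

print(separate_view, shared_view)  # (6.0, 6.0) 12.0
```

With a $10 budget, neither isolated instance would consider OpenAI over budget here, while the shared counter would.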
To use Redis, configure the redis section:
redis:
  addr: "localhost:6379"
  db: 0
Tracking Costs
Every successful API response includes a usage block, which uses the same field names as the OpenAI usage object (prompt_tokens, completion_tokens, total_tokens), and a total cost_usd field.
{
  "message": "Hello! I can help with that.",
  "role": "assistant",
  "provider": "openai",
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 350,
    "total_tokens": 500
  },
  "cost_usd": 0.00750
}
Response Headers
Octo Router also exposes the cost of the request in the X-Request-Cost HTTP header for easy monitoring without parsing the JSON body.
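A client can read the cost from either place. The helper functions below are a minimal sketch with hypothetical names; they assume the X-Request-Cost header carries a plain decimal USD amount (the format is not specified above) and that the body has the shape shown in the sample response.

```python
import json


def request_cost(body: str, headers: dict[str, str]) -> float:
    """Return the request cost in USD from the header, else from the JSON body."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-request-cost")
    if raw is not None:
        # Assumption: the header holds a plain decimal number such as "0.0075".
        return float(raw)
    return float(json.loads(body)["cost_usd"])


def cost_per_1k_tokens(body: str) -> float:
    """Derive an effective USD price per 1,000 tokens from one response body."""
    payload = json.loads(body)
    return payload["cost_usd"] / payload["usage"]["total_tokens"] * 1000


sample_body = json.dumps({
    "usage": {"prompt_tokens": 150, "completion_tokens": 350, "total_tokens": 500},
    "cost_usd": 0.0075,
})
print(request_cost(sample_body, {"X-Request-Cost": "0.0075"}))
print(cost_per_1k_tokens(sample_body))
```

Reading the header first avoids parsing the body when only the cost is needed. The fallback keeps the helper usable if a proxy strips custom headers.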
Resetting Budgets
If you are using the Admin API, you can reset usage for a specific provider to resume traffic:
curl -X POST "http://localhost:8080/admin/budgets/reset?provider=openai" \
  -H "Authorization: Bearer <your-admin-key>"
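For scripted resets, the same request can be built with Python's standard library. This is a sketch that mirrors the curl command above; the `build_reset_request` helper is a hypothetical name, and the URL, query parameter, and header are taken directly from that command.

```python
import urllib.parse
import urllib.request


def build_reset_request(base_url: str, provider: str, admin_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    query = urllib.parse.urlencode({"provider": provider})
    return urllib.request.Request(
        url=f"{base_url}/admin/budgets/reset?{query}",
        method="POST",
        headers={"Authorization": f"Bearer {admin_key}"},
    )


req = build_reset_request("http://localhost:8080", "openai", "<your-admin-key>")
print(req.get_method(), req.full_url)
# To send it against a running router: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the call easy to inspect or log before it touches a live router.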