
Cost Management

Learn how to control costs and manage budgets with Octo Router.

Octo Router provides a transparent budgeting and usage tracking system to prevent unexpected costs and ensure predictable LLM spending.

Provider Budgets

Budgets are currently enforced at the provider level. When a provider exceeds its defined limit, Octo Router's "Bouncer" layer (the pipeline) automatically skips that provider and moves to the next candidate in the fallback chain.
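The skip-and-fall-back behavior can be pictured with a minimal Python sketch. It is an illustration only, not Octo Router's actual implementation: the function name `pick_provider`, the dictionaries of spend and budgets, and the choice to skip a provider as soon as its spend reaches the limit are all assumptions made for the example.

```python
from typing import Optional


def pick_provider(
    chain: list[str],
    spent: dict[str, float],
    budgets: dict[str, float],
) -> Optional[str]:
    """Return the first provider in the fallback chain that is still under budget."""
    for name in chain:
        limit = budgets.get(name)  # None: no budget configured (assumed to mean unlimited)
        if limit is not None and spent.get(name, 0.0) >= limit:
            continue  # budget used up: skip and try the next candidate
        return name
    return None  # every candidate in the chain is over budget


# Hypothetical fallback chain and spend figures.
chain = ["openai", "anthropic", "gemini"]
budgets = {"openai": 10.00, "anthropic": 25.00, "gemini": 50.00}
spent = {"openai": 10.42, "anthropic": 3.10}

print(pick_provider(chain, spent, budgets))  # anthropic
```

Here `openai` has spent more than its $10 budget, so the request goes to `anthropic`, the next candidate in the chain.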

Configuration

Define budgets in the limits.providers section of your config.yaml:

limits:
  providers:
    openai:
      budget: 10.00  # Stop OpenAI requests after $10 cumulative cost
    anthropic:
      budget: 25.00
    gemini:
      budget: 50.00

[!NOTE] Budgets are cumulative since the last reset. By default, usage is tracked in memory, so the running totals start over when the process restarts. With Redis, usage persists across restarts.

Storage Backends

Octo Router supports two ways to track and store usage data:

| Backend | Scope | Persistence | Recommended for |
| --- | --- | --- | --- |
| In-Memory | Single instance | Lost on reset | Local development, single-binary usage |
| Redis | Multi-instance | Persistent | Production, high-availability deployments |

To use Redis, configure the redis section:

redis:
  addr: "localhost:6379"
  db: 0

Tracking Costs

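Every successful API response includes a usage block and a total cost_usd field. The usage block uses the same field names as the OpenAI API's usage object (prompt_tokens, completion_tokens, total_tokens); cost_usd is added by Octo Router.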

{
  "message": "Hello! I can help with that.",
  "role": "assistant",
  "provider": "openai",
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 350,
    "total_tokens": 500
  },
  "cost_usd": 0.00750
}
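If you want a client-side running total per provider, the `provider` and `cost_usd` fields are enough. The sketch below is one way to do it; the helper name `add_cost` and the `totals` dictionary are hypothetical, and `Decimal` is used only to avoid floating-point drift when summing many small amounts.

```python
import json
from collections import defaultdict
from decimal import Decimal


def add_cost(totals: dict, response_body: str) -> None:
    """Add one response's cost_usd to the running total for its provider."""
    # parse_float=Decimal keeps the cost exactly as written in the JSON text.
    data = json.loads(response_body, parse_float=Decimal)
    totals[data["provider"]] += Decimal(data.get("cost_usd", 0))


totals = defaultdict(Decimal)  # missing providers start at Decimal("0")

body = """{
  "message": "Hello! I can help with that.",
  "role": "assistant",
  "provider": "openai",
  "usage": {"prompt_tokens": 150, "completion_tokens": 350, "total_tokens": 500},
  "cost_usd": 0.00750
}"""

add_cost(totals, body)
add_cost(totals, body)
print(dict(totals))
```

This total is local to the client and independent of the usage Octo Router tracks for budget enforcement.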

Response Headers

Octo Router also exposes the cost of the request in the X-Request-Cost HTTP header for easy monitoring without parsing the JSON body.
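A monitoring script can read the header without touching the body. The following sketch assumes the header value is a plain decimal number of US dollars (for example `0.00750`); the exact format is not specified here, so the helper, which is hypothetical, returns `None` for anything it cannot parse.

```python
from decimal import Decimal, InvalidOperation
from typing import Mapping, Optional


def request_cost(headers: Mapping[str, str]) -> Optional[Decimal]:
    """Return the X-Request-Cost value as a Decimal, or None if absent or unparsable."""
    # HTTP header names are case-insensitive, so normalize before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-request-cost")
    if raw is None:
        return None
    try:
        return Decimal(raw.strip())
    except InvalidOperation:
        return None  # value was not a plain decimal number


print(request_cost({"X-Request-Cost": "0.00750"}))
```

The function accepts any mapping of header names to values, so it works with the header objects returned by most HTTP clients.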

Resetting Budgets

If you are using the Admin API, you can reset usage for a specific provider to resume traffic:

curl -X POST "http://localhost:8080/admin/budgets/reset?provider=openai" \
     -H "Authorization: Bearer <your-admin-key>"
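For scripted resets, the same call can be built with Python's standard library. This is a sketch mirroring the curl command above; the helper name `build_reset_request` is hypothetical, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_reset_request(base_url: str, provider: str, admin_key: str) -> Request:
    """Build the POST request that resets usage for one provider."""
    query = urlencode({"provider": provider})  # escapes the provider name safely
    return Request(
        f"{base_url.rstrip('/')}/admin/budgets/reset?{query}",
        method="POST",
        headers={"Authorization": f"Bearer {admin_key}"},
    )


req = build_reset_request("http://localhost:8080", "openai", "<your-admin-key>")
print(req.get_method(), req.full_url)
# To send it against a running router: urllib.request.urlopen(req)
```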
