Configuration
A detailed guide to configuring Octo Router using YAML.
Octo Router is configured via a config.yaml file located in the root directory. This section provides an overview of the config.yaml file and its fields.
Providers
Define your LLM providers and their API keys. You can reference environment variables with the ${VAR} syntax.
providers:
  - name: openai
    apiKey: ${OPENAI_API_KEY}
    enabled: true
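To make the ${VAR} syntax concrete, here is a minimal Python sketch of how such placeholders could be expanded before the YAML is parsed. This is not Octo Router's own loader: the function name expand_env, the accepted variable-name pattern, and the choice to raise an error for unset variables are all assumptions made for illustration.

```python
import os
import re

# Matches ${VAR} placeholders. The allowed name pattern is an assumption,
# not something taken from Octo Router.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text, env=None):
    """Replace each ${VAR} in text with its value from env (default: os.environ)."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            # Failing loudly on a missing variable is a design choice of this sketch.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, text)
```

For example, expand_env("apiKey: ${OPENAI_API_KEY}", {"OPENAI_API_KEY": "sk-test"}) returns "apiKey: sk-test". Keeping keys in the environment rather than in config.yaml means the file can be committed without exposing secrets.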
Models
The models section defines the default model for each provider set up in the providers section.
models:
  defaults:
    openai:
      model: "openai/gpt-4o-mini"
      maxTokens: 4096
    anthropic:
      model: "anthropic/claude-haiku-3"
      maxTokens: 4096
    gemini:
      model: "gemini/gemini-2.5-flash-lite"
      maxTokens: 4096
If these defaults are not provided, Octo Router selects a default model automatically. Default models are needed for certain routing strategies such as round-robin, and especially when semantic routing is disabled.
Routing Strategies
Choose how requests are distributed when multiple providers are available.
weighted: Distribute traffic based on defined weights.
cost-based: Route to the cheapest model that meets your requirements.
latency-based: Route to the fastest-responding provider.
round-robin: Distribute requests equally across providers.
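As a rough illustration of what the weighted strategy means in practice, the Python sketch below picks a provider with probability proportional to its weight. It is a generic weighted random selection, not Octo Router's actual implementation; pick_provider and the example weights (70 for openai, 30 for anthropic) are hypothetical.

```python
import random


def pick_provider(weights, rng=random):
    """Pick a provider name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical example weights: roughly 70% of requests to openai, 30% to anthropic.
weights = {"openai": 70, "anthropic": 30}

# Simulate 10,000 requests with a seeded generator and count where they land.
rng = random.Random(0)
counts = {name: 0 for name in weights}
for _ in range(10_000):
    counts[pick_provider(weights, rng)] += 1
```

Over many requests the counts settle close to the 70/30 split, although any short run of requests can deviate from it.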
routing:
  strategy: "weighted"
  weights:
    openai: 70
    anthropic: 30
Semantic Routing
Semantic routing lets you route requests based on user intent. In embedding mode, it uses a local ONNX model to classify prompts into "intent groups".
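The following Python sketch shows one common way embedding-based classification with a threshold can work: compare the prompt's embedding to one reference vector per group and accept the best match only if its similarity reaches the threshold. Whether Octo Router uses cosine similarity, or one vector per group, is an assumption; the function names, the "chat" group, and the toy three-dimensional vectors are all hypothetical stand-ins for real model output.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(prompt_vec, group_vecs, threshold):
    """Return the best-matching group name, or None if no score reaches threshold."""
    best_name, best_score = None, threshold
    for name, vec in group_vecs.items():
        score = cosine(prompt_vec, vec)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy vectors standing in for embeddings produced by a real model.
groups = {"coding": [0.9, 0.1, 0.0], "chat": [0.0, 0.2, 0.9]}
```

Under these assumptions, a prompt that matches no group well returns None, which is the case where a router would fall back to its non-semantic strategy. A higher threshold makes matches stricter; a lower one sends more prompts to some group.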
routing:
  policies:
    semantic:
      enabled: true
      engine: "embedding"
      threshold: 0.45
      model_path: "assets/models/embedding.onnx"
      groups:
        - name: "coding"
          required_capability: "code-gen"
          allow_providers: ["openai"]
Limits
Manage costs by setting daily budgets per provider or globally.
limits:
  dailyBudget: 50.00
  providers:
    openai:
      budget: 10.00
Resilience
Manage timeouts, retries, and circuit breakers.
resilience:
  timeout: 30000 # 30 second timeout
  retries:
    maxAttempts: 3
    initialDelay: 1000 # 1 second
    maxDelay: 10000 # 10 seconds
    backoffMultiplier: 2 # Exponential backoff
  circuitBreaker:
    failureThreshold: 5 # Open after 5 failures
    resetTimeout: 60000
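To show how the retry settings interact, the Python sketch below computes the wait before each retry under a standard capped exponential backoff. It assumes that maxAttempts counts the initial attempt and that each delay is the previous one multiplied by backoffMultiplier, capped at maxDelay. Octo Router's exact schedule (for example, whether it adds jitter) is not specified here, and backoff_delays is a hypothetical helper.

```python
def backoff_delays(max_attempts, initial_delay_ms, max_delay_ms, multiplier):
    """Delays (ms) to wait before each retry, assuming the first attempt is immediate."""
    delays = []
    delay = initial_delay_ms
    for _ in range(max_attempts - 1):  # one wait between consecutive attempts
        delays.append(min(delay, max_delay_ms))  # never exceed the configured cap
        delay *= multiplier
    return delays


print(backoff_delays(3, 1000, 10000, 2))  # [1000, 2000]
print(backoff_delays(6, 1000, 10000, 2))  # [1000, 2000, 4000, 8000, 10000]
```

With the values above (3 attempts, 1 second initial delay, multiplier 2), the cap is never reached. It only comes into play with more attempts, as the second call shows.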