Cost-based Routing

Optimize for savings by automatically selecting the cheapest model across providers.

Cost-based routing is a model-centric strategy. Unlike Weighted or Round-Robin, which select a provider, this strategy scans the entire catalog under your models: key and picks the specific model with the lowest per-token cost that satisfies your tier constraints.

Configuration

To enable cost-based routing, set your strategy to cost-based and define your costOptions.

routing:
  strategy: "cost-based"

  costOptions:
    defaultTier: "premium"       # Target tier for model selection
    minimumTier: "budget"        # Never go below this tier
    tierStrategy: "same-tier"    # "same-tier", "allow-downgrade", or "cheapest"

Cost Options

| Option | Description |
| --- | --- |
| defaultTier | The target tier (budget, standard, premium, ultra-premium). |
| minimumTier | A safety floor; Octo Router will never route to a model below this quality level. |
| tierStrategy | same-tier: only models in the default tier. allow-downgrade: also consider cheaper models in lower tiers, down to minimumTier. cheapest: ignore tiers, pick absolute lowest cost. |
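
To make the three tierStrategy values concrete, here is a minimal Python sketch of how they could translate into a set of eligible tiers. It is an illustration, not Octo Router's implementation: the helper name allowed_tiers, the low-to-high ordering of the tier names, and the validation that minimumTier cannot sit above defaultTier are all assumptions. The table does not say whether minimumTier still applies under cheapest; the sketch follows the table literally and ignores tiers in that mode.

```python
from typing import List

# Tier names are taken from the option table; their ordering from lowest to
# highest quality is an assumption of this sketch.
TIER_ORDER = ["budget", "standard", "premium", "ultra-premium"]


def allowed_tiers(default_tier: str, minimum_tier: str, tier_strategy: str) -> List[str]:
    """Return the tiers a candidate model may belong to (hypothetical helper)."""
    default_idx = TIER_ORDER.index(default_tier)  # ValueError on an unknown tier
    minimum_idx = TIER_ORDER.index(minimum_tier)
    if minimum_idx > default_idx:
        # Assumed validation: a floor above the target tier cannot be satisfied.
        raise ValueError("minimumTier must not be higher than defaultTier")

    if tier_strategy == "same-tier":
        # Only models in the default tier.
        return [default_tier]
    if tier_strategy == "allow-downgrade":
        # The default tier plus every lower tier down to the floor.
        return TIER_ORDER[minimum_idx:default_idx + 1]
    if tier_strategy == "cheapest":
        # Tiers are ignored, so every tier is eligible.
        return list(TIER_ORDER)
    raise ValueError(f"unknown tierStrategy: {tier_strategy!r}")


print(allowed_tiers("premium", "budget", "same-tier"))        # ['premium']
print(allowed_tiers("premium", "budget", "allow-downgrade"))  # ['budget', 'standard', 'premium']
```

With the configuration shown earlier (defaultTier premium, tierStrategy same-tier), only premium models would be eligible under this reading.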

Model Selection Logic

  1. Catalog Scan: The router looks up every model defined in your catalog.
  2. Token Estimation: It estimates the token count of your current prompt using each provider's own tokenizer.
  3. Price Comparison: It calculates the USD cost for the estimated tokens and selects the cheapest model that matches your tier constraints.
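
The three steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the router's actual code: the Model class, its field names, the per-1,000-token price unit, the toy tokenizers, and all model names and prices are hypothetical, and only input tokens are priced because the output length is unknown before the call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Model:
    """Hypothetical catalog entry; field names are assumptions."""
    name: str
    provider: str
    tier: str
    usd_per_1k_input_tokens: float
    count_tokens: Callable[[str], int]  # stand-in for the provider's tokenizer


def select_cheapest(catalog: Iterable[Model], prompt: str, tiers: List[str]) -> Optional[Model]:
    """Scan the catalog, estimate tokens per model, and return the cheapest match."""
    best: Optional[Model] = None
    best_cost = float("inf")
    for model in catalog:                      # 1. catalog scan
        if model.tier not in tiers:
            continue                           # tier constraint not met
        tokens = model.count_tokens(prompt)    # 2. token estimation
        cost = tokens / 1000 * model.usd_per_1k_input_tokens  # 3. USD cost
        if cost < best_cost:
            best, best_cost = model, cost
    return best


# Toy tokenizers: one token per word versus one token per character.
catalog = [
    Model("model-a", "provider-x", "standard", 0.002, lambda text: len(text.split())),
    Model("model-b", "provider-y", "standard", 0.001, lambda text: len(text)),
]

choice = select_cheapest(catalog, "summarize this report", ["standard"])
print(choice.name if choice else None)  # model-a
```

In this toy data, model-b has the lower per-token price but its tokenizer produces more tokens for the same prompt, so model-a ends up cheaper for the request. That is why the estimate is made per model rather than by comparing list prices alone.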

Interaction with Semantic Policies

Cost-based routing composes with Semantic Policies: the policy narrows the set of candidate models first, and cost selection then runs on what remains.

  1. Filtering: If an intent matches (e.g., "coding"), the policy filters the catalog to only include models from allow_providers or models with the required_capability.
  2. Cost Selection: The router then picks the cheapest remaining model from that filtered list.
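
The filter-then-select order can be sketched in Python like this. It is a minimal illustration, not Octo Router's API: the Candidate and Policy classes, the route function, and the sample models and costs are hypothetical. The sketch applies whichever constraints the policy sets; requiring both when both are set is an assumption, since the description above does not specify how they combine.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical catalog entry with a precomputed cost estimate."""
    name: str
    provider: str
    capabilities: FrozenSet[str]
    estimated_cost_usd: float


@dataclass(frozen=True)
class Policy:
    """Hypothetical semantic policy; None means the constraint is unset."""
    allow_providers: Optional[FrozenSet[str]] = None
    required_capability: Optional[str] = None


def apply_policy(catalog: List[Candidate], policy: Policy) -> List[Candidate]:
    """Step 1: keep only models that satisfy the policy's constraints."""
    kept = []
    for c in catalog:
        if policy.allow_providers is not None and c.provider not in policy.allow_providers:
            continue
        if policy.required_capability is not None and policy.required_capability not in c.capabilities:
            continue
        kept.append(c)
    return kept


def route(catalog: List[Candidate], policy: Optional[Policy]) -> Optional[Candidate]:
    """Step 2: pick the cheapest model from the (possibly filtered) list."""
    candidates = apply_policy(catalog, policy) if policy is not None else list(catalog)
    return min(candidates, key=lambda c: c.estimated_cost_usd, default=None)


catalog = [
    Candidate("chat-small", "provider-x", frozenset({"chat"}), 0.0001),
    Candidate("code-small", "provider-y", frozenset({"chat", "code"}), 0.0004),
    Candidate("code-large", "provider-x", frozenset({"chat", "code"}), 0.0030),
]

coding_policy = Policy(required_capability="code")
print(route(catalog, coding_policy).name)  # code-small
print(route(catalog, None).name)           # chat-small
```

The cheapest model overall (chat-small) is excluded once the coding policy matches, so the router falls back to the cheapest model that has the required capability.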

Summary: Cost-based vs Weighted

| Feature | Cost-Based Strategy | Weighted Strategy |
| --- | --- | --- |
| Primary Unit | Model | Provider |
| Logic | Minimum cost (USD) | Traffic distribution (%) |
| Selection | Picks best in catalog | Uses provider's default |
| Best For | Cost optimization | Load balancing & testing |
