Latency-based Routing

Route requests to the fastest provider based on real-time performance tracking.

Latency-based routing is a provider-centric strategy that automatically selects the fastest provider in your network. It maintains a rolling average of response times so that each request goes to the provider that is currently responding fastest.

Configuration

To enable latency-based routing, set your strategy to latency-based.

routing:
  strategy: "latency-based"

How it Works

The router uses an internal scoring system to evaluate providers:

  1. Scoring: Octo Router tracks the Time to First Byte (TTFB) and total response time for every request across all providers.
  2. The "Best" Choice: When a request comes in, the router selects the provider with the lowest current latency score.
  3. Exploration Phase: To prevent "stale" data, providers with no recent history (Score = 0) are prioritized over scored providers, and one of them is selected at random so that its performance metrics are refreshed.
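
The three steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not Octo Router's actual implementation: the class name `LatencyTracker`, the `window` parameter, and the provider names are hypothetical. The documentation does not say how TTFB and total response time are combined into a score, so the sketch assumes a single latency number per request and a simple windowed average.

```python
import random
from collections import deque


class LatencyTracker:
    """Hypothetical tracker: rolling average of recent latencies per provider."""

    def __init__(self, providers, window=20):
        # `window` is an assumed parameter: how many recent samples to keep.
        self._samples = {name: deque(maxlen=window) for name in providers}

    def record(self, provider, latency_ms):
        # Step 1 (Scoring): store the observed latency for this provider.
        self._samples[provider].append(latency_ms)

    def score(self, provider):
        # A score of 0 means "no recent history", as in the exploration rule.
        samples = self._samples[provider]
        return sum(samples) / len(samples) if samples else 0.0

    def select(self, rng=random):
        providers = list(self._samples)
        unexplored = [p for p in providers if self.score(p) == 0]
        if unexplored:
            # Step 3 (Exploration): choose randomly among unscored providers.
            return rng.choice(unexplored)
        # Step 2 (The "Best" Choice): lowest rolling average wins.
        return min(providers, key=self.score)


tracker = LatencyTracker(["provider-a", "provider-b", "provider-c"])
tracker.record("provider-a", 420.0)
tracker.record("provider-b", 310.0)
first = tracker.select()   # provider-c: the only provider without history
tracker.record("provider-c", 500.0)
second = tracker.select()  # provider-b: lowest rolling average
```

A bounded `deque` is one simple way to get a rolling window; an exponentially weighted average would be an equally plausible choice and reacts faster to sudden slowdowns.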

Interaction with Semantic Policies

Latency-based routing works in tandem with Semantic Policies:

  1. Filtering: The semantic policy first filters the list of providers based on the user's intent or required capabilities.
  2. Latency Selection: The router then picks the fastest provider from that filtered subset.
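
The two-stage flow can be illustrated as a filter followed by a minimum search. This is a sketch under stated assumptions: the function `pick_provider`, the representation of a semantic policy as a set of required capabilities, and all provider names and scores are hypothetical, and the exploration rule for unscored providers is left out for brevity.

```python
def pick_provider(required, capabilities, latency_scores):
    """Filter providers by capability, then return the fastest match.

    required       -- set of capabilities the request needs (assumed model
                      of what a semantic policy decides)
    capabilities   -- dict: provider name -> set of supported capabilities
    latency_scores -- dict: provider name -> current latency score (ms)
    """
    # Stage 1 (Filtering): keep only providers that satisfy the policy.
    eligible = [p for p, caps in capabilities.items() if required <= caps]
    if not eligible:
        return None  # no provider can serve this request
    # Stage 2 (Latency Selection): fastest provider within the filtered subset.
    return min(eligible, key=lambda p: latency_scores[p])


capabilities = {
    "provider-a": {"chat", "code"},
    "provider-b": {"chat"},
    "provider-c": {"chat", "code", "vision"},
}
latency_scores = {"provider-a": 420.0, "provider-b": 180.0, "provider-c": 350.0}

code_choice = pick_provider({"code"}, capabilities, latency_scores)
chat_choice = pick_provider({"chat"}, capabilities, latency_scores)
```

Here `provider-b` is the fastest overall, but it is never chosen for a `code` request because the filter removes it before latency is considered; `code_choice` is `provider-c`, while `chat_choice` is `provider-b`.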

Summary: Latency-based vs Others

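| Feature      | Latency-Based           | Weighted Strategy        | Cost-Based            |
|--------------|-------------------------|--------------------------|-----------------------|
| Primary Unit | Provider                | Provider                 | Model                 |
| Logic        | Fastest Response        | Traffic distribution (%) | Minimum cost (USD)    |
| Model Choice | Uses provider's default | Uses provider's default  | Picks best in catalog |
| Best For     | Real-time apps & UX     | Load balancing           | Cost optimization     |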
