Router latency
How much latency KairosRoute adds versus calling the provider directly, measured with paired samples, not averages-of-averages.
No router-latency data yet. Run npx tsx scripts/bench-router-latency.ts with a broker key and provider key to populate this page. See methodology for the paired-sample design.
Classifier microbenchmark
The keyword-based task classifier runs synchronously on every broker request. This benchmark measures its in-process cost, independent of network.
| Percentile | Classifier latency |
|---|---|
| p50 | 34.3µs |
| p95 | 53.3µs |
| p99 | 76.8µs |
| mean | 37.3µs |
3,000 iterations × 7 representative prompts. Bottom line: classification cost is immaterial next to provider round-trip time.
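A minimal sketch of how a microbenchmark like this could be structured is shown here. The `classifyTask` function is a hypothetical stand-in, not the real classifier, and `benchClassifier` is an illustrative name; only the iterations × prompts loop shape is taken from the description above.

```typescript
// Hypothetical stand-in for the keyword-based classifier; the real rules are not shown on this page.
function classifyTask(prompt: string): string {
  const p = prompt.toLowerCase();
  if (/\b(bug|function|compile|refactor)\b/.test(p)) return "code";
  if (/\b(summarize|translate)\b/.test(p)) return "text";
  return "general";
}

// Times every classifier call individually and returns the timings in microseconds.
function benchClassifier(prompts: string[], iterations: number): number[] {
  const timingsUs: number[] = [];
  for (let i = 0; i < iterations; i++) {
    for (const prompt of prompts) {
      const start = performance.now(); // monotonic clock, milliseconds
      classifyTask(prompt);
      timingsUs.push((performance.now() - start) * 1000); // ms -> µs
    }
  }
  return timingsUs;
}

// Example run: 3,000 iterations over a prompt set, as in the table above.
const examplePrompts = ["Fix this bug in my function", "Summarize this article", "Hello there"];
const timings = benchClassifier(examplePrompts, 3000);
console.log(`collected ${timings.length} timings`);
```

At microsecond scale the timer call itself contributes to each reading, so numbers from a sketch like this are best read as an upper bound on the classifier's cost.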
Methodology
Full methodology, caveats, and reproduction steps live in docs/LATENCY_METHODOLOGY.md. Short version:
- Paired samples fired in alternating order (B/D, D/B, ...) to cancel time-of-day drift.
- 5 warmup requests per side before the timed batch (TLS handshake, Vercel cold start, module load).
- Non-streaming short completions (max_tokens: 8) so generation time doesn't wash out the overhead signal.
- Failed pairs are dropped whole. Keeping only one half of a pair would desync the delta math.
- Overhead = per-sample (broker − direct), then percentiled. Never percentile(broker) − percentile(direct).
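The pair-dropping and per-sample delta rules above can be sketched as follows. The `PairedSample` shape, the use of `null` for a failed request, and the nearest-rank percentile are assumptions made for illustration; the actual script may differ.

```typescript
// Hypothetical record for one pair; null marks a failed request on that side.
interface PairedSample {
  brokerMs: number | null;
  directMs: number | null;
}

// Nearest-rank percentile (p in 0..100) over an unsorted array.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Computes overhead as per-sample (broker - direct), then takes percentiles of the deltas.
function overheadPercentiles(samples: PairedSample[], ps: number[]): Record<number, number> {
  const deltas: number[] = [];
  for (const s of samples) {
    // Drop the whole pair if either side failed; never keep half a pair.
    if (s.brokerMs === null || s.directMs === null) continue;
    deltas.push(s.brokerMs - s.directMs);
  }
  const out: Record<number, number> = {};
  for (const p of ps) out[p] = percentile(deltas, p);
  return out;
}

// Made-up timings, for illustration only.
const demo: PairedSample[] = [
  { brokerMs: 110, directMs: 100 },
  { brokerMs: 130, directMs: 105 },
  { brokerMs: null, directMs: 90 }, // failed broker call: pair is dropped
  { brokerMs: 95, directMs: 90 },
];
console.log(overheadPercentiles(demo, [50, 99]));
```

Taking percentiles of the deltas keeps each broker timing matched with the direct call made alongside it. Subtracting percentile(direct) from percentile(broker) would instead compare two requests that may have run under different conditions.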