Smarter daily.
Every receipt is a label. Every correction is a signal. KairosRoute tunes the router on your own traffic daily, and a public eval suite has to sign off before any new version ships.
Quality-gated routing.
Every request is read for what it actually needs. We pick the cheapest model that clears the bar — never a downgrade to save a penny.
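As a rough illustration of "cheapest model that clears the bar", the sketch below filters candidates by a predicted quality score and takes the lowest-cost survivor. The `Candidate` type, the scores, the prices, and the model names are all hypothetical. How KairosRoute actually estimates quality is not described on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    model: str                # hypothetical model name
    predicted_quality: float  # assumed 0..1 score for this request
    cost_per_call: float      # assumed cost in USD


def pick_model(candidates: list[Candidate], quality_bar: float) -> Optional[Candidate]:
    """Return the cheapest candidate whose predicted quality meets the bar."""
    eligible = [c for c in candidates if c.predicted_quality >= quality_bar]
    if not eligible:
        return None  # nothing clears the bar; the caller decides the fallback
    return min(eligible, key=lambda c: c.cost_per_call)


candidates = [
    Candidate("large-model", 0.95, 0.030),
    Candidate("mid-model", 0.90, 0.008),
    Candidate("small-model", 0.70, 0.001),
]
choice = pick_model(candidates, quality_bar=0.85)
```

Here the small model is cheapest but falls below the bar, so the mid-tier model is chosen.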
A receipt for every call.
Structured record of what the router saw, who it picked, what it cost. Streamed to the OTel stack you already run.
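One way to picture such a record is a small typed object serialized as a JSON line. The field names below are assumptions for illustration, not the actual receipt schema, and the export to an OTel pipeline is left out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Receipt:
    request_id: str      # hypothetical identifier
    router_version: str  # which router version made the call
    category: str        # what the router saw
    chosen_model: str    # who it picked
    cost_usd: float      # what it cost


def to_log_line(receipt: Receipt) -> str:
    """Serialize a receipt as one JSON line with stable key order."""
    return json.dumps(asdict(receipt), sort_keys=True)


receipt = Receipt("req-001", "v47", "extraction", "example-model", 0.0004)
line = to_log_line(receipt)
```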
A router that gets smarter.
A daily learning pass tunes the router on your own traffic. Eval-gated before promotion. One flip to roll back.
Static routers age badly.
Providers ship new revisions without renaming. Prices change. Your traffic mix drifts. What was optimal last month is quietly costing you quality, latency, or money today.
What “daily” actually looks like.
Every row is a candidate version. Eval score gates promotion. Quality goes up, cost drifts down, and both times the loop tried to ship a bad version, the gate held.
Versions v42 through v48. Illustrative. Real histories are per-workspace and visible in the dashboard.
Every version is auditable.
Rollback is one click.
Inspect exactly what changed between router versions — which models moved on which categories, why, and the eval delta that earned the promotion. If a new version misbehaves in production, flip back to the prior one without a redeploy.
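A version diff of this kind can be thought of as a comparison of two category-to-model tables. The sketch below is a minimal illustration. The `diff_routes` helper, the table shape, the version labels, and the placeholder `model-a` are assumptions, not the dashboard's actual data model.

```python
def diff_routes(before: dict[str, str], after: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map each category whose model changed to (old_model, new_model)."""
    return {
        category: (before[category], after[category])
        for category in sorted(before.keys() & after.keys())
        if before[category] != after[category]
    }


# Hypothetical routing tables for two adjacent versions.
v46 = {"reasoning": "claude-opus-4-5", "code": "claude-sonnet-4-5", "summarization": "model-a"}
v47 = {"reasoning": "gpt-5.4", "code": "gpt-5.3-codex", "summarization": "model-a"}
changes = diff_routes(v46, v47)
```

Categories whose model did not move, such as `summarization` here, are left out of the result.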
router@v47: promoted Sat 02:14 UTC · eval 0.933 · ship gate cleared
Versions: v48 (Sun), v47 (Sat), v46 (Fri), v45 (Thu), v44 (Wed), v43 (Tue), v42 (Mon)
Model changes: reasoning, claude-opus-4-5 → gpt-5.4; code, claude-sonnet-4-5 → gpt-5.3-codex; extraction, gpt-4.1-mini → gemini-3-flash-preview
Categories: summarization, extraction, creative, analysis, code, reasoning
How the loop runs.
Nightly pass at 02:00 UTC. Every stage writes a heartbeat, every decision leaves a receipt, and the eval gate is a hard stop, not a warning.
Read your last 24 hours
The loop looks at how the router performed on real traffic — what it picked, what worked, what didn't. Provider incidents and outliers are excluded so the picture reflects steady-state traffic, not noise.
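A minimal sketch of that windowing step, assuming each call record carries a timestamp, an incident flag, and a latency. The `CallRecord` fields and the latency cutoff used to stand in for "outlier" are hypothetical. The page does not say how outliers are detected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CallRecord:
    at: datetime
    during_incident: bool  # hypothetical flag derived from provider status
    latency_ms: float


def steady_state_window(records: list[CallRecord], now: datetime,
                        max_latency_ms: float = 30_000.0) -> list[CallRecord]:
    """Keep the last 24 hours, dropping incident traffic and latency outliers."""
    cutoff = now - timedelta(hours=24)
    return [
        r for r in records
        if cutoff <= r.at <= now
        and not r.during_incident
        and r.latency_ms <= max_latency_ms
    ]


now = datetime(2025, 1, 4, 2, 0, tzinfo=timezone.utc)
records = [
    CallRecord(now - timedelta(hours=1), False, 800.0),     # kept
    CallRecord(now - timedelta(hours=2), True, 900.0),      # provider incident
    CallRecord(now - timedelta(hours=3), False, 45_000.0),  # latency outlier
    CallRecord(now - timedelta(hours=30), False, 700.0),    # outside the window
]
kept = steady_state_window(records, now)
```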
Take in feedback
Ops corrections, customer thumbs-down replays, and independent eval runs all feed the loop. Higher-trust signals (your team's corrections) outweigh lower-trust ones (unreviewed metrics).
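Trust weighting can be illustrated with a weighted mean, where each signal counts in proportion to how much its source is trusted. The source names and weight values below are assumptions chosen only to show the ordering. The real weights are not published here.

```python
# Hypothetical trust weights: higher-trust sources count for more.
TRUST_WEIGHTS = {
    "ops_correction": 1.0,
    "user_feedback": 0.5,
    "unreviewed_eval": 0.2,
}


def weighted_score(signals: list[tuple[str, float]]) -> float:
    """Trust-weighted mean of (source, score) pairs; scores assumed in 0..1."""
    total_weight = sum(TRUST_WEIGHTS[source] for source, _ in signals)
    if total_weight == 0:
        raise ValueError("no signals to combine")
    weighted_sum = sum(TRUST_WEIGHTS[source] * score for source, score in signals)
    return weighted_sum / total_weight


# One ops correction says "bad" (0.0); two unreviewed signals say "good" (1.0).
signals = [("ops_correction", 0.0), ("unreviewed_eval", 1.0), ("unreviewed_eval", 1.0)]
score = weighted_score(signals)
```

With these weights the single ops correction outweighs the two unreviewed signals, so the combined score lands below 0.5.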
Propose a candidate version
A new candidate routing version is built. Drift guards throw out anything that swings too hard in one pass — incidents and outlier days don't get to poison tomorrow.
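One simple form a drift guard could take is a cap on how far any per-category score may move in a single pass. The sketch below assumes that form. The `passes_drift_guard` helper, the score dictionaries, and the 0.10 threshold are hypothetical.

```python
def passes_drift_guard(current: dict[str, float], candidate: dict[str, float],
                       max_swing: float) -> bool:
    """True if no category present in both versions moves by more than max_swing."""
    return all(
        abs(candidate[key] - current[key]) <= max_swing
        for key in current.keys() & candidate.keys()
    )


current = {"code": 0.80, "reasoning": 0.90}
steady = {"code": 0.83, "reasoning": 0.91}  # small, plausible movement
spiky = {"code": 0.40, "reasoning": 0.90}   # looks like an incident or outlier day

steady_ok = passes_drift_guard(current, steady, max_swing=0.10)
spiky_ok = passes_drift_guard(current, spiky, max_swing=0.10)
```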
Gate behind the eval suite
The candidate is scored against a public eval suite of 40+ cases. It ships only if it matches or beats the current version, within a small noise floor. Otherwise the old version stays active.
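The gate rule reduces to a single comparison. A minimal sketch, assuming scores on a 0..1 scale and a hypothetical noise floor of 0.005 (the actual value is not stated):

```python
def should_promote(candidate_score: float, current_score: float,
                   noise_floor: float = 0.005) -> bool:
    """Promote only if the candidate matches or beats current, within the noise floor."""
    return candidate_score >= current_score - noise_floor


better = should_promote(0.933, 0.930)        # clear improvement
within_noise = should_promote(0.927, 0.930)  # slightly lower, inside the noise floor
worse = should_promote(0.920, 0.930)         # real regression, old version stays active
```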
Ship + keep every prior version
Promoted versions become the default route. Every prior version is archived with its eval score and a one-flag rollback — no redeploy, no config change.
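Keeping every version and rolling back by changing one pointer might look like the sketch below. `VersionRegistry` and its methods are hypothetical names, and the v46 score is made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VersionRegistry:
    eval_scores: dict[str, float] = field(default_factory=dict)  # every archived version
    active: Optional[str] = None                                 # the default route

    def promote(self, version: str, eval_score: float) -> None:
        """Archive the version with its eval score and make it the default."""
        self.eval_scores[version] = eval_score
        self.active = version

    def rollback(self, version: str) -> None:
        """Point the default back at an archived version; nothing is deleted."""
        if version not in self.eval_scores:
            raise KeyError(f"unknown version: {version}")
        self.active = version


registry = VersionRegistry()
registry.promote("v46", 0.929)  # hypothetical score
registry.promote("v47", 0.933)
registry.rollback("v46")
```

After the rollback, `v46` is active again and `v47` remains in the archive with its score, so it can be restored the same way.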
Safety rails on every pass.
Loops that learn in the open fail in the open. These are the guardrails that make the daily cadence safe to keep on.
Eval gate
No version ships unless it matches or beats the previous version on a public eval suite of 40+ cases.
Drift guard
Big score jumps get rejected. Catches outlier days and provider incidents before they poison the next fit.
One-flag rollback
Every version is preserved with its eval score. Flip one flag to go back. No redeploy.
Source weighting
Ops corrections outweigh user feedback, which outweighs unreviewed eval signal.
No label guessing
Uncertain requests are never used as training labels without a human-graded correction.
Heartbeats
Every stage reports on completion. A watchdog alerts if anything falls behind.
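A watchdog over heartbeats can be as simple as comparing each stage's last report time against a maximum age. This is a sketch under that assumption. The stage names, the two-hour threshold, and the `stale_stages` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_stages(last_heartbeat: dict[str, datetime], now: datetime,
                 max_age: timedelta) -> list[str]:
    """Return the stages whose last heartbeat is older than max_age, sorted by name."""
    return sorted(
        stage for stage, seen in last_heartbeat.items()
        if now - seen > max_age
    )


now = datetime(2025, 1, 4, 3, 0, tzinfo=timezone.utc)
last_heartbeat = {
    "read_traffic": datetime(2025, 1, 4, 2, 1, tzinfo=timezone.utc),
    "eval_gate": datetime(2025, 1, 3, 2, 10, tzinfo=timezone.utc),  # missed last night's run
}
behind = stale_stages(last_heartbeat, now, max_age=timedelta(hours=2))
```

Any stage returned in `behind` is one the watchdog would alert on.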
Let the router get better on your traffic.
Every call feeds the loop. Every version has to earn its promotion.