Platform · Signal Loop

Smarter daily.

Every receipt is a label. Every correction is a signal. KairosRoute tunes the router on your own traffic daily, and a public eval suite has to sign off before any new version ships.

Get your API key Browse the eval suite →

01 Route

Quality-gated routing.

Every request is read for what it actually needs. We pick the cheapest model that clears the bar — never a downgrade to save a penny.

See the router

02 Observe

A receipt for every call.

A structured record of what the router saw, which model it picked, and what it cost. Streamed to the OpenTelemetry (OTel) stack you already run.

See the receipts

03 Improve

A router that gets smarter.

A daily learning pass tunes the router on your own traffic. Eval-gated before promotion. One flip to roll back.


Static routers age badly.

Providers ship new revisions without renaming. Prices change. Your traffic mix drifts. What was optimal last month is quietly costing you quality, latency, or money today.

One week, one router

What “daily” actually looks like.

Every row is a candidate version, and its eval score gates promotion. Over the week quality goes up and cost drifts down, and both times the loop proposed a bad version (Tue and Sun) it was held back.

| Day | Version | Eval | Cost / req | p50 ms | Status |
| --- | --- | --- | --- | --- | --- |
| Mon | v42 | 0.912 | $0.0187 | 714 | Shipped |
| Tue | v43 | 0.908 | $0.0184 | 709 | Held · eval regression |
| Wed | v44 | 0.918 | $0.0181 | 702 | Shipped |
| Thu | v45 | 0.921 | $0.0179 | 694 | Shipped |
| Fri | v46 | 0.935 | $0.0175 | 688 | Shipped · new gpt-5.4 promoted in reasoning |
| Sat | v47 | 0.933 | $0.0173 | 687 | Shipped |
| Sun | v48 | 0.812 | $0.0169 | 692 | Held · drift guard tripped |

Illustrative. Real histories are per-workspace and visible in the dashboard.

Version inspector

Every version is auditable.
Rollback is one click.

Inspect exactly what changed between router versions — which models moved on which categories, why, and the eval delta that earned the promotion. If a new version misbehaves in production, flip back to the prior one without a redeploy.

Active · router@v47 · promoted Sat 02:14 UTC · eval 0.933 · ship gate cleared

Versions

v48 · Sun · 0.812
v47 · Sat · 0.933
v46 · Fri · 0.935
v45 · Thu · 0.921
v44 · Wed · 0.918
v43 · Tue · 0.908
v42 · Mon · 0.912

What changed · v46 → v47

reasoning · claude-opus-4-5 → gpt-5.4 · +0.021 · -8%
code · claude-sonnet-4-5 → gpt-5.3-codex · +0.014 · +3%
extraction · gpt-4.1-mini → gemini-3-flash-preview · +0.002 · -12%

Eval suite by category

summarization · 0.964 · +0.002
extraction · 0.946 · +0.005
creative · 0.891 · +0.002
analysis · 0.923 · +0.006
code · 0.921 · +0.017
reasoning · 0.912 · +0.034

How the loop runs.

Nightly pass at 02:00 UTC. Every stage writes a heartbeat, every decision leaves a receipt, and the eval gate is a hard stop, not a warning.

Read your last 24 hours

The loop looks at how the router performed on real traffic — what it picked, what worked, what didn't. Provider incidents and outliers are excluded so the picture reflects steady-state traffic, not noise.
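As a rough sketch of this step, steady-state filtering might look like the following. The record fields, the `during_incident` flag, and the median-based outlier rule are all illustrative assumptions; the page does not describe how KairosRoute actually detects incidents or outliers.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class CallRecord:
    # Hypothetical fields; the real receipt schema is not described here.
    model: str
    latency_ms: float
    during_incident: bool


def steady_state(records: list[CallRecord], max_ratio: float = 5.0) -> list[CallRecord]:
    """Drop calls made during a provider incident and extreme latency outliers.

    A call counts as an outlier when its latency exceeds `max_ratio` times
    the median latency of the non-incident calls (an assumed rule).
    """
    calm = [r for r in records if not r.during_incident]
    if not calm:
        return []
    cutoff = max_ratio * median(r.latency_ms for r in calm)
    return [r for r in calm if r.latency_ms <= cutoff]
```

With a median-based cutoff, a single very slow call cannot drag the threshold up with it, which is why the sketch avoids a mean-based rule.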

Take in feedback

Ops corrections, customer thumbs-down replays, and independent eval runs all feed the loop. Higher-trust signals (your team's corrections) outweigh lower-trust ones (unreviewed metrics).
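One way to picture trust weighting is a weighted mean per model, sketched below. The source names and the numeric weights are assumptions chosen for illustration; only the ordering (team corrections above lower-trust signals) comes from the text.

```python
from collections import defaultdict

# Assumed trust weights: ops corrections > customer feedback > unreviewed signal.
TRUST_WEIGHTS = {"ops_correction": 1.0, "customer_feedback": 0.5, "unreviewed": 0.1}


def weighted_quality(signals: list[tuple[str, str, float]]) -> dict[str, float]:
    """Combine (model, source, score) signals into a trust-weighted mean per model."""
    totals: dict[str, float] = defaultdict(float)
    weights: dict[str, float] = defaultdict(float)
    for model, source, score in signals:
        w = TRUST_WEIGHTS[source]
        totals[model] += w * score
        weights[model] += w
    return {model: totals[model] / weights[model] for model in totals}
```

Under these weights, one ops correction scoring a model at 0.0 pulls the estimate far lower than one unreviewed signal scoring it at 1.0 can pull it up.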

Propose a candidate version

A new candidate routing version is built. Drift guards throw out anything that swings too hard in one pass — incidents and outlier days don't get to poison tomorrow.
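A drift guard of this kind could be sketched as a per-category swing check. The per-category comparison and the `max_swing` threshold are assumptions; the page only says that anything swinging too hard in one pass is thrown out.

```python
def drift_guard(
    current: dict[str, float],
    candidate: dict[str, float],
    max_swing: float = 0.05,
) -> list[str]:
    """Return the categories whose score moved more than `max_swing` in one pass.

    An empty list means the candidate passes; `max_swing` is an assumed threshold.
    """
    return sorted(
        category
        for category in current.keys() & candidate.keys()
        if abs(candidate[category] - current[category]) > max_swing
    )
```

The check is symmetric on purpose: a sudden jump upward is treated as just as suspicious as a sudden drop.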

Gate behind the eval suite

The candidate is scored against a public eval suite of 40+ cases. It ships only if it matches or beats the current version, allowing for a small noise floor. Otherwise the old version stays active.
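The gate rule can be sketched in a few lines and replayed against the illustrative week shown earlier. The noise floor of 0.003 is an assumed value picked so the replay matches that table; the real tolerance is not stated.

```python
NOISE_FLOOR = 0.003  # assumed tolerance; the real value is not published here


def passes_gate(candidate: float, active: float, noise_floor: float = NOISE_FLOOR) -> bool:
    """A candidate ships only if it matches or beats the active score within the noise floor."""
    return candidate >= active - noise_floor


def replay(week: list[tuple[str, float]], start: tuple[str, float]) -> tuple[str, list[str]]:
    """Replay candidates in order; return the final active version and the held ones."""
    active_version, active_score = start
    held = []
    for version, score in week:
        if passes_gate(score, active_score):
            active_version, active_score = version, score
        else:
            held.append(version)
    return active_version, held


week = [("v43", 0.908), ("v44", 0.918), ("v45", 0.921),
        ("v46", 0.935), ("v47", 0.933), ("v48", 0.812)]
active, held = replay(week, start=("v42", 0.912))
print(active, held)  # v47 ['v43', 'v48']
```

Note that v47 ships despite scoring 0.002 below v46, because the difference sits inside the assumed noise floor. This sketch models only the eval gate; in the illustrative table, v48 is attributed to the drift guard, which would run before the gate.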

Ship + keep every prior version

Promoted versions become the default route. Every prior version is archived with its eval score and a one-flag rollback — no redeploy, no config change.
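A minimal sketch of an append-only version archive with a movable "active" pointer is shown below. The class and method names are hypothetical and are not KairosRoute's API; the point is that rollback changes one pointer and deletes nothing.

```python
from dataclasses import dataclass, field


@dataclass
class VersionRegistry:
    """Keeps every promoted version with its eval score; names are hypothetical."""

    history: list[tuple[str, float]] = field(default_factory=list)
    active_index: int = -1

    def promote(self, version: str, eval_score: float) -> None:
        """Archive the new version and make it the default route."""
        self.history.append((version, eval_score))
        self.active_index = len(self.history) - 1

    def rollback(self) -> str:
        """Point the active flag at the previous version; nothing is deleted."""
        if self.active_index <= 0:
            raise ValueError("no prior version to roll back to")
        self.active_index -= 1
        return self.active

    @property
    def active(self) -> str:
        """Name of the version currently serving traffic."""
        return self.history[self.active_index][0]
```

For example, after promoting `v46` and then `v47`, a single `rollback()` call makes `v46` active again while both entries stay in `history`.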

Safety rails on every pass.

Loops that learn in the open fail in the open. These are the guardrails that make the daily cadence safe to keep on.

Eval gate

No version ships unless it matches or beats the previous version on a public eval suite of 40+ cases.

Drift guard

Big score jumps get rejected. Catches outlier days and provider incidents before they poison the next fit.

One-flag rollback

Every version is preserved with its eval score. Flip one flag to go back. No redeploy.

Source weighting

Ops corrections outweigh user feedback, which outweighs unreviewed eval signal.

No label guessing

Requests the router was uncertain about are never mined for labels without a human-graded correction.

Heartbeats

Every stage reports on completion. A watchdog alerts if anything falls behind.
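The heartbeat rail can be sketched as a staleness check over the last completion time of each stage. The stage names and the 26-hour allowance (a daily cadence plus slack) are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def stale_stages(
    heartbeats: dict[str, datetime],
    now: datetime,
    max_age: timedelta = timedelta(hours=26),
) -> list[str]:
    """Return the stages whose last heartbeat is older than `max_age`."""
    return sorted(stage for stage, seen in heartbeats.items() if now - seen > max_age)


now = datetime(2025, 1, 2, 4, 0, tzinfo=timezone.utc)
heartbeats = {
    "propose": now - timedelta(hours=30),  # hypothetical stage names
    "gate": now - timedelta(hours=1),
}
print(stale_stages(heartbeats, now))  # ['propose']
```

A watchdog would run this check on a schedule and alert whenever the returned list is non-empty.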

Let the router get better on your traffic.

Every call feeds the loop. Every version has to earn its promotion.

