Quality & Cost Guardrails
Control routing behavior per API key. Set a minimum quality threshold and a maximum cost cap so the router never picks a model that is too weak or too expensive for your use case.
How It Works
You set guardrails on an API key
Either through the dashboard UI or the REST API.
Every request is validated against them
When a request comes in, the router loads your guardrails from the database alongside your API key.
Routing respects your limits
The router filters out models below your quality threshold and rejects requests that would exceed your cost cap.
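The filter-then-reject behavior described above can be sketched as follows. This is only an illustration, not the router's actual implementation: the `Model` class, the exception names, and the "pick the cheapest eligible model" selection rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Model:
    name: str
    quality: float             # hypothetical quality score in [0.0, 1.0]
    estimated_cost_usd: float  # hypothetical cost estimate for this request


class NoSuitableModel(Exception):
    """Stands in for the NO_SUITABLE_MODEL error."""


class CostLimitExceeded(Exception):
    """Stands in for the COST_LIMIT_EXCEEDED error."""


def route(models: List[Model],
          min_quality: Optional[float],
          max_cost_usd: Optional[float]) -> Model:
    # Step 1: drop every model below the quality floor (None = no explicit floor here).
    candidates = [m for m in models if min_quality is None or m.quality >= min_quality]
    if not candidates:
        raise NoSuitableModel("No model meets the quality threshold")

    # Step 2 (assumption): choose the cheapest model that survived the filter.
    chosen = min(candidates, key=lambda m: m.estimated_cost_usd)

    # Step 3: reject the request if the estimate exceeds the cost cap.
    if max_cost_usd is not None and chosen.estimated_cost_usd > max_cost_usd:
        raise CostLimitExceeded(
            f"Estimated cost ${chosen.estimated_cost_usd:.2f} "
            f"exceeds limit ${max_cost_usd:.2f}"
        )
    return chosen


models = [
    Model("small", 0.60, 0.01),
    Model("mid", 0.82, 0.04),
    Model("large", 0.95, 0.12),
]
print(route(models, min_quality=0.8, max_cost_usd=0.05).name)
```

With a floor of 0.8 and a cap of $0.05, the sketch selects the hypothetical "mid" model. Raising the floor to 0.9 leaves only "large", whose estimate exceeds the cap, so the request is rejected rather than routed.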
Guardrail Parameters
| Parameter | Type | Description |
|---|---|---|
| minQualityThreshold | number or null | 0.0 to 1.0. Minimum model quality score. The router will not pick any model scoring below this. Set to null to use smart per-task defaults (e.g., 0.85 for reasoning, 0.6 for summarization). |
| maxCostPerRequestUsd | number or null | Positive USD amount. If the estimated cost of a routed request exceeds this, the API returns a COST_LIMIT_EXCEEDED error instead of burning credits. Set to null for no limit. |
Tip: For most use cases, leave quality blank and the router will automatically pick the right quality tier per task. Only set an explicit threshold if you need to guarantee premium models (e.g., 0.9+) for every request regardless of task complexity.
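If you set guardrails from code, it can help to check the values against the ranges in the table before sending them. The helper below is a minimal client-side sketch; `build_guardrail_payload` is a hypothetical name, and the server may apply its own validation that differs from this.

```python
from typing import Optional


def build_guardrail_payload(min_quality: Optional[float],
                            max_cost_usd: Optional[float]) -> dict:
    """Validate guardrail values and return a JSON-ready request body.

    None is passed through unchanged and serializes to JSON null.
    """
    if min_quality is not None and not 0.0 <= min_quality <= 1.0:
        raise ValueError("minQualityThreshold must be between 0.0 and 1.0, or None")
    if max_cost_usd is not None and max_cost_usd <= 0:
        raise ValueError("maxCostPerRequestUsd must be a positive amount, or None")
    return {
        "minQualityThreshold": min_quality,
        "maxCostPerRequestUsd": max_cost_usd,
    }


print(build_guardrail_payload(None, 0.05))
```

Here the quality threshold stays `None` (smart defaults) while the cost cap is set, which matches the tip above.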
Option 1: Configure in the Dashboard
1. Go to Dashboard → API Keys
2. Find the key you want to configure
3. Click the "Guardrails" accordion on the key card
4. Set Min Quality Score and/or Max Cost per Request
5. Click "Save Guardrails"
A gold dot appears next to "Guardrails" on any key that has active guardrails configured.
Option 2: Configure via the REST API
Update guardrails programmatically with a PATCH request. You need your session token from the dashboard.
Set guardrails
import requests

# Set guardrails on a specific API key
response = requests.patch(
    "https://kairosroute.com/api/auth",
    params={"keyId": "your-key-uuid"},
    headers={
        "Authorization": "Bearer YOUR_SESSION_TOKEN",
        "Content-Type": "application/json",
    },
    json={
        "minQualityThreshold": 0.8,    # Only models scoring 0.8+
        "maxCostPerRequestUsd": 0.05,  # Reject if estimated > $0.05
    },
)
print(response.json())
# {"message": "Guardrails updated", "updated": {...}}

Clear guardrails
Pass null to remove a guardrail and revert to smart defaults.
import requests

response = requests.patch(
    "https://kairosroute.com/api/auth",
    params={"keyId": "your-key-uuid"},
    headers={
        "Authorization": "Bearer YOUR_SESSION_TOKEN",
        "Content-Type": "application/json",
    },
    json={
        "minQualityThreshold": None,   # Revert to auto
        "maxCostPerRequestUsd": None,  # Remove cost cap
    },
)

Using Guardrails with the OpenAI SDK
Guardrails are enforced server-side based on the API key, so you don't need to change your request code at all. Just use the key that has guardrails configured.
import openai

# This key has guardrails: min_quality=0.8, max_cost=$0.05
client = openai.OpenAI(
    api_key="kr_live_xxxxx",
    base_url="https://api.kairosroute.com/v1",
)

# Normal request; guardrails are enforced automatically
try:
    response = client.chat.completions.create(
        model="kr-auto",
        messages=[{"role": "user", "content": "Explain quantum computing"}],
    )
    print(response.choices[0].message.content)
except openai.APIStatusError as e:
    if e.status_code == 429 and "COST_LIMIT_EXCEEDED" in str(e.body):
        print("Request too expensive for this key's cost cap")
    else:
        raise

Agent Framework Examples
Since guardrails are enforced per-key on the server, they work with every framework out of the box. Here are patterns for common agent frameworks.
LangChain
from langchain_openai import ChatOpenAI

# Key with guardrails: quality >= 0.8, cost <= $0.05/request
llm = ChatOpenAI(
    model="kr-auto",
    api_key="kr_live_xxxxx",
    base_url="https://api.kairosroute.com/v1",
)

# All chains, agents, and tools using this LLM
# automatically respect the key's guardrails
result = llm.invoke("Summarize this document...")
print(result.content)

CrewAI
import os

# Set the environment before importing crewai
os.environ["OPENAI_API_KEY"] = "kr_live_xxxxx"
os.environ["OPENAI_API_BASE"] = "https://api.kairosroute.com/v1"

from crewai import Agent, Task, Crew

# Guardrails apply to every LLM call this crew makes
researcher = Agent(
    role="Senior Researcher",
    goal="Find accurate data",
    backstory="Expert analyst",
    llm="kr-auto",
)
crew = Crew(agents=[researcher], tasks=[...])
result = crew.kickoff()

Vercel AI SDK
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const kairos = createOpenAI({
  apiKey: "kr_live_xxxxx",
  baseURL: "https://api.kairosroute.com/v1",
});

// Guardrails enforced server-side on every call
const { text } = await generateText({
  model: kairos("kr-auto"),
  prompt: "What is the meaning of life?",
});

Error Responses
When a guardrail blocks a request, you get a structured error:
| HTTP Status | Error Code | When |
|---|---|---|
| 429 | COST_LIMIT_EXCEEDED | Estimated cost exceeds maxCostPerRequestUsd |
| 404 | NO_SUITABLE_MODEL | No model meets the quality threshold for this task |
// Example error response
{
  "error": {
    "message": "Estimated cost $0.12 exceeds limit $0.05",
    "code": "COST_LIMIT_EXCEEDED",
    "type": "guardrail_error"
  }
}

Best Practices
Use separate keys per environment
Create one key for development (lower cost cap) and another for production (higher cap or no limit). This prevents runaway costs during testing.
Start with cost caps, add quality later
A max cost cap of $0.10/request catches most expensive surprises. Only add a quality floor if you notice the router picking models that are too simple for your use case.
Handle COST_LIMIT_EXCEEDED gracefully
Instead of crashing, catch the 429 error and fall back to a cheaper model pin, retry with a shorter prompt, or alert the user.
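One way to implement the "retry with a shorter prompt" fallback is sketched below. It is deliberately SDK-agnostic: `send` is any callable you supply that sends a prompt and returns text, and both helper names are hypothetical. The error check assumes the exception exposes `status_code` and `body` attributes, as `openai.APIStatusError` does.

```python
from typing import Callable, Sequence


def is_cost_limit_error(exc: Exception) -> bool:
    """True if exc looks like a 429 carrying the COST_LIMIT_EXCEEDED code."""
    return (
        getattr(exc, "status_code", None) == 429
        and "COST_LIMIT_EXCEEDED" in str(getattr(exc, "body", ""))
    )


def complete_with_fallback(send: Callable[[str], str],
                           prompts: Sequence[str]) -> str:
    """Try each prompt variant in order (e.g. full prompt first, then shorter ones).

    Moves on to the next variant only when the cost cap rejects the request;
    any other error is re-raised immediately.
    """
    last_error = None
    for prompt in prompts:
        try:
            return send(prompt)
        except Exception as exc:
            if not is_cost_limit_error(exc):
                raise
            last_error = exc
    raise RuntimeError("All prompt variants exceeded the cost cap") from last_error
```

A caller would pass a small wrapper around its own client call as `send`, together with a list such as `[full_prompt, truncated_prompt]`. If every variant is rejected, the final `RuntimeError` is the place to alert the user.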
Use null for smart defaults
The router already picks quality-appropriate models per task type. A quality threshold of null is usually better than picking a fixed number.
Per-Task Guardrails
With any credit balance, you can set different quality and cost thresholds for each task category. This lets you spend more on code generation while keeping summarization cheap.
| Category | Default Quality | Typical Use |
|---|---|---|
| reasoning | 0.85 | Math, logic, multi-step analysis |
| code | 0.80 | Code generation, debugging, refactoring |
| analysis | 0.75 | Data interpretation, comparison |
| extraction | 0.70 | Pulling structured data from text |
| summarization | 0.60 | TL;DR, condensing content |
| creative | 0.60 | Writing, brainstorming, storytelling |
Set via API
import requests

response = requests.patch(
    "https://kairosroute.com/api/auth",
    params={"keyId": "your-key-uuid"},
    headers={
        "Authorization": "Bearer YOUR_SESSION_TOKEN",
        "Content-Type": "application/json",
    },
    json={
        # Global defaults
        "minQualityThreshold": 0.7,
        "maxCostPerRequestUsd": 0.10,
        # Per-task overrides
        "promptTypeGuardrails": {
            "code": {"minQuality": 0.9, "maxCostUsd": 0.20},
            "summarization": {"minQuality": 0.5, "maxCostUsd": 0.01},
            "reasoning": {"minQuality": 0.85},
        },
    },
)

How overrides work: When a request is classified as "code", the router checks for a code-specific guardrail first. If none is set, it falls through to the global guardrails. If those are also unset, smart defaults apply.
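The fall-through order can be pictured as a three-step lookup. The sketch below covers only the quality threshold and uses the default values from the category table; `resolve_min_quality` is a hypothetical helper, and treating each field as falling through independently is an assumption rather than documented behavior.

```python
from typing import Optional

# Smart defaults, copied from the category table above.
SMART_DEFAULT_QUALITY = {
    "reasoning": 0.85,
    "code": 0.80,
    "analysis": 0.75,
    "extraction": 0.70,
    "summarization": 0.60,
    "creative": 0.60,
}


def resolve_min_quality(task: str,
                        global_min: Optional[float],
                        per_task: dict) -> float:
    # 1. A task-specific guardrail wins if one is set.
    task_value = per_task.get(task, {}).get("minQuality")
    if task_value is not None:
        return task_value
    # 2. Otherwise fall through to the global guardrail.
    if global_min is not None:
        return global_min
    # 3. Otherwise use the smart default for the category.
    return SMART_DEFAULT_QUALITY[task]


per_task = {
    "code": {"minQuality": 0.9, "maxCostUsd": 0.20},
    "summarization": {"minQuality": 0.5, "maxCostUsd": 0.01},
    "reasoning": {"minQuality": 0.85},
}
print(resolve_min_quality("code", 0.7, per_task))      # task override
print(resolve_min_quality("analysis", 0.7, per_task))  # global value
print(resolve_min_quality("analysis", None, {}))       # smart default
```

With the configuration from the request above, "code" resolves to its own 0.9, "analysis" inherits the global 0.7, and with nothing configured "analysis" falls back to its default of 0.75.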
Per-Task Provider Filtering
You can also set provider allow/exclude lists per task category. For example, you might want to only use OpenAI and Anthropic for code generation, but allow all providers for summarization.
import requests

response = requests.patch(
    "https://kairosroute.com/api/auth",
    params={"keyId": "your-key-uuid"},
    headers={
        "Authorization": "Bearer YOUR_SESSION_TOKEN",
        "Content-Type": "application/json",
    },
    json={
        "promptTypeGuardrails": {
            "code": {
                "minQuality": 0.9,
                # Only use OpenAI and Anthropic for code
                "allowedProviders": ["OpenAI", "Anthropic"],
            },
            "summarization": {
                "minQuality": 0.5,
                # Exclude DeepSeek for summarization
                "excludedProviders": ["DeepSeek"],
            },
        },
    },
)

Priority chain: Per-task provider filters override global provider filters for that task category; a category without its own filters inherits the global ones. For example, if you globally exclude DeepSeek and your "summarization" category has no exclusions of its own, DeepSeek is still excluded for summarization (the global filter applies). But if you set allowedProviders on a specific task, only those providers are used for that task, regardless of global settings.
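A rough sketch of this priority chain follows. It assumes that a task's filters replace the global filters as a whole as soon as the task defines either list; the text above does not spell out what happens when a task sets only one list and the global settings set the other, so treat that part as a guess. `resolve_provider_filters` is a hypothetical helper.

```python
def resolve_provider_filters(task: str,
                             global_filters: dict,
                             per_task: dict) -> dict:
    task_filters = per_task.get(task, {})
    # Assumption: any task-level provider list replaces the global filters entirely.
    has_task_filter = (
        task_filters.get("allowedProviders") is not None
        or task_filters.get("excludedProviders") is not None
    )
    source = task_filters if has_task_filter else global_filters
    return {
        "allowedProviders": source.get("allowedProviders"),
        "excludedProviders": source.get("excludedProviders"),
    }


global_filters = {"excludedProviders": ["DeepSeek"]}
per_task = {
    "code": {"minQuality": 0.9, "allowedProviders": ["OpenAI", "Anthropic"]},
    "summarization": {"minQuality": 0.5},  # no provider filters of its own
}
print(resolve_provider_filters("summarization", global_filters, per_task))
print(resolve_provider_filters("code", global_filters, per_task))
```

In this example "summarization" inherits the global DeepSeek exclusion, while "code" uses only its own allow list.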
Provider Filtering
Control which AI providers the router is allowed to use. This is useful for compliance requirements, data residency, or if you simply prefer certain providers over others. Available to all paid users.
| Mode | Parameter | Behavior |
|---|---|---|
| Allow List | allowedProviders | Only route to these providers. All others are blocked. |
| Block List | excludedProviders | Route to any provider except these. If both are set, the allow list takes precedence. |
Available providers: OpenAI, Anthropic, Google, Mistral, Groq, DeepSeek, Fireworks, Together AI
Allow only specific providers
import requests

response = requests.patch(
    "https://kairosroute.com/api/auth",
    params={"keyId": "your-key-uuid"},
    headers={
        "Authorization": "Bearer YOUR_SESSION_TOKEN",
        "Content-Type": "application/json",
    },
    json={
        # Only use OpenAI and Anthropic models
        "allowedProviders": ["OpenAI", "Anthropic"],
    },
)

Exclude specific providers
import requests

response = requests.patch(
    "https://kairosroute.com/api/auth",
    params={"keyId": "your-key-uuid"},
    headers={
        "Authorization": "Bearer YOUR_SESSION_TOKEN",
        "Content-Type": "application/json",
    },
    json={
        # Use all providers except DeepSeek
        "excludedProviders": ["DeepSeek"],
    },
)

Clear provider filters
import requests

response = requests.patch(
    "https://kairosroute.com/api/auth",
    params={"keyId": "your-key-uuid"},
    headers={
        "Authorization": "Bearer YOUR_SESSION_TOKEN",
        "Content-Type": "application/json",
    },
    json={
        "allowedProviders": None,   # Remove allow list
        "excludedProviders": None,  # Remove block list
    },
)

Precedence: If both allowedProviders and excludedProviders are set, the allow list wins: only providers in the allow list will be used, regardless of the exclude list. Provider filters apply on top of quality and cost guardrails.
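The precedence rule can be expressed as a small function. This is a sketch under stated assumptions: `eligible_providers` is a hypothetical helper, the provider list is copied from the "Available providers" line above, and an empty (non-null) allow list is treated as blocking every provider.

```python
from typing import List, Optional, Sequence

AVAILABLE_PROVIDERS = [
    "OpenAI", "Anthropic", "Google", "Mistral",
    "Groq", "DeepSeek", "Fireworks", "Together AI",
]


def eligible_providers(allowed: Optional[Sequence[str]],
                       excluded: Optional[Sequence[str]],
                       available: Sequence[str] = AVAILABLE_PROVIDERS) -> List[str]:
    if allowed is not None:
        # Allow list wins: the exclude list is ignored entirely.
        allowed_set = set(allowed)
        return [p for p in available if p in allowed_set]
    # No allow list: route to everything except the excluded providers.
    excluded_set = set(excluded or [])
    return [p for p in available if p not in excluded_set]


print(eligible_providers(["OpenAI", "Anthropic"], ["OpenAI"]))
print(eligible_providers(None, ["DeepSeek"]))
```

The first call shows the allow list overriding a conflicting exclusion; the second shows a plain block list.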
Feature Availability
| Feature | Requirement |
|---|---|
| Intelligent routing (kr-auto) | Free |
| All model providers | Free |
| Usage dashboard | Free |
| Unlimited RPM | Any credit balance |
| Global quality & cost guardrails | Any credit balance |
| Per-task-type guardrails | Any credit balance |
| Provider filtering (allow/exclude) | Any credit balance |
| BYOK (Bring Your Own Key) | Any credit balance |
| Model variants (:free, :thinking, :extended) | Any credit balance |
| Advanced analytics | Any credit balance |
Ready to set guardrails?
Configure quality and cost limits in your dashboard right now.