The Indie Hacker AI Stack: Under $100/mo Until You Have Revenue
I have shipped four AI-powered side projects in the last eighteen months. Two died, one broke even, one turned into actual revenue. The single highest-leverage decision across all four was not the model, the framework, or the hosting. It was refusing to spend real money on infrastructure until a real user was paying me. This post is the stack that made that possible.
The constraint: under $100/mo in total AI + infra spend, covering development, beta, and the first 50 paying users. Everything below assumes a solo dev shipping from a laptop on weekends.
The principles
- BYOK where possible. Bring your own provider key, pay providers directly, and never let anyone take a markup on your core cost.
- Free tiers are load-bearing. Use them until you are forced off. Most indie projects never get forced off.
- One dashboard or you are flying blind. You cannot afford to debug cost spikes by reading six different provider invoices at 11pm.
- Cache aggressively. Half of your "AI" is probably the same five prompts every hour.
- Route, do not commit. Never hard-code a model name in application code. Always go through a gateway.
The stack, component by component
| Layer | Tool | Cost at 0 users | Cost at 50 users | Why |
|---|---|---|---|---|
| Hosting | Vercel (Hobby) | $0 | $0 | Hosts a Next.js app free until you outgrow the generous Hobby limits |
| Database | Supabase (Free) | $0 | $0 | 500MB, auth included; plenty for 50 users |
| Auth | Supabase Auth | $0 | $0 | Bundled with DB |
| Payments | Stripe | $0 base | ~2.9%+$0.30/txn | No monthly minimum |
| LLM routing | KairosRoute (Free) | $0 | $0 | 100K tokens BYOK + $5 managed-key trial |
| LLM providers | BYOK to Anthropic/OpenAI/Groq | $0 | $8-30 | Pay providers directly, no markup |
| Prompt cache | Built into the router | $0 | $0 | Exact-match + semantic on request |
| Error tracking | Sentry (Developer) | $0 | $0 | 5K events/mo is enough |
| Analytics | PostHog (Free) | $0 | $0 | 1M events/mo free |
| Email | Resend (Free) | $0 | $0 | 3K sends/mo free |
| Total | — | $0 | $8-30 | — |
The trick is that every layer has a free tier that is sized for roughly the first 50-200 paying users. You cross a paid tier one component at a time, only when the revenue it unlocks is bigger than the bill.
Why the router matters even at zero users
The obvious reason to route is cost. The less obvious reason — and the reason I use a router on day one of every project — is that your model choices will change and you want the switch to be a config change, not a refactor. Six months into my last project I went from Claude 3.5 Sonnet to Claude Sonnet 4.5 to a kr-auto policy that picks between Sonnet, Gemini Flash, and DeepSeek depending on task. Without a gateway, that is three weekends of code changes. With one, it is three config PRs.
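The "config change, not refactor" point can be made concrete with a few lines. This is a sketch of my own convention, not a KairosRoute API: the routing policy name lives in an environment variable, and application code only ever asks for it at call time.

```python
import os

def current_policy(default: str = "kr-auto") -> str:
    # Read at call time, not at import time, so changing the env var
    # (or a live config reload) swaps models without touching app code.
    return os.environ.get("MODEL_POLICY", default)
```

Going from Sonnet to a routing policy is then a one-line change in your deployment config, and rolling back is just as cheap.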
The second less-obvious reason: usage dashboards and receipts. When a user emails you at 9pm saying "it got slow," you need to be able to see which model served their request, what the latency was, and what fallback triggered if any. Without a gateway dashboard, that is a grep through logs. With one, it is one page.
The BYOK math
BYOK means you register your own Anthropic, OpenAI, and Groq keys with the gateway. The gateway routes traffic to them; the providers bill you directly. The gateway takes zero markup on the provider cost — only a small per-token gateway fee, which on the Free tier is waived up to 100K tokens/month.
For a typical indie SaaS with 50 users doing ~5 AI actions per user per day, at an average 2K tokens per action, that is 50 * 5 * 30 * 2000 = 15M tokens/month. BYOK to Groq Llama 3.1 8B at roughly $0.08/1M blended: ~$1.20/mo. BYOK to Haiku: ~$10/mo. BYOK to Sonnet: ~$45/mo. Mix those through kr-auto and the blended bill lands around $8-15/mo.
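The arithmetic above as a quick sanity check. The per-million rates are the post's rough blended numbers, not official price sheets, so treat them as estimates:

```python
users, actions_per_day, days, tokens_per_action = 50, 5, 30, 2_000
monthly_tokens = users * actions_per_day * days * tokens_per_action  # 15M tokens/mo

# Rough blended $/1M-token rates from the post; real provider pricing varies.
rates_per_million = {"groq-llama-3.1-8b": 0.08, "haiku": 0.67, "sonnet": 3.00}

for model, rate in rates_per_million.items():
    cost = monthly_tokens / 1_000_000 * rate
    print(f"{model}: ${cost:.2f}/mo")
```

Run it and the three single-model bills come out around $1.20, $10, and $45 per month, which is why a blended routing policy lands in between.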
The minimum viable caching layer
The single most effective thing I did on my last project was turn on exact-match prompt caching. Here is the thing nobody admits: indie SaaS users ask the same questions over and over. On a recipe app, "what can I make with chicken and rice" gets asked 200x a week with zero variation. On a resume tool, "rewrite this to be more concise" is every third request. Exact-match caching collapsed my token bill by 38% on week one of enabling it.
You do not need to build this. If you are routing through a gateway, it is a toggle. If you are not, you are writing a Redis wrapper around your LLM client, and you will get it wrong the first three times.
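If you do end up rolling your own, the core of an exact-match cache is tiny; the parts you get wrong are normalization, keying, and invalidation. A minimal in-process sketch (a dict standing in for Redis, `call_llm` standing in for your real client):

```python
import hashlib

_cache: dict[str, str] = {}

def cached_ask(prompt: str, model: str, call_llm) -> str:
    # Normalize whitespace so trivially different requests still hit,
    # and key on the model too, or a model swap serves stale answers.
    normalized = " ".join(prompt.split())
    key = hashlib.sha256(f"{model}:{normalized}".encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(normalized)
    return _cache[key]
```

Even this toy version is missing TTLs, cache-busting on prompt-template changes, and a policy for non-deterministic outputs, which is exactly why the gateway toggle is the better deal.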
The config I actually use
```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["KAIROSROUTE_API_KEY"],
    base_url="https://api.kairosroute.com/v1",
)

def ask(prompt: str, quality: str = "balanced") -> str:
    resp = client.chat.completions.create(
        model="auto",
        messages=[{"role": "user", "content": prompt}],
        extra_body={
            "kr_quality": quality,      # balanced | fast | best
            "kr_cache": True,           # exact-match cache
            "kr_max_cost_usd": 0.01,    # per-request budget ceiling
        },
    )
    return resp.choices[0].message.content
```

That is the whole AI layer. Model selection, caching, and per-request budget caps in one function. If quality degrades, I bump `quality` to `best`. If a request generates runaway cost, the ceiling kicks in and the gateway returns a graceful fallback. I have not written a retry loop or a fallback chain in my own code in over a year.
What happens when you cross 50 users
You do not need to replatform. You need to:
- Upgrade to the gateway's Team tier ($99/mo) once your overage charges plus quality-of-life wants add up to more than that.
- Turn on the hard-cap toggle on your API key once you have confidence in your traffic pattern, so a runaway can never cost you more than a fixed amount.
- Move Supabase to Pro ($25/mo) when you cross 500MB or need point-in-time recovery.
- Keep everything else free.
Total infra + AI bill at 200 paying users on a $19/mo product: typically under $200/mo, on about $3,800 MRR. That is the 95% gross margin math that makes indie SaaS work.
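The margin claim is worth checking rather than trusting. Using the post's own numbers (200 users on a $19/mo product, an all-in bill at the stated ~$190 ceiling):

```python
users, price = 200, 19
mrr = users * price            # $3,800 MRR
infra = 190                    # the post's "under $200/mo" all-in bill
gross_margin = (mrr - infra) / mrr
print(f"${mrr} MRR, ${infra}/mo infra -> {gross_margin:.0%} gross margin")
```

That prints a 95% gross margin, which is the number that makes a one-person SaaS viable.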
What I would skip on day one
- Vector databases. Start with pgvector in Supabase. You will not hit its limits for months.
- Orchestration frameworks. LangChain and LlamaIndex are not indie-hacker tools. Write the 40-line dispatcher yourself.
- Your own eval harness. Use the gateway's built-in shadow eval when you have traffic. Before that, you have nothing to measure.
- Self-hosted anything. You do not have the time. Pay for managed or use a free tier.
The throughline
A one-person SaaS wins on three things: speed, focus, and cost discipline. The router is the wedge that gives you cost discipline from day zero without adding complexity — one SDK, one dashboard, one bill. When your product starts working, you do not have to rebuild anything. You just flip a tier.
If you want the fastest possible start, the playground lets you send your first prompt and see the routing decision and cost in under two minutes. The Free tier is enough to ship your whole MVP. Come back when someone is paying you.
Already shipping? Read the founder cost-cutting playbook for the weekend-sized project that takes you from "rough" to "defensible."
Ready to route smarter?
KairosRoute gives you a single OpenAI-compatible endpoint that routes every request to the cheapest model meeting your quality bar — plus the observability, A/B testing, and cost analytics that turn cheaper infrastructure into a durable margin.
Related Reading
A seed-stage founder walked into a board meeting with an $80K/mo AI bill eating 40% of runway. Three days later, the number was $30K. The router was the easy part. Here is the full play.
Our Business tier is $499/month. Our Scale tier is $1,499/month. Our Enterprise tier starts at $25K ACV. Are those prices fair for what you get? This post is the real accounting — including a fully transparent 4% managed-key gateway fee.
Most RAG pipelines run every stage on the same frontier model. That is the single biggest cost leak in production AI. Here is the stage-by-stage model selection pattern, with a concrete per-query cost breakdown.