AI SaaS gross margin: the complete guide

Flat subscriptions, variable token costs: why AI products break the usual SaaS margin math — and how to get it right, customer by customer.

9 min read · Updated 2026-06-16

Classic SaaS has a beautiful property: once the software is built, serving one more customer costs almost nothing. Gross margins of 75–85% are normal, and your average margin is a fair summary of the business. AI products break that assumption. Every request burns LLM tokens you pay for, so each customer carries a real, variable cost — and on a flat monthly price, that cost varies wildly from one customer to the next.

This guide walks through how gross margin actually works for an AI SaaS: how to define and calculate it per customer, what a healthy number looks like, where your LLM bill comes from, why your average hides the truth, and how to keep margin from slipping as you grow.

Why AI SaaS margins are different

On a flat $49/mo plan, a light user who sends a few requests a week might cost you $3 in tokens, while a power user running an agent all day costs $60. Same price, wildly different cost. In traditional SaaS the power user is still profitable; in AI SaaS that same customer can pay you $49 and cost you more than that. Your revenue per customer is fixed, but your cost per customer is not, and that gap is where AI margins live or die.

Industry surveys back this up: where mature software targets 70–80% gross margin, AI-heavy products often run well below that — sometimes near 50% — and a large share of AI companies don't track their LLM cost per customer at all. That combination (lower margins, less visibility) is exactly how a growing product can scale itself into losses.

Gross margin per customer, defined

Gross margin per customer is what a customer pays you minus what that customer costs you to serve, over the same period. For an AI product the dominant variable cost is LLM tokens, so: margin = subscription revenue − (LLM cost + any other usage-based costs, such as vector search or tool calls). Express it as a percentage of revenue and you can compare customers and plans on equal footing.

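Gross margin per customer = plan price − (input tokens × input rate + output tokens × output rate), with token counts and rates in the same unit (providers usually quote rates per million tokens). Divide the result by the plan price to get the percentage: a customer on a $49 plan costing $12 in tokens has a 76% margin ($37 ÷ $49); one costing $61 is at −24% (−$12 ÷ $49).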
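
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a billing implementation: the function names and the example rates are assumptions, and rates are taken to be quoted in dollars per million tokens.

```python
def llm_cost(input_tokens: int, output_tokens: int,
             input_rate_per_mtok: float, output_rate_per_mtok: float) -> float:
    """Dollar cost of a customer's tokens, with rates quoted per million tokens."""
    return (input_tokens * input_rate_per_mtok
            + output_tokens * output_rate_per_mtok) / 1_000_000


def gross_margin(plan_price: float, cost: float) -> tuple[float, float]:
    """Return (margin in dollars, margin as a percentage of the plan price)."""
    margin = plan_price - cost
    return margin, 100 * margin / plan_price


# The two customers from the text: same $49 plan, $12 vs $61 in token cost.
for cost in (12.0, 61.0):
    dollars, pct = gross_margin(49.0, cost)
    print(f"cost ${cost:.0f}: margin ${dollars:.0f} ({pct:.0f}%)")
```

Keeping the cost calculation separate from the margin calculation makes it easy to add other usage-based costs to the cost figure later without touching the margin logic.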

What counts as a good gross margin?

There's no single number, but useful anchors: 70%+ is healthy and gives you room to fund sales, support and R&D; 50–70% is workable if your pricing and growth are efficient; below 50% means the model itself is eating your business and pricing or model choice needs to change. The catch is that these are blended numbers — and for AI products the blend hides more than it reveals.

Where your LLM cost comes from

Providers bill per million tokens (per Mtok), with separate rates for input (what you send: prompts, context, retrieved documents) and output (what the model generates). Output is typically 3–5× the price of input, so generation-heavy features cost far more than their token counts suggest. The model you pick sets the rate: a cheap model like GPT-4o mini or DeepSeek leaves huge headroom, while a reasoning model can cost 10–50× more for the same task.
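
To make the input/output split concrete, here is a small sketch that prices the same monthly workload on two rate cards. The rates are hypothetical placeholders, not any provider's actual prices; they only assume output costs 4× input and that the second model is 20× the first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateCard:
    """Prices in dollars per million tokens (hypothetical values below)."""
    input_per_mtok: float
    output_per_mtok: float


def workload_cost(rates: RateCard, input_tokens: int,
                  output_tokens: int) -> tuple[float, float]:
    """Return (input cost, output cost) in dollars for a token workload."""
    input_cost = input_tokens / 1_000_000 * rates.input_per_mtok
    output_cost = output_tokens / 1_000_000 * rates.output_per_mtok
    return input_cost, output_cost


# Assumed rate cards, for illustration only.
cheap_model = RateCard(input_per_mtok=0.50, output_per_mtok=2.00)
reasoning_model = RateCard(input_per_mtok=10.00, output_per_mtok=40.00)

# One customer's month: 3M input tokens, 1M output tokens.
for name, rates in (("cheap", cheap_model), ("reasoning", reasoning_model)):
    input_cost, output_cost = workload_cost(rates, 3_000_000, 1_000_000)
    total = input_cost + output_cost
    share = 100 * output_cost / total
    print(f"{name}: ${total:.2f} total, output is {share:.0f}% of the bill")
```

In this example output is only a quarter of the tokens but more than half of the cost, which is the effect the paragraph above describes.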


Why your average margin lies

Blended margin (total revenue minus total cost, as a share of total revenue) is the number on most dashboards. It's also the most dangerous one for an AI product, because a healthy-looking 65% blend can hide a tail of customers who are individually unprofitable. As you grow, that tail grows too: if new signups have the same usage mix, more of them don't dilute the problem, they multiply it. The only way to see it is to compute margin per customer and look at the worst ones, not the average.
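
A toy example shows how the blend hides the tail. The five accounts and their costs are made up for illustration; everything else follows the definitions earlier in this guide.

```python
# Hypothetical accounts on a flat $49 plan: name -> (revenue, LLM cost).
accounts = {
    "acct_a": (49.0, 3.0),
    "acct_b": (49.0, 5.0),
    "acct_c": (49.0, 8.0),
    "acct_d": (49.0, 12.0),
    "acct_e": (49.0, 60.0),  # the power user
}


def margin_pct(revenue: float, cost: float) -> float:
    """Gross margin as a percentage of revenue."""
    return 100 * (revenue - cost) / revenue


def blended_margin_pct(accts: dict) -> float:
    """Margin computed on summed revenue and summed cost."""
    total_revenue = sum(revenue for revenue, _ in accts.values())
    total_cost = sum(cost for _, cost in accts.values())
    return margin_pct(total_revenue, total_cost)


def worst_first(accts: dict) -> list:
    """Per-customer margins, sorted so the least profitable come first."""
    rows = [(name, margin_pct(rev, cost)) for name, (rev, cost) in accts.items()]
    return sorted(rows, key=lambda row: row[1])


print(f"blended: {blended_margin_pct(accounts):.0f}%")
for name, pct in worst_first(accounts):
    print(f"{name}: {pct:.0f}%")
```

The blend here comes out around 64%, which looks workable, while the sorted list puts one account at roughly −22% at the top.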

This is the single most common AI-SaaS pricing mistake: founders watch revenue (which Stripe shows them) and never see cost per customer (which Stripe can't show them). By the time the model bill spikes, several of your 'best' accounts have been losing you money for months.

How to reduce LLM cost per customer

Once you can see per-customer cost, the levers are concrete: trim and cache input context, cap output length (with a max-token limit or stop sequences), route easy requests to a cheaper model and reserve the expensive one for hard tasks, and cache or deduplicate repeated calls. Trimming and caching cut the input side of the bill, output caps cut the generation side, and routing lowers the rate you pay on both. Because output is the priciest part, controlling generation usually pays back fastest.
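
As a rough sketch of two of these levers, the snippet below estimates a month's bill with and without model routing and an output cap. The rates, the request mix, and the `is_hard` flag are all assumptions for illustration; how you decide that a request is "hard" is up to your application.

```python
# Hypothetical (input, output) rates in dollars per million tokens.
CHEAP = (0.50, 2.00)
EXPENSIVE = (10.00, 40.00)


def request_cost(rates, input_tokens, output_tokens):
    """Dollar cost of one request at the given per-Mtok rates."""
    input_rate, output_rate = rates
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def total_cost(requests, route=True, max_output_tokens=None):
    """Sum the cost of (input_tokens, output_tokens, is_hard) requests.

    route=False sends everything to the expensive model.
    max_output_tokens, if set, caps the billed output per request.
    """
    total = 0.0
    for input_tokens, output_tokens, is_hard in requests:
        if max_output_tokens is not None:
            output_tokens = min(output_tokens, max_output_tokens)
        rates = EXPENSIVE if (is_hard or not route) else CHEAP
        total += request_cost(rates, input_tokens, output_tokens)
    return total


# Assumed mix: 900 easy requests and 100 hard ones.
requests = [(2_000, 500, False)] * 900 + [(4_000, 1_000, True)] * 100

print(f"no routing:   ${total_cost(requests, route=False):.2f}")
print(f"with routing: ${total_cost(requests):.2f}")
print(f"routing + cap: ${total_cost(requests, max_output_tokens=600):.2f}")
```

With these made-up numbers, routing alone takes the bill from about $44 to about $9.80, because the 900 easy requests no longer pay the expensive rate. The cap only affects the hard requests here, since the easy ones already stay under 600 output tokens.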

How to track it

Tracking gross margin per customer means joining two data sources you already have: your revenue (Stripe) and your LLM cost (Langfuse, OpenRouter, or your own usage logs). Matched per customer, they give you margin per account, per plan, and a blended view that no longer hides the losers. Set an alert when a customer crosses into the red and you catch the problem while it's still a pricing tweak, not a crisis.
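
The join itself can be sketched without any particular API. The example below assumes you have already exported revenue rows and cost rows as (customer ID, amount) pairs that share the same customer ID; it does not call Stripe, Langfuse, or OpenRouter, and the function names are illustrative.

```python
from collections import defaultdict


def margin_report(revenue_rows, cost_rows):
    """Join revenue and LLM cost per customer for one billing period.

    Both inputs are iterables of (customer_id, dollars). A customer may
    appear many times in either source; amounts are summed.
    """
    revenue = defaultdict(float)
    cost = defaultdict(float)
    for customer_id, amount in revenue_rows:
        revenue[customer_id] += amount
    for customer_id, amount in cost_rows:
        cost[customer_id] += amount

    report = {}
    for customer_id in revenue.keys() | cost.keys():
        rev = revenue.get(customer_id, 0.0)
        spent = cost.get(customer_id, 0.0)
        report[customer_id] = {
            "revenue": rev,
            "cost": spent,
            "margin": rev - spent,
            # Percentage is undefined for customers with no revenue.
            "margin_pct": 100 * (rev - spent) / rev if rev else None,
        }
    return report


def in_the_red(report):
    """Customer IDs whose cost exceeds their revenue, sorted by ID."""
    return sorted(cid for cid, row in report.items() if row["margin"] < 0)


# Made-up rows for illustration.
revenue_rows = [("cus_1", 49.0), ("cus_2", 49.0)]
cost_rows = [("cus_1", 5.0), ("cus_1", 7.0), ("cus_2", 61.0), ("cus_3", 2.0)]

report = margin_report(revenue_rows, cost_rows)
print(in_the_red(report))
```

Taking the union of customer IDs, rather than only those with revenue, surfaces accounts that incur cost but pay nothing (free trials, for example), which an inner join would silently drop.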

MarginWard does exactly this join — read-only Stripe key plus your LLM cost source — and flags unprofitable customers automatically. Or sanity-check a single customer first with the free calculator, no signup.
