Input vs output tokens

The two halves of an LLM bill, usually priced very differently.

Input tokens are what you send (prompt, context, retrieved docs); output tokens are what the model generates. Output is typically 3–5× the price of input, so output-heavy features cost far more than their token count suggests.

Why it matters

Because output usually costs several times more than input, the split between the two decides where your money actually goes, and where to focus when you need to cut cost without hurting the product.

Example

On GPT-4o, 1M input tokens cost $2.50 while 1M output tokens cost $10, so every generated token costs four times as much as an input token. A feature that generates a lot of text is therefore far costlier than one that spends the same token count on input.
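To make the split concrete, here is a minimal sketch of the arithmetic for a single request. It assumes the per-1M-token prices quoted in the example; the function name `request_cost` and the token counts in the usage lines are hypothetical, chosen only for illustration.

```python
# Prices in USD per 1M tokens, taken from the example above.
# These are assumptions for illustration; check your provider's current pricing.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00


def request_cost(input_tokens: int, output_tokens: int) -> dict:
    """Return the input cost, output cost, total, and output's share of the total."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    total = input_cost + output_cost
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total": total,
        # Fraction of the bill that comes from generated tokens.
        "output_share": output_cost / total if total else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical request: 2,000 tokens of prompt and context, 500 tokens generated.
    cost = request_cost(input_tokens=2_000, output_tokens=500)
    print(f"input:  ${cost['input_cost']:.4f}")
    print(f"output: ${cost['output_cost']:.4f}")
    print(f"output share of cost: {cost['output_share']:.0%}")
```

In this hypothetical request, output makes up only 20% of the tokens (500 of 2,500) but about half of the cost, which is why trimming generated length often saves more than trimming the prompt.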


Related terms