Requests / month

How many model calls per month.

Cached input %

0% of input tokens reused (system prompts, RAG context). Major savings on OpenAI / Anthropic / Gemini.

Avg input tokens / request

~750 tokens ≈ 1 page of text.

Avg output tokens / request

Output is typically 3-6× more expensive than input.

Cheapest$18.00/moGPT-4o mini

Most expensive$2,100/moClaude Opus 4

Spread116.7×cheapest → priciest

Model	Vendor	Input	Output	Total / mo
GPT-4o mini	OpenAI	$6.000	$12.00	$18.00
DeepSeek V3	DeepSeek	$10.80	$22.00	$32.80
Llama 3.1 70B	Open / Hosted	$23.60	$15.80	$39.40
Gemini 2.5 Flash	Google	$12.00	$50.00	$62.00
Claude Haiku 4	Anthropic	$32.00	$80.00	$112.00
Mistral Large	Mistral	$80.00	$120.00	$200.00
GPT-4.1	OpenAI	$80.00	$160.00	$240.00
Gemini 2.5 Pro	Google	$50.00	$200.00	$250.00
GPT-4o	OpenAI	$100.00	$200.00	$300.00
Claude Sonnet 4	Anthropic	$120.00	$300.00	$420.00
o3 (reasoning)	OpenAI	$400.00	$800.00	$1,200
Claude Opus 4	Anthropic	$600.00	$1,500	$2,100

Prices reflect public list prices and change frequently. Volume discounts, batch APIs (50% off on OpenAI/Anthropic), and prompt caching can drop your real cost meaningfully.

AI Cost Calculator

Free tool today. Custom AI on Friday.