Cheapest LLM APIs 2026: Low-Cost AI Models Ranked by Workload
Discover the cheapest LLM APIs in 2026: blended input/output cost, batch discounts, and prompt caching. Compare budget AI models for startups and agencies in the US, Canada, and Australia.
Lowest estimated monthly API cost for the same workload in 2026
The cheapest tab ranks models by estimated spend for your exact monthly requests and token pattern, including batch and cached pricing only when our database confirms eligibility. Founders and agencies in the United States, Canada, and Australia use it to protect margins on high-volume chat, summarization, and RAG pipelines without guessing list prices from blog posts.
Workload & pricing toggles
Same three scenarios as the main AI API calculator: moderate traffic, large RAG-style context, or per-request max tokens with a lower request count.
Include Vision / Image Processing
Off — no image fees in cost estimates for vision-capable models.
Turn On to include image fees.
Use Cached Pricing
Enable for 50% off input tokens where cached rates apply.
Deep Reasoning / Thinking Mode
When enabled, the model's hidden reasoning (extended thinking) tokens are charged like output tokens.
Batch Pricing
Enable for 50% off input and output tokens where batch/async pricing applies.
Cached and batch estimated monthly values change only after the pipeline sets supports_caching or supports_batch for a model in Supabase. Until then, the toggles here only narrow the table to models whose catalog or provider typically supports those modes.
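As a rough illustration of how the toggles above could combine into one estimate, here is a minimal sketch. It is not the calculator's actual formula: the `Model` fields other than `supports_caching` and `supports_batch`, the per-million-token rates, the function name, and the choice to stack the two 50% discounts multiplicatively are all assumptions made for the example. It also treats every input token as cache-eligible, which real providers generally do not.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    input_per_mtok: float    # USD per 1M input tokens (hypothetical field)
    output_per_mtok: float   # USD per 1M output tokens (hypothetical field)
    supports_caching: bool = False  # flag named in the text above
    supports_batch: bool = False    # flag named in the text above


def est_monthly(model, requests, in_tok, out_tok,
                reasoning_tok=0, use_cached=False, use_batch=False):
    """Estimate monthly spend in USD for one workload (illustrative only)."""
    in_rate = model.input_per_mtok
    out_rate = model.output_per_mtok

    # A discount applies only when the toggle is on AND eligibility is confirmed.
    if use_cached and model.supports_caching:
        in_rate *= 0.5            # 50% off input tokens
    if use_batch and model.supports_batch:
        in_rate *= 0.5            # 50% off input (assumed to stack with caching)
        out_rate *= 0.5           # 50% off output

    # Reasoning / thinking tokens are billed like output tokens.
    billed_out = out_tok + reasoning_tok
    per_request = (in_tok * in_rate + billed_out * out_rate) / 1_000_000
    return requests * per_request


demo = Model("demo-model", input_per_mtok=1.0, output_per_mtok=2.0,
             supports_caching=True)
base = est_monthly(demo, requests=1000, in_tok=1000, out_tok=500)
cached = est_monthly(demo, requests=1000, in_tok=1000, out_tok=500,
                     use_cached=True)
print(f"base ${base:.2f}, cached ${cached:.2f}")
```

Because `demo` does not set `supports_batch`, turning on `use_batch` leaves its estimate unchanged, which mirrors the rule that discounted values appear only once eligibility is confirmed.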
Magic quadrant (top 15)
X: est. monthly · Y: Cheapest (est. monthly) · Dot: provider color
Full leaderboard
Showing 48 of 327 models.
| Model | Est. monthly | ROI score | Coding | Reasoning | Speed | Math | Context | Overall |
|---|---|---|---|---|---|---|---|---|
| Free Models Router | Free | 59 | 43 | 55 | 85 | 55 | 200K | 52 | |
| Pareto Code Router | VARIABLE | 74 | 88 | 73 | 85 | 80 | 200K | 78 | |
| Elephant | Free | 78 | 90 | 83 | 70 | 88 | 262K | 86 | |
| Google: Lyria 3 Pro Preview | Free | 30 | 0 | 0 | 0 | 0 | 1.0M | 0 | |
| Body Builder (beta) | VARIABLE | 30 | 0 | 0 | 0 | 0 | 128K | 0 | |
| Auto Router | VARIABLE | 77 | 84 | 83 | 70 | 86 | 2.0M | 84 | |
| Google: Lyria 3 Clip Preview | Free | 6 | 0 | 0 | 70 | 0 | 1.0M | 0 | |
| Mistral: Mistral Nemo | $0.70 | 9 | 0 | 0 | 0 | 0 | 131K | 0 | |
| Meta: Llama 3.1 8B Instruct | $1.30 | 59 | 0 | 34 | 85 | 85 | 16K | 38 | |
| Meta: Llama 3 8B Instruct | $1.60 | 68 | 62 | 69 | 85 | 30 | 8K | 58 | |
| IBM: Granite 4.0 Micro | $1.78 | 78 | 81 | 76 | 85 | 81 | 131K | 78 | |
| Sao10K: Llama 3 8B Lunaris | $2.10 | 6 | 0 | 0 | 0 | 0 | 8K | 0 | |
| LiquidAI: LFM2-24B-A2B | $2.40 | 43 | 0 | 44 | 97 | 0 | 33K | 22 | |
| Google: Gemma 3 4B | $2.40 | 6 | 0 | 0 | 0 | 0 | 131K | 0 | |
| OpenAI: gpt-oss-20b | $2.60 | 84 | 96 | 97 | 85 | 98 | 131K | 97 | |
| Qwen: Qwen2.5 7B Instruct | $2.60 | 6 | 0 | 0 | 0 | 0 | 33K | 0 | |
| Qwen: Qwen-Turbo | $2.60 | 6 | 0 | 0 | 0 | 0 | 131K | 0 | |
| Mistral: Mistral Small 3 | $2.80 | 6 | 0 | 0 | 0 | 0 | 33K | 0 | |
| Amazon: Nova Micro 1.0 | $2.80 | 72 | 69 | 82 | 95 | 69 | 128K | 76 | |
| Google: Gemma 3 12B | $2.90 | 6 | 0 | 0 | 0 | 0 | 131K | 0 | |
| Cohere: Command R7B (12-2024) | $3.00 | 6 | 0 | 0 | 0 | 0 | 128K | 0 | |
| MythoMax 13B | $3.00 | 6 | 0 | 0 | 0 | 0 | 4K | 0 | |
| Meta: Llama 3.2 1B Instruct | $3.08 | 6 | 0 | 0 | 0 | 0 | 60K | 0 | |
| NVIDIA: Nemotron Nano 9B V2 | $3.20 | 75 | 72 | 84 | 85 | 98 | 131K | 85 | |
| Arcee AI: Trinity Mini | $3.30 | 73 | 82 | 80 | 85 | 80 | 131K | 81 | |
| OpenAI: gpt-oss-120b | $3.46 | 39 | 0 | 45 | 55 | 0 | 131K | 23 | |
| Google: Gemma 3n 4B | $3.60 | 5 | 0 | 0 | 0 | 0 | 33K | 0 | |
| Qwen: Qwen3 235B A22B Instruct 2507 | $3.84 | 70 | 73 | 83 | 55 | 78 | 262K | 79 | |
| NVIDIA: Nemotron 3 Nano 30B A3B | $4.00 | 71 | 74 | 79 | 85 | 91 | 262K | 81 | |
| Microsoft: Phi 4 | $4.00 | 62 | 70 | 63 | 85 | 68 | 16K | 66 | |
| Qwen: Qwen3 14B | $4.80 | 59 | 52 | 72 | 70 | 58 | 41K | 63 | |
| Amazon: Nova Lite 1.0 | $4.80 | 5 | 0 | 0 | 0 | 0 | 300K | 0 | |
| Google: Gemma 3 27B | $4.80 | 5 | 0 | 0 | 0 | 0 | 131K | 0 | |
| Mistral: Ministral 3 3B 2512 | $5.00 | 31 | 0 | 28 | 85 | 0 | 131K | 14 | |
| Mistral: Mistral Small 3.2 24B | $5.00 | 69 | 92 | 83 | 85 | 69 | 128K | 82 | |
| Reka Edge | $5.00 | 5 | 0 | 0 | 0 | 0 | 16K | 0 | |
| Z.ai: GLM 4 32B | $5.00 | 5 | 0 | 0 | 0 | 0 | 128K | 0 | |
| Qwen: Qwen3.5-Flash | $5.20 | 23 | 0 | 0 | 90 | 0 | 1.0M | 0 | |
| Meta: Llama 3.2 3B Instruct | $5.44 | 4 | 0 | 0 | 0 | 0 | 80K | 0 | |
| Qwen: Qwen3.5-9B | $5.50 | 68 | 65 | 87 | 70 | 83 | 262K | 80 | |
| Qwen: Qwen3 Coder 30B A3B Instruct | $5.50 | 47 | 52 | 44 | 55 | 31 | 160K | 43 | |
| Qwen: Qwen3 32B | $5.60 | 73 | 85 | 89 | 60 | 92 | 41K | 89 | |
| Baidu: ERNIE 4.5 21B A3B | $5.60 | 72 | 85 | 89 | 60 | 87 | 120K | 88 | |
| Baidu: ERNIE 4.5 21B A3B Thinking | $5.60 | 4 | 0 | 0 | 0 | 0 | 131K | 0 | |
| Google: Gemma 4 26B A4B | $5.70 | 4 | 0 | 0 | 0 | 0 | 262K | 0 | |
| ByteDance Seed: Seed 1.6 Flash | $6.00 | 66 | 87 | 76 | 85 | 77 | 262K | 79 | |
| OpenAI: gpt-oss-safeguard-20b | $6.00 | 61 | 78 | 70 | 55 | 60 | 131K | 70 | |
| Google: Gemini 2.0 Flash Lite | $6.00 | 4 | 0 | 0 | 0 | 0 | 1.0M | 0 |
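Many of the cheapest rows carry an Overall score of 0 (the last populated value in each row), so sorting by price alone can surface models with no usable quality signal. The sketch below shows one way to pick the cheapest model above a quality floor. The five rows are copied from the table above; the helper name and the idea of a minimum Overall threshold are assumptions for illustration, not a feature of the leaderboard.

```python
# (model, estimated monthly USD, Overall score) copied from the table above.
rows = [
    ("Mistral: Mistral Nemo", 0.70, 0),
    ("Meta: Llama 3.1 8B Instruct", 1.30, 38),
    ("Meta: Llama 3 8B Instruct", 1.60, 58),
    ("IBM: Granite 4.0 Micro", 1.78, 78),
    ("OpenAI: gpt-oss-20b", 2.60, 97),
]


def cheapest_with_min_overall(rows, min_overall):
    """Return the lowest-cost row whose Overall score meets the floor, or None."""
    eligible = [r for r in rows if r[2] >= min_overall]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r[1])


pick = cheapest_with_min_overall(rows, min_overall=70)
print(pick)  # ('IBM: Granite 4.0 Micro', 1.78, 78)
```

With a floor of 70, the pick moves from the $0.70 row to the $1.78 row. That is a small price difference for this workload, but a large difference in reported scores.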