The “which LLM provider?” question comes up in every project kickoff. The honest answer in 2026: pick by workload, hedge with a gateway, and expect to use more than one provider. This post is the practical comparison.
Models worth knowing
| Model | Provider | Best for |
|---|---|---|
| Claude Opus 4.7 | Anthropic | Coding, agents, reasoning, long context |
| Claude Sonnet 4.6 | Anthropic | Workhorse: RAG, tool use, chat |
| Claude Haiku 4.5 | Anthropic | Classification, extraction, high-volume |
| GPT-5 | OpenAI | Multimodal, tool use, frontier reasoning |
| GPT-5 mini | OpenAI | Fast, cheap workhorse |
| GPT-5 nano | OpenAI | Classification, embeddings, very high volume |
| Gemini 2.5 Pro | Google | Long context (1M+), video, multimodal |
| Gemini 2.5 Flash | Google | Fast, cheap, capable |
These shift quarter-to-quarter. Run evals on your data; pick by results.
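As a rough illustration of "run evals on your data", here is a minimal sketch of a comparison harness. The model callables are hypothetical stand-ins for whatever client you use per provider, and exact-match scoring is an assumption; swap in the metric that fits your task.

```python
from typing import Callable, Dict, List, Tuple

# A labelled example: (prompt, expected answer).
Example = Tuple[str, str]


def accuracy(call_model: Callable[[str], str], dataset: List[Example]) -> float:
    """Fraction of examples whose output matches the label (case-insensitive)."""
    if not dataset:
        return 0.0
    hits = sum(
        1
        for prompt, expected in dataset
        if call_model(prompt).strip().lower() == expected.strip().lower()
    )
    return hits / len(dataset)


def rank_models(
    models: Dict[str, Callable[[str], str]], dataset: List[Example]
) -> List[Tuple[str, float]]:
    """Score every model on the same dataset and return them best first."""
    scores = [(name, accuracy(fn, dataset)) for name, fn in models.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Stub "models" so the sketch runs without network access or API keys.
    dataset = [("2+2?", "4"), ("Capital of France?", "Paris")]
    models = {
        "stub-a": lambda p: "4" if "2+2" in p else "Paris",
        "stub-b": lambda p: "4",
    }
    print(rank_models(models, dataset))
```

In practice each callable would wrap a real provider SDK call, and the dataset would come from your own logs or labelled samples.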
By workload
Coding agents
Anthropic's Claude Opus 4.7 wins consistently: its agentic loops, tool calling, and code reasoning are noticeably better. Claude models are used by Cursor, Claude Code, Cognition, and Codeium.
See Cursor vs Windsurf vs Claude Code.
RAG and knowledge Q&A
Claude Sonnet 4.6 or GPT-5 mini for cost-effectiveness; Opus 4.7 when reasoning over retrieved context matters. Both providers cache prompts well, and both ship structured outputs.
Multimodal (images, video)
GPT-5 for general images. Gemini 2.5 Pro for video and very long documents.
Long context (>500k tokens)
Gemini 2.5 Pro (2M token context — though attention quality drops past 500k). Claude Opus 4.7 at 1M.
See 1M-token context patterns.
Voice agents
Realtime APIs from each provider are competitive. OpenAI’s Realtime is slightly more mature; Gemini Live wins on multimodal-with-video. See Voice Agents and Realtime LLM APIs.
Classification / extraction (high volume)
Claude Haiku 4.5 or GPT-5 nano. Both cheap; both do structured output well; pick by which is faster on your eval.
Edge / on-device
None of the hosted providers fit here. Use self-hosted Llama 3.3 / Qwen 2.5.
Pricing (rough 2026)
Per 1M tokens (input / output):
| Model | Input | Output |
|---|---|---|
| Claude Opus 4.7 | $15 | $75 |
| Claude Sonnet 4.6 | $3 | $15 |
| Claude Haiku 4.5 | $1 | $5 |
| GPT-5 | $20 | $80 |
| GPT-5 mini | $0.50 | $2 |
| GPT-5 nano | $0.10 | $0.40 |
| Gemini 2.5 Pro | $1.25 | $10 |
| Gemini 2.5 Flash | $0.30 | $2.50 |
With caching, cached input tokens cost roughly a tenth of the listed input price. See LLM Prompt Caching.
For volume cost optimization see LLM Cost Optimization and LLM Routing.
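To make the pricing arithmetic concrete, here is a small sketch that estimates per-request cost from the table above. The dictionary keys are just the table's model labels, not API model IDs, and the flat 10x cache discount is the rough rule of thumb from this post; it ignores any cache-write surcharge a provider may apply.

```python
# USD per 1M tokens as (input, output), copied from the pricing table above.
PRICES = {
    "Claude Opus 4.7": (15.00, 75.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5": (20.00, 80.00),
    "GPT-5 mini": (0.50, 2.00),
    "GPT-5 nano": (0.10, 0.40),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
}


def request_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    cached_fraction: float = 0.0,
    cache_discount: float = 10.0,
) -> float:
    """Estimated USD cost of one request.

    cached_fraction is the share of input tokens served from the prompt cache;
    those tokens are billed at input_price / cache_discount (an assumption).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    in_price, out_price = PRICES[model]
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    input_cost = (fresh * in_price + cached * in_price / cache_discount) / 1_000_000
    output_cost = output_tokens * out_price / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    cold = request_cost("Claude Sonnet 4.6", 10_000, 500)
    warm = request_cost("Claude Sonnet 4.6", 10_000, 500, cached_fraction=0.9)
    print(f"uncached: ${cold:.4f}  90% cached: ${warm:.4f}")
```

For a 10,000-token prompt with a 500-token reply on Sonnet 4.6, this works out to about $0.0375 uncached and about $0.0132 with 90% of the input cached. Output tokens are unaffected by caching, so they dominate once the cache is warm.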
Latency
Order-of-magnitude p50 for short messages:
- GPT-5 nano: 200–400ms.
- Haiku 4.5: 250–500ms.
- Gemini 2.5 Flash: 300–600ms.
- Sonnet 4.6 / GPT-5 mini: 500–1000ms.
- Opus 4.7 / GPT-5: 1–3s.
Streaming TTFT (time to first token) is what users feel.
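A minimal sketch of measuring TTFT separately from total latency. It works on any iterable of text chunks; `fake_stream` is a hypothetical stand-in for a provider's streaming response, so the sketch runs without an API key.

```python
import time
from typing import Iterable, Iterator, List, Tuple


def measure_stream(chunks: Iterable[str]) -> Tuple[float, float, str]:
    """Consume a stream; return (ttft_seconds, total_seconds, full_text)."""
    start = time.perf_counter()
    ttft = None
    parts: List[str] = []
    for chunk in chunks:
        if ttft is None:
            # First chunk arrived: this is the latency the user perceives.
            ttft = time.perf_counter() - start
        parts.append(chunk)
    total = time.perf_counter() - start
    # An empty stream has no first token; fall back to the total time.
    return (ttft if ttft is not None else total), total, "".join(parts)


def fake_stream(first_delay: float, gap: float, tokens: List[str]) -> Iterator[str]:
    """Simulated stream: wait first_delay, then emit tokens gap seconds apart."""
    time.sleep(first_delay)
    for i, tok in enumerate(tokens):
        if i:
            time.sleep(gap)
        yield tok


if __name__ == "__main__":
    ttft, total, text = measure_stream(fake_stream(0.3, 0.05, ["Hel", "lo", "!"]))
    print(f"TTFT {ttft * 1000:.0f} ms, total {total * 1000:.0f} ms, text {text!r}")
```

Start the timer immediately before the request is sent, and track TTFT percentiles per model alongside total latency.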
Ecosystem
| | OpenAI | Anthropic | Google |
|---|---|---|---|
| SDKs | Excellent across languages | Excellent | Good |
| Tool use | Mature | Mature, tight | Mature |
| Vision | Excellent | Excellent | Excellent |
| Realtime API | Mature | Rolling out | Mature (Gemini Live) |
| Batch API | Yes (50% off) | Yes (50% off) | Yes |
| Caching | Auto | Explicit markers | Explicit |
| Fine-tuning | Yes | Limited | Yes |
| MCP support | Indirect | First-party | Indirect |
For MCP support Anthropic leads.
Don’t pick one — use a gateway
For production, put your app behind an AI gateway:
- Fallback when one provider has an outage.
- Route by task (Haiku for classification, Opus for reasoning).
- Cost tracking per feature.
- Prompt caching across providers.
Single-provider apps are fragile.
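The fallback and route-by-task ideas can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular gateway works: each provider is a hypothetical callable that takes a prompt and returns text, and a real gateway would add retries, timeouts, and cost tracking on top.

```python
from typing import Callable, Dict, List

# Hypothetical provider interface: prompt in, completion text out.
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain errored."""


def complete_with_fallback(prompt: str, providers: List[Provider]) -> str:
    """Try providers in order and return the first successful completion."""
    errors: List[Exception] = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # outage, rate limit, timeout, ...
            errors.append(exc)
    raise AllProvidersFailed(f"{len(errors)} provider(s) failed: {errors}")


def route(
    task: str,
    prompt: str,
    routes: Dict[str, List[Provider]],
    default: List[Provider],
) -> str:
    """Pick the provider chain for a task type, then call it with fallback."""
    return complete_with_fallback(prompt, routes.get(task, default))


if __name__ == "__main__":
    def cheap_model(prompt: str) -> str:
        raise TimeoutError("simulated outage")

    def backup_model(prompt: str) -> str:
        return f"backup handled: {prompt}"

    routes = {"classification": [cheap_model, backup_model]}
    print(route("classification", "spam or not?", routes, default=[backup_model]))
```

Each chain lists providers in preference order, so "Haiku for classification, Opus for reasoning" becomes two entries in `routes`, each with its own fallback.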
What I’d ship today
For a new AI product:
- Anthropic as primary (Sonnet for default, Opus for hard, Haiku for cheap).
- OpenAI as fallback via LiteLLM.
- Gemini when long context or video matters.
- Self-hosted Llama for privacy / cost-driven workloads.
Run evals quarterly; rebalance as models shift.
Read this next
- Anthropic Claude API + Tool Use Guide
- LLM Cost Optimization in 2026
- LLM Routing in 2026
- LLM Evaluations
If you want my multi-provider eval harness comparing Claude / GPT / Gemini on your data, it’s at rajpoot.dev.