| GPT-5.5 | OpenAI | Closed frontier | Live web data, 1M context, premium coding and reasoning | Highest cost tier; latency for web calls | API access |
| GPT-5.5-Pro | OpenAI | Closed highest-intelligence | Hardest reasoning problems, research-grade analysis | Higher latency and cost; overkill for simple tasks | API access (new) |
| GPT-5.4 | OpenAI | Closed balanced | 1M context, strong coding at lower cost than 5.5 | Not ideal for the hardest reasoning chains | API access |
| GPT-5.4 mini | OpenAI | Closed efficient | 400K context; coding, computer use, subagents at low cost | Smaller context than full 5.4; less depth on hardest tasks | API access (new) |
| Claude Opus 4.8 | Anthropic | Closed frontier | Most capable model for complex reasoning and agentic coding; 1M context, adaptive thinking, 128K output | Premium cost ($5/$25 per MTok); moderate latency | API access (new) |
| Claude Opus 4.7 | Anthropic | Closed (legacy) | Strong on agentic tasks and reasoning; 1M context with adaptive thinking | Now legacy; consider migrating to Opus 4.8 | API access |
| Claude Opus 4.6 | Anthropic | Closed (legacy) | Extended thinking, 1M context, strong reasoning | Legacy; same pricing as 4.8 but less capable | API access |
| Claude Sonnet 4.6 | Anthropic | Closed balanced | Best speed/intelligence combo; 1M context, extended thinking, $3/$15 per MTok | Not as capable as Opus on hardest tasks | API access |
| Claude Haiku 4.5 | Anthropic | Closed small | Fastest model with near-frontier intelligence; 200K context, $1/$5 per MTok | Less robust on deepest tasks; smaller context | API access |
| Gemini 3.5 Flash | Google | Closed fast (stable) | Most intelligent for agentic and coding tasks at speed | Lower quality than Pro tier on complex reasoning | API access (stable) |
| Gemini 3.1 Pro | Google | Closed enterprise (preview) | Advanced intelligence, complex problem-solving, agentic coding | Still in preview; higher latency | Preview access |
| Gemini 3 Flash | Google | Closed (preview) | Frontier-class performance at a fraction of the cost | Preview status; still maturing | Preview access |
| Gemini 2.5 Pro | Google | Closed | Reasoning and multimodal enterprise apps | Being superseded by 3.x series | API access |
| Gemma 4 E2B/E4B | Google | Open weight mobile/edge | Optimized for mobile and IoT; compute-efficient inference | Limited ceiling on complex reasoning; requires edge hardware | Model docs · Weights |
| Gemma 4 26B | Google | Open weight balanced | Advanced reasoning on personal hardware; agentic workflows | Needs good VRAM; careful quantization for consumer GPUs | Model docs · Weights |
| Gemma 4 31B | Google | Open weight high quality | Most capable Gemma for private reasoning and coding | Hardware intensive; best with 24GB+ VRAM or multi-GPU | Model docs · Weights |
| Llama 4 Maverick | Meta | Open MoE multimodal | 17B active params, 128 experts (402B total); flagship open-weight reasoning and vision | Full MoE serving requires strong infrastructure | Download · Scout |
| Llama 4 Scout | Meta | Open multimodal efficient | 17B active params, 16 experts (109B total); edge inference with vision-text support | Lower ceiling than Maverick; best for volume-optimized deployments | Download |
| Llama 3.1 405B Instruct | Meta | Open weight | Top-end open deployment quality | Heavy infrastructure requirements | Download · 70B · 8B |
| Llama 3.1 70B Instruct | Meta | Open weight | Strong self-hosted quality/cost balance | Needs good inference stack | Download · 405B · 8B |
| Llama 3.1 8B Instruct | Meta | Open weight small | Edge and low-cost deployments | Lower performance on complex tasks | Download · 70B · 405B |
| Llama 3.2 11B Vision | Meta | Open multimodal | Private vision-text pipelines | Requires evals for OCR-heavy cases | Download · 90B |
| Llama 3.2 90B Vision | Meta | Open multimodal | High-capacity multimodal inference | Infrastructure complexity | Download · 11B |
| Llama 3.3 70B Instruct | Meta | Open weight | Efficient self-hosted quality; performance comparable to 3.1 405B at much lower cost | Needs good inference stack for throughput | Download |
| Mistral Large 3 | Mistral AI | Open weight multimodal | Advanced general-purpose with open weights available | Smaller ecosystem vs hyperscalers | API + weights |
| Mistral Medium 3.5 | Mistral AI | Closed frontier | Frontier-class multimodal for agentic and coding tasks | Higher cost than smaller variants | API access |
| Mistral Small 4 | Mistral AI | Closed small | Unified instruction-following, reasoning, and coding | Limited depth on advanced reasoning | API access |
| Ministral 3 14B | Mistral AI | Open small | Best-in-class text and vision at compact size | Can trail latest closed models | API access |
| Ministral 3 8B | Mistral AI | Open compact | Efficient text and vision on consumer hardware | Lower ceiling than 14B on complex tasks | API access |
| Ministral 3 3B | Mistral AI | Open tiny | Ultra-compact with multimodal support for edge/mobile | Very limited on complex reasoning | API access |
| Devstral 2 | Mistral AI | Code-specialized open | Software engineering and code review | Narrower general language strength | API access |
| Qwen3.5-35B-A3B | Alibaba | Open weight MoE multimodal | MoE (35B total, 3B active); hybrid thinking mode; 262K native context; multimodal (text, image, video); agentic coding. SWE-bench: 73.4 | Regional compliance review required; thinking mode adds latency for simple tasks | Download · FP8 |
| Qwen3.5-27B | Alibaba | Open weight balanced | Strong reasoning at moderate size; 80K context | Regional compliance review; needs good VRAM | Model hub |
| Qwen3.5-9B | Alibaba | Open weight compact | 64K context; efficient private inference | Lower ceiling than larger Qwen3.5 variants | Model hub |
| Qwen3 32B Instruct | Alibaba | Open weight | Strong open-weight multilingual assistant quality | Regional compliance and policy review required | Model hub |
| DeepSeek V4-Pro | DeepSeek | Open/available flagship | 862B parameters; strong general reasoning and coding | Massive model requires substantial infrastructure | Download (new) |
| DeepSeek V4-Flash | DeepSeek | Open/available efficient | 158B parameters; fast general reasoning at lower cost | Governance review in enterprise; less capable than V4-Pro | Download |
| DeepSeek V3.2 | DeepSeek | Open/available | 685B parameters; mature and well-tested | Superseded by V4 series; large infrastructure needed | Download |
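
The per-MTok prices quoted in the table (input/output, in USD per million tokens) can be turned into a rough monthly budget. The sketch below is illustrative only: the function name `estimate_cost` and the example token volumes are assumptions, and the three prices are copied from the Anthropic rows above, so check them against current provider pricing before relying on the numbers.

```python
# Prices in USD per million tokens as (input, output), copied from the table above.
# Only rows that state a price are included; treat these as a snapshot.
PRICES_PER_MTOK = {
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token volumes.

    Hypothetical helper: it ignores caching, batching discounts, and
    any long-context surcharges a provider may apply.
    """
    in_price, out_price = PRICES_PER_MTOK[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 2M input tokens and 500K output tokens per month.
    for model in PRICES_PER_MTOK:
        cost = estimate_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
```

Because output tokens are priced at five times the input rate in all three rows, workloads that generate long responses shift the comparison more than workloads that mostly read long prompts.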