GPTCrunch

Compare Models

Select up to 4 models to compare benchmarks, pricing, and capabilities side by side.

o3-mini (OpenAI)
DeepSeek-R1 (DeepSeek)
Qwen2.5-Coder 32B (Alibaba/Qwen)

Benchmark      o3-mini  DeepSeek-R1  Qwen2.5-Coder 32B
MMLU           86.9     90.8         0.0
HumanEval      92.9     92.8         92.7
GSM8K          97.9     97.3         88.0
GPQA           77.0     71.5         0.0
MGSM           89.5     92.8         0.0
ARC-Challenge  96.0     97.2         0.0
HellaSwag      92.5     93.8         0.0
MATH           97.0     97.3         0.0
SWE-bench      49.3     49.2         35.0
MMMLU          83.5     87.5         0.0
LiveCodeBench  0.0      0.0          52.0

Note: a score of 0.0 appears to be a placeholder for "no score reported" rather than a measured result.
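Because the models do not all report the same benchmarks, a raw average across every row would penalize the models with gaps. The sketch below shows one way to summarize the scores: it averages only over benchmarks with a nonzero score. It assumes that 0.0 means "not reported", which the page does not state, and the helper name `summarize` is hypothetical. The page does not say how it derives its own verdicts, so this is not a reconstruction of its method.

```python
MODELS = ("o3-mini", "DeepSeek-R1", "Qwen2.5-Coder 32B")

# Scores copied from the benchmark listing, in MODELS order.
SCORES = {
    "MMLU": (86.9, 90.8, 0.0),
    "HumanEval": (92.9, 92.8, 92.7),
    "GSM8K": (97.9, 97.3, 88.0),
    "GPQA": (77.0, 71.5, 0.0),
    "MGSM": (89.5, 92.8, 0.0),
    "ARC-Challenge": (96.0, 97.2, 0.0),
    "HellaSwag": (92.5, 93.8, 0.0),
    "MATH": (97.0, 97.3, 0.0),
    "SWE-bench": (49.3, 49.2, 35.0),
    "MMMLU": (83.5, 87.5, 0.0),
    "LiveCodeBench": (0.0, 0.0, 52.0),
}


def summarize(scores, models):
    """Return {model: (reported_count, mean_of_reported_or_None)}.

    Assumption: a score of 0.0 is a placeholder for a missing value,
    so it is excluded from both the count and the mean.
    """
    summary = {}
    for idx, model in enumerate(models):
        reported = [row[idx] for row in scores.values() if row[idx] > 0.0]
        mean = sum(reported) / len(reported) if reported else None
        summary[model] = (len(reported), mean)
    return summary


summary = summarize(SCORES, MODELS)
for model, (count, mean) in summary.items():
    label = f"{mean:.2f}" if mean is not None else "n/a"
    print(f"{model}: {count} benchmarks reported, mean {label}")
```

The reported count matters as much as the mean: an average over four coding-oriented benchmarks is not directly comparable to an average over ten general ones.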

Model              Input  Output  Blended*
o3-mini            $1.10  $4.40   $2.75
DeepSeek-R1        $0.55  $2.19   $1.37
Qwen2.5-Coder 32B  $0.08  $0.08   $0.08

*Blended = average of input and output price
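The footnote's definition can be checked directly against the listed prices. The following is a minimal sketch; the function name `blended_price` is hypothetical, and the prices are copied from the pricing table in whatever unit the page uses.

```python
def blended_price(input_price: float, output_price: float) -> float:
    """Simple (unweighted) average of input and output price."""
    return (input_price + output_price) / 2


# (input, output) prices copied from the pricing table.
PRICES = {
    "o3-mini": (1.10, 4.40),
    "DeepSeek-R1": (0.55, 2.19),
    "Qwen2.5-Coder 32B": (0.08, 0.08),
}

blended = {name: blended_price(i, o) for name, (i, o) in PRICES.items()}
for name, price in blended.items():
    print(f"{name}: ${price:.2f}")
```

An unweighted average treats input and output tokens as equally common. Workloads that are dominated by long prompts or by long generations will see an effective price closer to the input or the output figure, respectively.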

Spec                          o3-mini            DeepSeek-R1              Qwen2.5-Coder 32B
Context Window                200K               128K                     128K
Max Output                    100K               8K                       8K
TTFT (time to first token)    800 ms             900 ms                   180 ms
Speed                         75 tok/s           60 tok/s                 120 tok/s
Parameters                    N/A                685B (37B active)        32B
Architecture                  Transformer + CoT  Transformer (MoE) + CoT  Transformer
Open Source                   No                 Yes                      Yes
Tier                          mid                mid                      mid
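The TTFT and Speed rows can be combined into a rough end-to-end latency estimate for a response of a given length. This is a sketch under stated assumptions, not a measurement: it assumes the listed speed is a steady generation rate that applies after the first token, and it ignores network overhead and any hidden reasoning tokens the chain-of-thought models may produce. The function name `estimated_response_seconds` and the 600-token response length are illustrative.

```python
def estimated_response_seconds(
    ttft_ms: float, tokens_per_second: float, output_tokens: int
) -> float:
    """Approximate total time: time to first token plus generation time.

    Assumption: tokens_per_second is a constant decode rate.
    """
    return ttft_ms / 1000 + output_tokens / tokens_per_second


# (TTFT in ms, speed in tok/s) copied from the spec table.
SPECS = {
    "o3-mini": (800, 75),
    "DeepSeek-R1": (900, 60),
    "Qwen2.5-Coder 32B": (180, 120),
}

OUTPUT_TOKENS = 600  # illustrative response length

estimates = {
    name: estimated_response_seconds(ttft, speed, OUTPUT_TOKENS)
    for name, (ttft, speed) in SPECS.items()
}
for name, seconds in estimates.items():
    print(f"{name}: about {seconds:.2f} s for {OUTPUT_TOKENS} tokens")
```

For longer responses the generation rate dominates and TTFT becomes a small fraction of the total, so the two rows should be read together rather than compared in isolation.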

Quick Verdict

Best Performance: DeepSeek-R1
Best Value: Qwen2.5-Coder 32B
Fastest: Qwen2.5-Coder 32B