Compare Models
Select up to 4 models to compare benchmarks, pricing, and capabilities side by side.
Selected models: GPT-4o (OpenAI), Kimi K2.5 (Moonshot AI), Llama 3.1 Nemotron 70B (NVIDIA)
| Benchmark | GPT-4o | Kimi K2.5 | Llama 3.1 Nemotron 70B |
|---|---|---|---|
| MMLU | 88.7 | 92.0 | 86.0 |
| HumanEval | 90.2 | 99.0 | 85.0 |
| GSM8K | 95.8 | 99.0 | 94.0 |
| GPQA | 53.6 | 87.6 | 50.0 |
| MGSM | 90.5 | 96.0 | 0.0 |
| ARC-Challenge | 96.7 | 0.0 | 95.0 |
| HellaSwag | 95.3 | 0.0 | 89.0 |
| MATH | 76.6 | 98.0 | 0.0 |
| SWE-bench | 38.4 | 76.8 | 0.0 |
| MMMLU | 85.1 | 0.0 | 0.0 |
| LiveCodeBench | 0.0 | 85.0 | 0.0 |
| IFEval | 0.0 | 94.0 | 92.0 |
| AIME 2025 | 0.0 | 96.1 | 0.0 |
| Model | Input | Output | Blended* |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | $6.25 |
| Kimi K2.5 | $0.45 | $2.20 | $1.33 |
| Llama 3.1 Nemotron 70B | $0.35 | $1.05 | $0.70 |
*Blended = average of input and output price
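
The blended figure can be reproduced from the input and output prices. The following is a minimal sketch, assuming the table rounds half-up to two decimal places; the function name `blended_price` is hypothetical and not part of the page.

```python
from decimal import Decimal, ROUND_HALF_UP

def blended_price(input_price: str, output_price: str) -> Decimal:
    """Average of input and output price, rounded half-up to cents.

    Assumption: the table rounds half-up to two decimals. Prices are
    passed as strings so Decimal keeps them exact.
    """
    mean = (Decimal(input_price) + Decimal(output_price)) / 2
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Input and output prices copied from the pricing table.
prices = {
    "GPT-4o": ("2.50", "10.00"),
    "Kimi K2.5": ("0.45", "2.20"),
    "Llama 3.1 Nemotron 70B": ("0.35", "1.05"),
}

for model, (inp, out) in prices.items():
    print(f"{model}: ${blended_price(inp, out)}")
```

`Decimal` is used instead of floats because the Kimi K2.5 average is exactly 1.325, which binary floating point stores slightly below that value, so `round()` would give 1.32 rather than the 1.33 shown in the table.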
| Spec | GPT-4o | Kimi K2.5 | Llama 3.1 Nemotron 70B |
|---|---|---|---|
| Context Window | 128K | 256K | 128K |
| Max Output | 16K | 16K | N/A |
| TTFT | 320ms | 500ms | N/A |
| Speed | 95 tok/s | 70 tok/s | N/A |
| Parameters | ~1.8T (estimated) | 1T (32B active) | 70B |
| Architecture | Transformer (MoE) | MoE + Multimodal | Dense Transformer |
| Open Source | No | No | Yes |
| Tier | frontier | frontier | frontier |
Quick Verdict
Best Performance: Kimi K2.5
Best Value: Llama 3.1 Nemotron 70B
Fastest: GPT-4o
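
The page does not say how these verdicts are computed. The sketch below shows one set of rules that happens to reproduce them from the data above. All three rules are assumptions: "Best Performance" is the model with the most benchmark wins, "Best Value" is the lowest blended price, and "Fastest" is the highest reported tok/s. It also assumes that a score of 0.0 means "not reported". All names in the code are hypothetical.

```python
MODELS = ("GPT-4o", "Kimi K2.5", "Llama 3.1 Nemotron 70B")

# Scores in MODELS order, copied from the benchmark listing.
SCORES = {
    "MMLU": (88.7, 92.0, 86.0),
    "HumanEval": (90.2, 99.0, 85.0),
    "GSM8K": (95.8, 99.0, 94.0),
    "GPQA": (53.6, 87.6, 50.0),
    "MGSM": (90.5, 96.0, 0.0),
    "ARC-Challenge": (96.7, 0.0, 95.0),
    "HellaSwag": (95.3, 0.0, 89.0),
    "MATH": (76.6, 98.0, 0.0),
    "SWE-bench": (38.4, 76.8, 0.0),
    "MMMLU": (85.1, 0.0, 0.0),
    "LiveCodeBench": (0.0, 85.0, 0.0),
    "IFEval": (0.0, 94.0, 92.0),
    "AIME 2025": (0.0, 96.1, 0.0),
}
BLENDED = {"GPT-4o": 6.25, "Kimi K2.5": 1.33, "Llama 3.1 Nemotron 70B": 0.70}
SPEED_TOK_S = {"GPT-4o": 95, "Kimi K2.5": 70, "Llama 3.1 Nemotron 70B": None}

def benchmark_wins() -> dict:
    """Count, per model, the benchmarks where it has the top reported score."""
    wins = {m: 0 for m in MODELS}
    for row in SCORES.values():
        # Assumption: 0.0 means the score was not reported, so skip it.
        reported = [(s, m) for s, m in zip(row, MODELS) if s > 0.0]
        top = max(s for s, _ in reported)
        for s, m in reported:
            if s == top:
                wins[m] += 1
    return wins

def quick_verdict() -> dict:
    wins = benchmark_wins()
    with_speed = {m: v for m, v in SPEED_TOK_S.items() if v is not None}
    return {
        "Best Performance": max(wins, key=wins.get),
        "Best Value": min(BLENDED, key=BLENDED.get),
        "Fastest": max(with_speed, key=with_speed.get),
    }

print(quick_verdict())
```

Other plausible rules, such as score per dollar for "Best Value" or TTFT for "Fastest", could give different answers, so this is an illustration and not the page's method.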