Compare Models

Select up to 4 models to compare benchmarks, pricing, and capabilities side by side.

GPT-4o (OpenAI)
Kimi K2.5 (Moonshot AI)
Llama 3.1 Nemotron 70B (NVIDIA)


Benchmarks

Benchmark	GPT-4o	Kimi K2.5	Llama 3.1 Nemotron 70B
MMLU	88.7	92.0	86.0
HumanEval	90.2	99.0	85.0
GSM8K	95.8	99.0	94.0
GPQA	53.6	87.6	50.0
MGSM	90.5	96.0	0.0
ARC-Challenge	96.7	0.0	95.0
HellaSwag	95.3	0.0	89.0
MATH	76.6	98.0	0.0
SWE-bench	38.4	76.8	0.0
MMMLU	85.1	0.0	0.0
LiveCodeBench	0.0	85.0	0.0
IFEval	0.0	94.0	92.0
AIME 2025	0.0	96.1	0.0
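One way to read the scores above is to check, benchmark by benchmark, which model has the highest listed score. The Python sketch below does this for a subset of the rows. It assumes that a score of 0.0 means no result was listed for that model rather than a true zero, which the page does not state. The names `SCORES`, `MODELS`, and `leaders` are illustrative only.

```python
# Scores copied from the comparison above (a subset of the benchmarks).
# Assumption: 0.0 is a placeholder for "no score listed", not a real result.
MODELS = ["GPT-4o", "Kimi K2.5", "Llama 3.1 Nemotron 70B"]

SCORES = {
    "MMLU": [88.7, 92.0, 86.0],
    "HumanEval": [90.2, 99.0, 85.0],
    "GPQA": [53.6, 87.6, 50.0],
    "ARC-Challenge": [96.7, 0.0, 95.0],
    "IFEval": [0.0, 94.0, 92.0],
}


def leaders(scores, models):
    """Return {benchmark: (best_model, n_models_with_scores)}, skipping 0.0 entries."""
    result = {}
    for bench, row in scores.items():
        # Keep only (score, model) pairs that have a listed score.
        listed = [(s, m) for s, m in zip(row, models) if s > 0.0]
        if listed:
            best_score, best_model = max(listed)
            result[bench] = (best_model, len(listed))
    return result


for bench, (model, n) in leaders(SCORES, MODELS).items():
    print(f"{bench}: {model} leads among {n} model(s) with a listed score")
```

Reporting how many models have a listed score alongside the leader helps, because a lead on a benchmark where only one or two models have results says less than a lead across all three.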

Pricing (per 1M tokens)

Model	Input	Output	Blended*
GPT-4o	$2.50	$10.00	$6.25
Kimi K2.5	$0.45	$2.20	$1.33
Llama 3.1 Nemotron 70B	$0.35	$1.05	$0.70

*Blended = unweighted average of the input and output prices
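The blended figure can be reproduced directly from the input and output prices. The sketch below recomputes it and, as a contrast, estimates the cost of a single request with an uneven input/output split. The prices are taken from the table and treated as USD per 1M tokens; the function names and the example workload are hypothetical.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "Kimi K2.5": (0.45, 2.20),
    "Llama 3.1 Nemotron 70B": (0.35, 1.05),
}


def blended(input_price, output_price):
    """Unweighted mean of the input and output price, per the footnote."""
    return (input_price + output_price) / 2


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request, using the per-1M-token prices."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000


for name, (inp, out) in PRICES.items():
    print(f"{name}: blended ${blended(inp, out):.2f} per 1M tokens")

# Hypothetical workload: 8,000 input tokens and 1,000 output tokens.
print(f"GPT-4o request: ${request_cost('GPT-4o', 8_000, 1_000):.4f}")
```

The blended price implicitly assumes equal input and output volume. For input-heavy workloads such as long-context summarization, the effective price per token sits closer to the input price, so a per-request estimate like `request_cost` is usually the more useful number.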

Technical Specs

Spec	GPT-4o	Kimi K2.5	Llama 3.1 Nemotron 70B
Context Window	128K	256K	128K
Max Output	16K	16K	N/A
TTFT	320ms	500ms	N/A
Speed	95 tok/s	70 tok/s	N/A
Parameters	~1.8T (estimated)	1T (32B active)	70B
Architecture	Transformer (MoE)	MoE + Multimodal	Dense Transformer
Open Source	No	No	Yes
Tier	frontier	frontier	frontier
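The TTFT and Speed rows can be combined into a rough end-to-end latency estimate for a response of a given length. The sketch below assumes that TTFT means time to first token, that Speed is a steady output rate, and that the two simply add; real latency varies with load and prompt length. `SPECS` and `estimated_latency` are illustrative names.

```python
# TTFT in seconds and output speed in tokens/second, from the spec table above.
# Llama 3.1 Nemotron 70B is omitted because both values are listed as N/A.
SPECS = {
    "GPT-4o": {"ttft_s": 0.320, "tok_per_s": 95.0},
    "Kimi K2.5": {"ttft_s": 0.500, "tok_per_s": 70.0},
}


def estimated_latency(model, output_tokens):
    """Rough response time in seconds: TTFT plus output_tokens / speed."""
    spec = SPECS[model]
    return spec["ttft_s"] + output_tokens / spec["tok_per_s"]


for name in SPECS:
    print(f"{name}: ~{estimated_latency(name, 500):.1f} s for 500 output tokens")
```

For short replies the TTFT term dominates, while for long generations the tokens-per-second term does, so which model feels faster can depend on typical response length.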

Quick Verdict

Best Performance	Kimi K2.5
Best Value	Llama 3.1 Nemotron 70B
Fastest	GPT-4o
