Gemini 2.5 Pro

Name: Gemini 2.5 Pro
Price: 1.25 USD per 1M input tokens
Author: Google

frontier

by Google · 1 year ago

Google's most capable thinking model with breakthrough performance on reasoning and coding.

Context Window

1.0M

Max Output

66K

TTFT

600ms

Speed

85 tok/s

text, image, audio, video, code

Input Price

$1.25/M tokens

Output Price

$10.00/M tokens

Why Choose Gemini 2.5 Pro

Frontier-tier performance at $1.25/M input tokens

Massive 1.0M token context window for entire codebases and long documents

Supports text + image + audio + video + code — true multimodal capability

Scores above 80% on 8 of the 10 listed benchmarks (GPQA at 72.0 and SWE-bench at 63.8 are the exceptions)

Best Use Cases

Video Understanding

Analyze and reason about video content with native multimodal capabilities.

Complex Reasoning

Solve multi-step problems with built-in thinking and chain-of-thought.

Code Generation

Top-tier coding with excellent understanding of complex systems.

Strengths & Limitations

Strengths

+Top-tier benchmark scores across categories
+Excellent math performance
+Excellent knowledge performance
+Excellent multilingual performance

Limitations

−Closed source — API access only

Benchmark Results

MMLU: 90.2

GPQA: 72.0

MGSM: 94.0

HumanEval: 93.0

HellaSwag: 95.5

MATH: 91.8

MMMLU: 89.5

SWE-bench: 63.8

GSM8K: 97.5

ARC-Challenge: 97.0
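The listing does not say how the "Avg Score" in the comparison table is derived. As a rough check, and assuming it is an unweighted mean of the ten scores above, the following sketch reproduces the 88.4 figure and shows which benchmarks fall below 80. The dictionary name and the 80-point threshold variable are illustrative only.

```python
from statistics import mean

# Scores copied from the Benchmark Results list above.
BENCHMARKS = {
    "MMLU": 90.2,
    "GPQA": 72.0,
    "MGSM": 94.0,
    "HumanEval": 93.0,
    "HellaSwag": 95.5,
    "MATH": 91.8,
    "MMMLU": 89.5,
    "SWE-bench": 63.8,
    "GSM8K": 97.5,
    "ARC-Challenge": 97.0,
}

THRESHOLD = 80.0  # illustrative cut-off, matching the "80%+" claim

# Assumption: the page's average is a plain unweighted mean.
average_score = round(mean(BENCHMARKS.values()), 1)

# Benchmarks that do not reach the threshold.
below_threshold = sorted(
    name for name, score in BENCHMARKS.items() if score < THRESHOLD
)

if __name__ == "__main__":
    print(f"Average: {average_score}")
    print(f"Below {THRESHOLD:.0f}: {', '.join(below_threshold)}")
```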

Quick Comparison

vs similar-tier models

Model	Provider	Input ($/M tokens)	Output ($/M tokens)	Context	Avg Score
Gemini 2.5 Pro (current)	Google	$1.25	$10.00	1.0M	88.4
GPT-4o	OpenAI	$2.50	$10.00	128K	81.1
Kimi K2.5	Moonshot AI	$0.45	$2.20	256K	92.3

Pricing Calculator

How pricing works: A token is roughly ¾ of a word, so a 1,000-word article is about 1,333 tokens. You pay separately for input (what you send) and output (what the model replies).
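The rule of thumb above can be sketched as a small helper. This is only an approximation under the stated ¾-word-per-token ratio; real token counts depend on the tokenizer and the text, and the function name is hypothetical.

```python
# Rule of thumb from the listing: one token is roughly 3/4 of a word.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a text of `word_count` words.

    This is a rough heuristic, not a tokenizer: actual counts vary
    with language, formatting, and the model's vocabulary.
    """
    return round(word_count / WORDS_PER_TOKEN)


if __name__ == "__main__":
    print(estimate_tokens(1000))  # the 1,000-word article from the text
```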

Describe a single image

$0.0033

Photo → detailed description

1,000 in · 200 out

Analyze a chart or diagram

$0.0075

Visual data → structured insights

2,000 in · 500 out

OCR a 10-page document

$0.049

Scanned pages → structured text

15,000 in · 3,000 out

Batch process 100 images

$0.325

Bulk image analysis pipeline

100,000 in · 20,000 out
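The four per-request figures above follow from the listed input and output prices. A minimal sketch of that arithmetic, using `Decimal` to avoid floating-point rounding surprises; the function and constant names are illustrative, and the prices are the ones shown on this page.

```python
from decimal import Decimal

# Prices from this listing, in USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("1.25")
OUTPUT_PRICE_PER_M = Decimal("10.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the exact USD cost of one request.

    Input and output tokens are billed separately at their own rates.
    """
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    scenarios = {
        "Describe a single image": (1_000, 200),
        "Analyze a chart or diagram": (2_000, 500),
        "OCR a 10-page document": (15_000, 3_000),
        "Batch process 100 images": (100_000, 20_000),
    }
    for name, (tokens_in, tokens_out) in scenarios.items():
        print(f"{name}: ${request_cost(tokens_in, tokens_out)}")
```

The page rounds these values for display, so the single-image cost of $0.00325 appears as $0.0033 and the OCR cost of $0.04875 as $0.049.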

At scale: 1,000 requests/day

Image descriptions

$98/mo

$3/day

Document OCR

$1,463/mo

$49/day

Batch image analysis

$9,750/mo

$325/day
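The at-scale figures can be reproduced by multiplying the per-request cost by 1,000 requests per day. The sketch below assumes a 30-day month and round-half-up to whole dollars; both are inferred from the listed numbers rather than stated on the page, and the helper names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Prices from this listing, in USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("1.25")
OUTPUT_PRICE_PER_M = Decimal("10.00")

DAYS_PER_MONTH = 30  # assumption: inferred from the listed monthly totals


def daily_and_monthly_cost(input_tokens, output_tokens, requests_per_day=1000):
    """Return (daily, monthly) USD cost for a fixed per-request token profile."""
    per_request = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    ) / Decimal(1_000_000)
    daily = per_request * requests_per_day
    monthly = daily * DAYS_PER_MONTH
    return daily, monthly


def whole_dollars(amount: Decimal) -> int:
    """Round to the nearest dollar, halves rounding up (assumed display rule)."""
    return int(amount.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    workloads = {
        "Image descriptions": (1_000, 200),
        "Document OCR": (15_000, 3_000),
        "Batch image analysis": (100_000, 20_000),
    }
    for name, (tokens_in, tokens_out) in workloads.items():
        daily, monthly = daily_and_monthly_cost(tokens_in, tokens_out)
        print(f"{name}: ${whole_dollars(daily)}/day, ${whole_dollars(monthly)}/mo")
```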

Technical Specifications

Provider: Google

Architecture: Transformer (MoE) + Thinking

Context Window: 1.0M tokens

Max Output: 66K tokens

Modalities: text, image, audio, video, code

Open Source: No

Release Date: March 25, 2025

More from Google

Gemini 2.0 Flash

Google

mid

Google's fastest multimodal model with native tool use and advanced agentic capabilities.

text, image, audio, video

Input

$0.10/M

Output

$0.40/M

Context

1.0M

Gemini 2.5 Flash

Google

mid

Google's fast and cost-efficient thinking model with strong reasoning capabilities.

text, image, audio, video

Input

$0.15/M

Output

$0.60/M

Context

1.0M

Veo 3

Google

frontier

Google DeepMind's flagship video generation model that natively produces joint audio-visual output in a single pass. Veo 3 leverages a Latent Diffusion Transformer to generate high-fidelity clips with synchronized dialogue, sound effects, and ambient audio without requiring a separate audio model. It demonstrates strong physical understanding and prompt adherence across diverse cinematic styles.

video, audio

Input

$5.00/M

Output

$150.00/M