Explore 49 AI models from 15 providers. Filter by capability, tier, and pricing to find the right model.
49 results
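All prices below are quoted per million tokens, so a request's cost is tokens ÷ 1,000,000 × the per-direction rate. A minimal sketch of that arithmetic (the example rates are taken from the first listing below):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars for one request, given per-million-token rates."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Example: 50K input tokens and 2K output tokens at $1.25/M in, $10.00/M out
cost = request_cost(50_000, 2_000, 1.25, 10.00)
print(f"${cost:.4f}")  # → $0.0825
```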
Google's most capable thinking model with breakthrough performance on reasoning and coding.
Input $1.25/M · Output $10.00/M · Context 1.0M
Most powerful Gemini model with native multimodal understanding. Supports adjustable reasoning depth via thinking_level parameter.
Input $3.50/M · Output $10.50/M · Context 1.0M
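The adjustable reasoning depth mentioned above is set in the request body. A hypothetical sketch of such a payload; the exact field names (`generationConfig`, `thinkingConfig`, `thinkingLevel`) and allowed values are assumptions, not confirmed API details:

```python
import json

# Hypothetical request body. "thinkingLevel" and its value ("high") are
# assumptions based on the thinking_level parameter described above.
payload = {
    "model": "gemini-pro",  # placeholder model id
    "contents": [
        {"role": "user", "parts": [{"text": "Prove that sqrt(2) is irrational."}]}
    ],
    "generationConfig": {
        "thinkingConfig": {"thinkingLevel": "high"},  # adjustable reasoning depth
    },
}
print(json.dumps(payload, indent=2))
```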
Alibaba/Qwen
Specialized code model trained on 7.5T tokens (70% code). Supports 100+ programming languages and agentic workflows.
Input $0.30/M · Output $0.60/M · Context 262K
DeepSeek
DeepSeek's open-source MoE model rivaling frontier models at a fraction of the cost.
Input $0.27/M · Output $1.10/M · Context 128K
Alibaba/Qwen
Alibaba's open-weight hybrid MoE model with 512 experts and 17B active parameters. Natively multimodal with support for 201 languages. Top scores on GPQA and SWE-bench.
Input $0.15/M · Output $1.00/M · Context 256K
OpenAI
OpenAI's agentic coding model with context compaction and long-horizon task completion. First model in the GPT-5 Codex series.
Input $1.75/M · Output $14.00/M · Context 400K
OpenAI
OpenAI's most capable coding model combining Codex and GPT-5 training stacks. Agentic coding, research, and tool use with 77.3% on Terminal-Bench 2.0.
Input $2.00/M · Output $16.00/M · Context 400K
Anthropic
Anthropic's strongest reasoning and coding model. 80.8% on SWE-bench Verified, 1M context (beta), agent teams, and extended thinking.
Input $5.00/M · Output $25.00/M · Context 1.0M
Anthropic
Matches Opus 4.6 on most benchmarks at 1/5 the cost. 79.6% on SWE-bench, 1M context, computer use, and design capabilities.
Input $3.00/M · Output $15.00/M · Context 1.0M
Google's frontier-class model at Flash-level latency and cost. 90.4% on GPQA Diamond, 78% on SWE-bench, 1M context window.
Input $0.50/M · Output $3.00/M · Context 1.0M
Google's most capable model. 94.3% on GPQA Diamond, 80.6% on SWE-bench, 77.1% on ARC-AGI-2. #1 on 12 of 18 tracked benchmarks.
Input $2.00/M · Output $12.00/M · Context 1.0M
DeepSeek
DeepSeek's 1T parameter coding-focused model with 1M+ context. Three architectural innovations: Manifold-Constrained Hyper-Connections, Engram memory, Sparse Attention.
Input $0.10/M · Output $0.40/M · Context 1.0M
MiniMax
Achieves 80.2% on SWE-bench Verified, matching Opus 4.6 at 1/20th the cost. Ranks first on Multi-SWE-Bench at 51.3%.
Input $0.25/M · Output $0.75/M · Context 128K
Moonshot AI
Moonshot AI's frontier multimodal MoE model with 1T total parameters (32B active). Tops SWE-bench and AIME 2025 benchmarks.
Input $0.45/M · Output $2.20/M · Context 256K
Alibaba/Qwen
Compact open-source model for edge deployment and fine-tuning.
Input $0.03/M · Output $0.03/M · Context 32K
Alibaba/Qwen
Alibaba's open-source coding specialist. Matches GPT-4o on code benchmarks.
Input $0.08/M · Output $0.08/M · Context 128K
Alibaba/Qwen
Compact open-source coding model with impressive code generation capabilities.
Input $0.03/M · Output $0.03/M · Context 128K
Alibaba/Qwen
Alibaba's large-scale open-source MoE model with thinking mode support.
Input $0.20/M · Output $0.60/M · Context 128K
Shanghai AI Lab
Open-source model with 1M context from Shanghai AI Lab. Strong coding and math skills.
Input $0.06/M · Output $0.06/M · Context 1.0M
BigCode
Open-source code model from BigCode/HuggingFace trained on The Stack v2.
Input $0.04/M · Output $0.04/M · Context 16K
Snowflake
Snowflake's open-source enterprise MoE model optimized for SQL and business tasks.
Input $0.30/M · Output $0.30/M · Context 4K
Databricks
Databricks' open-source MoE model with strong code and reasoning capabilities.
Input $0.75/M · Output $0.75/M · Context 32K
Zhipu AI
Zhipu AI's latest open-weight MoE model with interleaved thinking and state-of-the-art coding performance.
Input $0.50/M · Output $1.50/M · Context 200K
OpenAI
OpenAI's cost-efficient reasoning model with multimodal input and strong math and coding performance at a fraction of o3's pricing.
Input $1.10/M · Output $4.40/M · Context 200K
OpenAI
OpenAI's highest-quality reasoning model with extended compute for complex scientific and mathematical problems.
Input $20.00/M · Output $80.00/M · Context 200K
Mistral AI
Mistral's specialized code model supporting 80+ languages with 256K context and fill-in-the-middle capability.
Input $0.30/M · Output $0.90/M · Context 256K
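Fill-in-the-middle means the model receives the code before and after a gap and completes the span between them. A hedged sketch of what such a request body might look like; the field names (`prompt` for the prefix, `suffix` for the trailing code) are assumptions, not confirmed endpoint details:

```python
import json

# Hypothetical fill-in-the-middle request: the model generates the code
# between `prompt` (prefix) and `suffix`. Field names are assumptions.
fim_request = {
    "model": "codestral-latest",          # placeholder model id
    "prompt": "def fibonacci(n):\n    ",  # code before the gap
    "suffix": "\n    return a",           # code after the gap
    "max_tokens": 64,
}
print(json.dumps(fim_request, indent=2))
```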
DeepSeek
Unified reasoning and non-reasoning model that merges DeepSeek-V3 and R1 capabilities into a single architecture.
Input $0.28/M · Output $0.42/M · Context 128K
Alibaba/Qwen
Compact 8B model from the Qwen3 family with thinking mode support and strong efficiency for on-device use.
Input $0.03/M · Output $0.06/M · Context 131K
Alibaba/Qwen
Ultra-efficient MoE model with 128 experts and only 3.3B active parameters, ideal for cost-sensitive deployments.
Input $0.02/M · Output $0.04/M · Context 131K
BigCode
Compact code model trained on 4T+ tokens and 600+ languages from The Stack v2.
Input $0.03/M · Output $0.06/M · Context 16K
BigCode
Mid-size code model matching CodeLlama 13B quality at half the parameters.
Input $0.07/M · Output $0.14/M · Context 16K
Mistral AI
Coding-specialized model outperforming Qwen 3 Coder Flash despite smaller size.
Input $0.20/M · Output $0.60/M · Context 128K
Mistral AI
Code model using the Mamba SSM architecture for linear-time inference and theoretically unbounded context.
Input $0.10/M · Output $0.30/M · Context 256K
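The linear-time claim comes from the state-space recurrence: each token updates a fixed-size state from the previous one, so processing n tokens costs O(n) rather than attention's O(n²). A toy scalar sketch of that recurrence (the coefficients are illustrative, not the model's actual parameters):

```python
# Toy state-space recurrence: h[t] = a*h[t-1] + b*x[t], y[t] = c*h[t].
# One fixed-size state update per token => linear time in sequence length.
def ssm_scan(xs, a=0.9, b=0.1, c=1.0):
    h, ys = 0.0, []
    for x in xs:          # single pass: O(len(xs))
        h = a * h + b * x
        ys.append(c * h)
    return ys

out = ssm_scan([1.0, 0.0, 0.0, 0.0])
# state decays geometrically after the impulse: ≈ 0.1, 0.09, 0.081, 0.0729
```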
IBM
Updated Granite with enhanced coding and tool-use capabilities for enterprise automation.
Input $0.10/M · Output $0.20/M · Context 128K
IBM
Small enterprise model with coding support for lightweight automation workflows.
Input $0.03/M · Output $0.06/M · Context 128K
Alibaba/Qwen
Alibaba's dense 32B model with dual thinking/non-thinking modes and strong reasoning performance.
Input $0.08/M · Output $0.20/M · Context 131K
Moonshot AI
Moonshot AI's reasoning-focused MoE model with chain-of-thought capabilities. 1T total params, 32B active.
Input $0.47/M · Output $2.00/M · Context 131K
Alibaba/Qwen
Hosted version of Qwen3.5 397B with 1M context window and adaptive thinking for complex tasks.
Input $0.40/M · Output $2.40/M · Context 1.0M
Alibaba/Qwen
Alibaba's large-scale reasoning model with ~1T parameters and chain-of-thought capabilities.
Input $1.20/M · Output $6.00/M · Context 256K
Alibaba/Qwen
Alibaba's efficient code-focused MoE model. 80B total params, 3B active, Apache 2.0 licensed.
Input $0.12/M · Output $0.75/M · Context 256K
Google's open-source code-focused model based on the Gemma architecture.
Input $0.03/M · Output $0.03/M · Context 8K
Meta
Meta's largest code-focused open-source model. Specialized for code generation and understanding.
Input $0.18/M · Output $0.18/M · Context 16K
Mistral AI
Mistral's first code-focused model with 32K context. Supports 80+ programming languages.
Input $0.30/M · Output $0.90/M · Context 32K
Mistral AI
Mistral's large open-source MoE model with 176B total params. Strong coding and reasoning.
Input $0.65/M · Output $0.65/M · Context 66K
DeepSeek
DeepSeek's open-source code-focused MoE model. Competitive with GPT-4 Turbo on coding.
Input $0.14/M · Output $0.28/M · Context 128K
DeepSeek
Merges the general and coder capabilities of V2 into a unified model.
Input $0.14/M · Output $0.28/M · Context 128K
Alibaba/Qwen
Alibaba's flagship open-source model. Competitive with Llama 3.1 405B at a fraction of the size.
Input $0.30/M · Output $0.30/M · Context 128K
Alibaba/Qwen
Strong mid-range open-source model from Alibaba with broad capabilities.
Input $0.08/M · Output $0.08/M · Context 128K
Alibaba/Qwen
Efficient open-source model balancing capability and speed.
Input $0.05/M · Output $0.05/M · Context 128K