DeepSeek

DeepSeek's 1T-parameter, coding-focused model with 1M+ context. Three architectural innovations: Manifold-Constrained Hyper-Connections, Engram memory, and Sparse Attention.

Input

$0.10/M

Output

$0.40/M

Context

1.0M

DeepSeek-V3.2

DeepSeek

Unified reasoning and non-reasoning model that merges DeepSeek-V3 and R1 capabilities into a single architecture.

Input

$0.28/M

Output

$0.42/M

Context

128K
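
The listed prices can be turned into a per-request estimate with simple arithmetic. The sketch below assumes that "/M" means "per one million tokens" and that input and output tokens are billed separately at the listed rates. The function name `request_cost_usd` and the token counts are hypothetical, chosen only for illustration; actual billing may include caching discounts or other adjustments not shown on this page.

```python
def request_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate the cost of one request in USD from per-million-token prices."""
    tokens_per_unit = 1_000_000  # assumption: "/M" means per one million tokens
    input_cost = input_tokens / tokens_per_unit * input_price_per_m
    output_cost = output_tokens / tokens_per_unit * output_price_per_m
    return input_cost + output_cost


# DeepSeek-V3.2 as listed: $0.28/M input, $0.42/M output.
# The token counts are illustrative, not measured.
cost = request_cost_usd(10_000, 2_000, 0.28, 0.42)
print(f"${cost:.5f}")
```

The same function applies to any other entry on this page by substituting that entry's input and output prices.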

DeepSeek-Math V2

DeepSeek

Math-specialized model achieving gold-level scores in math competitions. Based on the V3.2 architecture.

Input

$0.27/M

Output

$1.10/M

Context

128K

DeepSeek-V3.1

DeepSeek

Hybrid model combining V3 and R1 strengths. Improved reasoning with RL techniques from R1.

Input

$0.27/M

Output

$1.10/M

Context

128K

DeepSeek-R1-Distill-Llama-70B

DeepSeek

R1's reasoning capability distilled into a Llama 3.3 70B architecture for efficient deployment.

Input

$0.18/M

Output

$0.18/M

Context

128K

DeepSeek-R1

DeepSeek

DeepSeek's reasoning model with transparent chain-of-thought. Open-source and highly competitive.

Input

$0.55/M

Output

$2.19/M

Context

128K

DeepSeek-R1-Distill-Qwen-32B

DeepSeek

R1 reasoning capabilities distilled into a compact Qwen-based 32B model.

Input

$0.12/M

Output

$0.18/M

Context

128K

DeepSeek-R1-Distill-Qwen-7B

DeepSeek

budget

R1 reasoning distilled into a compact Qwen-based 7B model. Strong at math and programming for its size.

Input

$0.07/M

Output

$0.14/M

Context

128K

DeepSeek-R1-Distill-Llama-8B

DeepSeek

budget

R1 reasoning distilled into a Llama 3.1 8B architecture. Strong reasoning at minimal compute cost.

Input

$0.07/M

Output

$0.14/M

Context

128K

DeepSeek-V3

DeepSeek

DeepSeek's open-source MoE model rivaling frontier models at a fraction of the cost.

Input

$0.27/M

Output

$1.10/M

Context

128K

DeepSeek-VL2

DeepSeek

Vision-language model for image understanding, OCR, and visual reasoning tasks.

text, image

Input

$0.14/M

Output

$0.28/M

Context

128K

DeepSeek-V2.5

DeepSeek

Merged general and coder capabilities from V2 into a unified model.

Input

$0.14/M

Output

$0.28/M

Context

128K

DeepSeek-Coder-V2

DeepSeek