Name: DeepSeek-R1-Distill-Llama-8B
Price: 0.07 USD
Author: DeepSeek

Why Choose DeepSeek-R1-Distill-Llama-8B

Budget-friendly at just $0.07/M input tokens

128K token context window — handles lengthy documents with ease

Fully open source — self-host, fine-tune, and customize without restrictions

8B parameter architecture for deep reasoning

Strengths & Limitations

Strengths

+Solid benchmark performance
+Large context window for complex tasks
+Very affordable pricing
+Open source — can self-host and fine-tune

Limitations

−Text only — no image or audio support

Benchmark Results

MMLU73.0

GPQA44.0

HumanEval74.0

HellaSwag78.0

MATH80.0

GSM8K87.0

Quick Comparison

vs similar-tier models

Model	Input	Output	Context	Avg Score
DeepSeek-R1-Distill-Llama-8BCurrent DeepSeek	$0.07	$0.14	128K	72.7
Claude Haiku 3.5 Anthropic	$0.80	$4.00	200K	77.0
Mistral Small Mistral AI	$0.10	$0.30	32K	69.8

Full Comparison

Pricing Calculator

How pricing works A token is roughly ¾ of a word. A 1,000-word article is about 1,333 tokens. You pay separately for input (what you send) and output (what the model replies).

Summarize an email

<$0.001

~300 word email → short summary

400 in · 100 out

Analyze a 1,000-word article

<$0.001

Blog post or news article → detailed analysis

1,333 in · 500 out

Chatbot conversation (10 turns)

<$0.001

Full customer support interaction

4,000 in · 2,000 out

Summarize a 50-page report

$0.0029

Legal contract or research paper → key points

37,500 in · 2,000 out

Review a 5,000-line codebase

$0.0022

Full code review with suggestions

25,000 in · 3,000 out

Process a full novel

$0.0091

~90,000 words → detailed summary & analysis

120,000 in · 5,000 out

At scale: 1,000 requests/day

Email summaries

$1/mo

$0.04/day

Chat conversations

$17/mo

$0.56/day

Document analysis

$87/mo

$3/day

Technical Specifications

ProviderDeepSeek

ArchitectureDense Transformer

Parameters8B

Context Window128K tokens

Modalitiestext

Open SourceYes

Release DateJanuary 20, 2025

Community Ratings

No ratings yet. Be the first to rate this model!

Rate This Model

Sign in to rate this model and share your experience.

Comments

0 comments

Sign in to leave a comment and join the discussion.

No comments yet. Be the first to share your thoughts!

More from DeepSeek

DeepSeek-V3.1

DeepSeek

frontier

Hybrid model combining V3 and R1 strengths. Improved reasoning with RL techniques from R1.

text

Input

$0.27/M

Output

$1.10/M

Context

128K

DeepSeek-V3

DeepSeek

mid

DeepSeek's open-source MoE model rivaling frontier models at a fraction of the cost.

textcode

Input

$0.27/M

Output

$1.10/M

Context

128K

DeepSeek-R1

DeepSeek

mid

DeepSeek's reasoning model with transparent chain-of-thought. Open-source and highly competitive.

text

Input

$0.55/M

Output

$2.19/M

Context

128K

Similar Budget Models

Claude Haiku 3.5

Anthropic

budget

Anthropic's fastest and most affordable model. Great for high-volume, low-latency tasks.

textimage

Input

$0.80/M

Output

$4.00/M

Context

200K

Mistral Small

Mistral AI

budget

Mistral's efficient model for everyday tasks. Fast and cost-effective.

text

Input

$0.10/M

Output

$0.30/M

Context

32K

GPT-4.1 Mini

OpenAI

budget

A fast, affordable variant of GPT-4.1 for high-volume workloads.

textimage

Input

$0.40/M

Output

$1.60/M

Context

1.0M

DeepSeek-R1-Distill-Llama-8B

Why Choose DeepSeek-R1-Distill-Llama-8B

Strengths & Limitations

Strengths

Limitations

Benchmark Results

Quick Comparison

Quick Comparison

Pricing Calculator

At scale: 1,000 requests/day

Technical Specifications

Community Ratings

Rate This Model

Comments

More from DeepSeek

DeepSeek-V3.1

DeepSeek-V3

DeepSeek-R1

Similar Budget Models

Claude Haiku 3.5

Mistral Small

GPT-4.1 Mini

Compare DeepSeek-R1-Distill-Llama-8B with other models