by NVIDIA · 1 year ago
NVIDIA's large open-source model trained for synthetic data generation.
| Spec | Value |
|---|---|
| Context Window | 4K |
| Max Output | 4K |
| TTFT | 500ms |
| Speed | 45 tok/s |
| Input Price | $1.20/M tokens |
| Output Price | $1.20/M tokens |
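The TTFT and speed figures can be combined into a rough response-time estimate. The sketch below assumes that decoding proceeds at a steady 45 tok/s once the first token arrives, which the listing does not guarantee; the function name `estimate_latency_seconds` is illustrative only.

```python
def estimate_latency_seconds(output_tokens, ttft_ms=500.0, tokens_per_second=45.0):
    """Rough wall-clock time for one response: TTFT plus steady-state decoding.

    Defaults are the listed TTFT (500ms) and speed (45 tok/s).
    Assumption: throughput stays constant for the whole response.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n:>5} output tokens: ~{estimate_latency_seconds(n):.1f} s")
```

Under these assumptions a 100-token reply takes about 2.7 s, and most of the time for longer outputs comes from decoding rather than TTFT.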
Performance Profile
Frontier-tier performance at $1.20/M input tokens
Fully open source: self-host, fine-tune, and customize without restrictions
340B-parameter architecture for deep reasoning
Consistently scores 80%+ across major benchmarks
vs similar-tier models
| Model | Provider | Input ($/M tokens) | Output ($/M tokens) | Context | Avg Score |
|---|---|---|---|---|---|
| Nemotron-4 340B (current) | NVIDIA | $1.20 | $1.20 | 4K | 83.4 |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K | 81.1 |
| Kimi K2.5 | Moonshot AI | $0.45 | $2.20 | 256K | 92.3 |
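Because input and output prices differ between models, the cheapest option depends on the input/output mix of a workload. The following sketch prices one request across the three models using the per-million-token rates listed above; the helper names are hypothetical, and the 1,333 in · 500 out workload is just one example mix.

```python
# (input $/M tokens, output $/M tokens), copied from the comparison above.
PRICES_PER_M = {
    "Nemotron-4 340B": (1.20, 1.20),
    "GPT-4o": (2.50, 10.00),
    "Kimi K2.5": (0.45, 2.20),
}


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars of one request at per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def compare(input_tokens, output_tokens):
    """Return {model name: cost in dollars} for the same workload."""
    return {
        name: request_cost(input_tokens, output_tokens, in_price, out_price)
        for name, (in_price, out_price) in PRICES_PER_M.items()
    }


if __name__ == "__main__":
    # Cheapest first.
    for name, cost in sorted(compare(1333, 500).items(), key=lambda kv: kv[1]):
        print(f"{name:<16} ${cost:.4f}")
```

For output-heavy workloads the gap widens, since GPT-4o and Kimi K2.5 charge more for output than for input while Nemotron-4 340B charges the same rate for both.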
| Task | Description | Tokens | Est. cost |
|---|---|---|---|
| Summarize an email | ~300-word email → short summary | 400 in · 100 out | <$0.001 |
| Analyze a 1,000-word article | Blog post or news article → detailed analysis | 1,333 in · 500 out | $0.0022 |
| Chatbot conversation (10 turns) | Full customer support interaction | 4,000 in · 2,000 out | $0.0072 |
| Summarize a 50-page report | Legal contract or research paper → key points | 37,500 in · 2,000 out | $0.047 |
| Review a 5,000-line codebase | Full code review with suggestions | 25,000 in · 3,000 out | $0.034 |
| Process a full novel | ~90,000 words → detailed summary & analysis | 120,000 in · 5,000 out | $0.150 |
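The per-task costs above follow from multiplying token counts by the listed $1.20/M rate. The sketch below reproduces that arithmetic and also flags which scenarios would fit in a single request. Two assumptions are made here that the listing does not state: that "4K" means 4,096 tokens, and that input and output share one context window. The check also treats each scenario as one request, which is conservative for the 10-turn chat, whose totals are spread across turns.

```python
INPUT_PRICE_PER_M = 1.20   # listed input price, $/M tokens
OUTPUT_PRICE_PER_M = 1.20  # listed output price, $/M tokens
CONTEXT_WINDOW = 4096      # assumption: "4K" means 4,096 tokens

# (input tokens, output tokens) for each scenario listed above.
SCENARIOS = {
    "Summarize an email": (400, 100),
    "Analyze a 1,000-word article": (1333, 500),
    "Chatbot conversation (10 turns)": (4000, 2000),
    "Summarize a 50-page report": (37500, 2000),
    "Review a 5,000-line codebase": (25000, 3000),
    "Process a full novel": (120000, 5000),
}


def scenario_cost(input_tokens, output_tokens):
    """Dollar cost of a scenario at the listed per-million-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def fits_in_one_request(input_tokens, output_tokens):
    """Assumption: input and output tokens share the same context window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    for name, (tokens_in, tokens_out) in SCENARIOS.items():
        cost = scenario_cost(tokens_in, tokens_out)
        note = "single request" if fits_in_one_request(tokens_in, tokens_out) else "needs chunking"
        print(f"{name:<34} ${cost:.4f}  {note}")
```

Under these assumptions only the email and article scenarios fit in one request; the larger ones would have to be split into chunks, which can change the total token count and therefore the cost.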
| Workload | Monthly | Daily |
|---|---|---|
| Email summaries | $18/mo | $0.60/day |
| Chat conversations | $216/mo | $7/day |
| Document analysis | $1,422/mo | $47/day |
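The listing does not say what volume these monthly figures assume. They are consistent with roughly 1,000 requests per day over a 30-day month, using the email, chatbot, and 50-page-report token counts from the per-task examples. Both the volume and the workload-to-task mapping are inferences rather than stated facts, and the sketch below simply encodes them so they can be changed.

```python
PRICE_PER_M = 1.20         # listed price, $/M tokens (same for input and output)
REQUESTS_PER_DAY = 1_000   # assumption: inferred from the figures, not stated
DAYS_PER_MONTH = 30        # assumption

# Assumed mapping of workloads to (input tokens, output tokens) per request.
WORKLOADS = {
    "Email summaries": (400, 100),
    "Chat conversations": (4000, 2000),
    "Document analysis": (37500, 2000),
}


def daily_cost(input_tokens, output_tokens, requests_per_day=REQUESTS_PER_DAY):
    """Dollar cost per day for a fixed number of identical requests."""
    per_request = (input_tokens + output_tokens) * PRICE_PER_M / 1_000_000
    return per_request * requests_per_day


def monthly_cost(input_tokens, output_tokens,
                 requests_per_day=REQUESTS_PER_DAY, days=DAYS_PER_MONTH):
    """Dollar cost per month, as daily cost times the number of days."""
    return daily_cost(input_tokens, output_tokens, requests_per_day) * days


if __name__ == "__main__":
    for name, (tokens_in, tokens_out) in WORKLOADS.items():
        print(f"{name:<20} ${daily_cost(tokens_in, tokens_out):.2f}/day  "
              f"${monthly_cost(tokens_in, tokens_out):.0f}/mo")
```

Changing `REQUESTS_PER_DAY` scales every figure linearly, so a team handling 100 requests per day would pay about a tenth of the listed amounts.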
| Provider | Description | Input | Output | Context |
|---|---|---|---|---|
| NVIDIA | NVIDIA's optimized Llama 3.1 variant with custom reward model training. | $0.18/M | $0.18/M | 128K |
| NVIDIA | | Free | Free | |
| NVIDIA | Hybrid Mamba-Transformer MoE with 4x higher throughput than predecessor. Open weights and training data. | $0.04/M | $0.08/M | 1.0M |
| OpenAI | OpenAI's most advanced multimodal model. Excels at text, vision, and audio tasks with fast response times. | $2.50/M | $10.00/M | 128K |
| Moonshot AI | Moonshot AI's frontier multimodal MoE model with 1T total parameters (32B active). Tops SWE-bench and AIME 2025 benchmarks. | $0.45/M | $2.20/M | 256K |
| Google | Google's most capable thinking model with breakthrough performance on reasoning and coding. | $1.25/M | $10.00/M | 1.0M |