Explore 249 AI models from 49 providers. Filter by capability, tier, and pricing to find the right model.
Showing all 249 models
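A catalog like this is easy to query programmatically once the rows are scraped into records. A minimal sketch, assuming a hypothetical list-of-dicts representation (the field names are illustrative, not an actual API; the sample rates mirror entries on this page):

```python
# Hypothetical in-memory representation of catalog rows; field names
# ("provider", "input_price", "context") are illustrative only.
models = [
    {"provider": "Meta", "input_price": 0.18, "context": 128_000},
    {"provider": "OpenAI", "input_price": 5.00, "context": 197_000},
    {"provider": "DeepSeek", "input_price": 0.27, "context": 128_000},
]

def filter_models(models, max_input_price=None, min_context=None):
    """Keep models at or under a price ceiling and at or over a context floor."""
    kept = []
    for m in models:
        if max_input_price is not None and m["input_price"] > max_input_price:
            continue
        if min_context is not None and m["context"] < min_context:
            continue
        kept.append(m)
    return kept

cheap = filter_models(models, max_input_price=1.00)  # the Meta and DeepSeek rows
```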
Moonshot AI
Moonshot AI's frontier multimodal MoE model with 1T total parameters (32B active). Tops SWE-bench and AIME 2025 benchmarks.
Input
$0.45/M
Output
$2.20/M
Context
256K
Meta
Meta's open-source model matching GPT-4 class performance at 70B parameters.
Input
$0.18/M
Output
$0.18/M
Context
128K
OpenAI
OpenAI's second-generation video synthesis model capable of producing cinematic-quality videos up to 60 seconds long with synchronized audio. Built on an advanced Diffusion Transformer (DiT) architecture, Sora 2 excels at complex scene composition, realistic physics simulation, and coherent multi-character narratives with natural dialogue and ambient sound.
Input
$5.00/M
Output
$100.00/M
OpenAI
OpenAI's flagship model replacing GPT-4o and o3. Achieves 94.6% on AIME 2025 and 74.9% on SWE-bench. Multimodal with thinking capabilities.
Input
$5.00/M
Output
$15.00/M
Context
197K
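All rates on this page are quoted in dollars per million tokens ("$/M"), with input and output metered separately. A quick sketch of the arithmetic, using the GPT-5 rates listed above as the example:

```python
def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Dollar cost of one request, given per-million-token rates."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000

# 10,000 input tokens at $5.00/M plus 2,000 output tokens at $15.00/M
cost = request_cost(10_000, 2_000, 5.00, 15.00)  # $0.05 + $0.03 = $0.08
```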
OpenAI
Expanded GPT-5 with 400K context and 128K max output. Scores 100% on the AIME 2025 math benchmark.
Input
$8.00/M
Output
$24.00/M
Context
400K
Google DeepMind
Google DeepMind's flagship video generation model that natively produces joint audio-visual output in a single pass. Veo 3 leverages a Latent Diffusion Transformer to generate high-fidelity clips with synchronized dialogue, sound effects, and ambient audio, without requiring a separate audio model. It demonstrates strong physical understanding and prompt adherence across diverse cinematic styles.
Input
$5.00/M
Output
$150.00/M
OpenAI
OpenAI's agentic coding model with context compaction and long-horizon task completion. First model in the GPT-5 Codex series.
Input
$1.75/M
Output
$14.00/M
Context
400K
OpenAI
OpenAI's most capable coding model combining Codex and GPT-5 training stacks. Agentic coding, research, and tool use with 77.3% on Terminal-Bench 2.0.
Input
$2.00/M
Output
$16.00/M
Context
400K
Anthropic
Anthropic's strongest reasoning and coding model. 80.8% on SWE-bench Verified, 1M context (beta), agent teams, and extended thinking.
Input
$5.00/M
Output
$25.00/M
Context
1.0M
Anthropic
Matches Opus 4.6 on most benchmarks at 1/5 the cost. 79.6% on SWE-bench, 1M context, computer use, and design capabilities.
Input
$3.00/M
Output
$15.00/M
Context
1.0M
Google
Google's frontier-class model at Flash-level latency and cost. 90.4% on GPQA Diamond, 78% on SWE-bench, 1M context window.
Input
$0.50/M
Output
$3.00/M
Context
1.0M
Google
Google's most capable model. 94.3% on GPQA Diamond, 80.6% on SWE-bench, 77.1% on ARC-AGI-2. #1 on 12 of 18 tracked benchmarks.
Input
$2.00/M
Output
$12.00/M
Context
1.0M
DeepSeek
DeepSeek's 1T parameter coding-focused model with 1M+ context. Three architectural innovations: Manifold-Constrained Hyper-Connections, Engram memory, Sparse Attention.
Input
$0.10/M
Output
$0.40/M
Context
1.0M
Alibaba/Qwen
Alibaba's open-weight hybrid MoE model with 512 experts and 17B active parameters. Natively multimodal, with support for 201 languages. Top scores on GPQA and SWE-bench.
Input
$0.15/M
Output
$1.00/M
Context
256K
Meta
Meta's latest open-source MoE model with 17B active parameters and industry-leading 10M token context.
Input
$0.15/M
Output
$0.60/M
Context
10.5M
Anthropic
Flagship Opus release with major improvements in coding and workplace productivity tasks. Predecessor to Opus 4.6.
Input
$15.00/M
Output
$75.00/M
Context
1.0M
Runway
Runway's latest flagship model built on a novel Autoregressive-to-Diffusion (A2D) hybrid architecture that first plans scene structure autoregressively and then renders frames through diffusion. Gen-4.5 achieves state-of-the-art temporal coherence, photorealistic detail, and nuanced control over motion dynamics, lighting, and artistic style.
Input
$5.00/M
Output
$130.00/M
xAI
xAI's frontier model trained on Colossus supercluster. Real-time data access and strong reasoning.
Input
$3.00/M
Output
$15.00/M
Context
131K
Amazon
Amazon's most intelligent model for complex multi-step reasoning and agentic workflows.
Input
$4.00/M
Output
$12.00/M
Context
1.0M
Google
Most powerful Gemini model with native multimodal understanding. Supports adjustable reasoning depth via the thinking_level parameter.
Input
$3.50/M
Output
$10.50/M
Context
1.0M
Kuaishou
The latest entry in Kuaishou's Kling series, introducing multi-shot sequence generation and an AI Director mode that automatically plans camera angles, transitions, and pacing. Kling 3.0 produces broadcast-quality video with native audio synthesis, making it suitable for short-form content creation, advertising, and virtual production workflows.
Input
$3.00/M
Output
$60.00/M
Meta
Meta's powerful open-source MoE model with 400B total params and 1M context window.
Input
$0.50/M
Output
$2.00/M
Context
1.0M
Anthropic
Anthropic's most powerful model. Top-tier performance on coding, analysis, and complex reasoning tasks.
Input
$15.00/M
Output
$75.00/M
Context
200K
Anthropic
Anthropic's best balance of intelligence and speed. Excellent for production workloads.
Input
$3.00/M
Output
$15.00/M
Context
200K
ByteDance
ByteDance's unified multimodal generation model that handles video, audio, and image synthesis within a single architecture. Seedance 2.0 produces highly coherent audiovisual content with strong temporal consistency, supporting diverse creative workflows from music video generation to product advertisement creation with synchronized narration and effects.
Input
$3.00/M
Output
$70.00/M
OpenAI
OpenAI's reasoning model with chain-of-thought capabilities for complex problem solving.
Input
$15.00/M
Output
$60.00/M
Context
200K
Google
Google's fastest multimodal model with native tool use and advanced agentic capabilities.
Input
$0.10/M
Output
$0.40/M
Context
1.0M
OpenAI
OpenAI's most powerful reasoning model with breakthrough performance on math and coding benchmarks.
Input
$10.00/M
Output
$40.00/M
Context
200K
Alibaba/Qwen
Specialized code model trained on 7.5T tokens (70% code). Supports 100+ programming languages and agentic workflows.
Input
$0.30/M
Output
$0.60/M
Context
262K
Alibaba/Qwen
Most capable open VLM rivaling GPT-5 across multimodal benchmarks. Strong reasoning and agentic capabilities.
Input
$0.30/M
Output
$0.60/M
Context
128K
MiniMax
Achieves 80.2% on SWE-bench Verified, matching Opus 4.6 at 1/20th the cost. Ranks first on Multi-SWE-Bench at 51.3%.
Input
$0.25/M
Output
$0.75/M
Context
128K
DeepSeek
DeepSeek's open-source MoE model rivaling frontier models at a fraction of the cost.
Input
$0.27/M
Output
$1.10/M
Context
128K
DeepSeek
DeepSeek's reasoning model with transparent chain-of-thought. Open-source and highly competitive.
Input
$0.55/M
Output
$2.19/M
Context
128K
OpenAI
OpenAI's native image generation capability integrated directly into GPT-4o, enabling conversational image creation and iterative editing through natural language. GPT Image 1 excels at accurate text rendering within images, complex multi-element compositions, and faithful adherence to detailed prompts across photorealistic, illustrative, and artistic styles.
Input
$10.00/M
Output
$40.00/M
DeepSeek
Hybrid model combining V3 and R1 strengths. Improved reasoning with RL techniques from R1.
Input
$0.27/M
Output
$1.10/M
Context
128K
Google
Google's fast and cost-efficient thinking model with strong reasoning capabilities.
Input
$0.15/M
Output
$0.60/M
Context
1.0M
Google
Google's most capable thinking model with breakthrough performance on reasoning and coding.
Input
$1.25/M
Output
$10.00/M
Context
1.0M
Google DeepMind
Google DeepMind's fourth-generation image synthesis model capable of producing images up to 2K resolution with exceptional photorealism and compositional accuracy. Imagen 4 includes SynthID watermarking by default for responsible AI deployment, supports advanced inpainting and outpainting, and demonstrates industry-leading performance on text rendering and spatial reasoning tasks.
Input
$4.00/M
Output
$20.00/M
OpenAI
OpenAI's latest GPT-4 series model with improved coding, instruction following, and long context.
Input
$2.00/M
Output
$8.00/M
Context
1.0M
Black Forest Labs
Black Forest Labs' flagship commercial image generation model with 32 billion parameters, delivering up to 4-megapixel resolution output with exceptional detail and prompt fidelity. FLUX.2 Pro achieves state-of-the-art results in photorealism, typography rendering, and complex scene composition, making it a top choice for professional creative applications.
Input
$3.00/M
Output
$30.00/M
OpenAI
OpenAI's efficient reasoning model, optimized for speed while maintaining strong analytical capabilities.
Input
$1.10/M
Output
$4.40/M
Context
200K
Midjourney
Midjourney's seventh major model release featuring 12 billion parameters and expanded multimodal capabilities including short video clip generation alongside its renowned image synthesis. V7 delivers dramatically improved coherence, photorealism, and artistic range, with enhanced understanding of spatial relationships, lighting, and material properties across diverse visual styles.
Input
$5.00/M
Output
$50.00/M
OpenAI
OpenAI's most advanced multimodal model. Excels at text, vision, and audio tasks with fast response times.
Input
$2.50/M
Output
$10.00/M
Context
128K
Alibaba/Qwen
Alibaba's flagship open-source model. Competitive with Llama 3.1 405B at a fraction of the size.
Input
$0.30/M
Output
$0.30/M
Context
128K
Alibaba/Qwen
Strong mid-range open-source model from Alibaba with broad capabilities.
Input
$0.08/M
Output
$0.08/M
Context
128K
Alibaba/Qwen
Efficient open-source model balancing capability and speed.
Input
$0.05/M
Output
$0.05/M
Context
128K
Alibaba/Qwen
Compact open-source model for edge deployment and fine-tuning.
Input
$0.03/M
Output
$0.03/M
Context
32K
Alibaba/Qwen
Alibaba's open-source coding specialist. Matches GPT-4o on code benchmarks.
Input
$0.08/M
Output
$0.08/M
Context
128K
Alibaba/Qwen
Compact open-source coding model with impressive code generation capabilities.
Input
$0.03/M
Output
$0.03/M
Context
128K
Alibaba/Qwen
Alibaba's open-source vision-language model with video understanding capabilities.
Input
$0.40/M
Output
$0.40/M
Context
32K
Alibaba/Qwen
Alibaba's open-source reasoning model with transparent chain-of-thought. Competitive with o1-mini.
Input
$0.10/M
Output
$0.30/M
Context
32K
Alibaba/Qwen
Alibaba's large-scale open-source MoE model with thinking mode support.
Input
$0.20/M
Output
$0.60/M
Context
128K
Microsoft
Microsoft's 14B open-source model with training innovations that punch above its weight class.
Input
$0.04/M
Output
$0.04/M
Context
16K
Microsoft
Microsoft's compact open-source model with 128K context. Great for on-device inference.
Input
$0.01/M
Output
$0.01/M
Context
128K
Microsoft
Microsoft's open-source MoE model with 42B total params and only 6.6B active.
Input
$0.06/M
Output
$0.06/M
Context
128K
Microsoft
Microsoft's 14B open-source model with 128K context and strong reasoning capabilities.
Input
$0.04/M
Output
$0.04/M
Context
128K
Microsoft
Microsoft's instruction-tuned MoE model based on Mixtral. Strong on complex reasoning tasks.
Input
$0.65/M
Output
$0.65/M
Context
66K
NVIDIA
NVIDIA's optimized Llama 3.1 variant with custom reward model training.
Input
$0.18/M
Output
$0.18/M
Context
128K
NVIDIA
NVIDIA's large open-source model trained for synthetic data generation.
Input
$1.20/M
Output
$1.20/M
Context
4K
AI21 Labs
AI21's hybrid SSM-Transformer model with 256K context. Novel Mamba architecture.
Input
$2.00/M
Output
$8.00/M
Context
256K
AI21 Labs
Compact version of Jamba with hybrid SSM-Transformer architecture.
Input
$0.20/M
Output
$0.40/M
Context
256K
TII
TII's largest open-source model. One of the first truly open 180B parameter models.
Input
$0.80/M
Output
$0.80/M
Context
2K
TII
TII's efficient open-source model with multimodal capabilities.
Input
$0.04/M
Output
$0.04/M
Context
8K
01.AI
01.AI's fast inference model with strong performance across benchmarks.
Input
$0.30/M
Output
$0.30/M
Context
16K
01.AI
01.AI's open-source 34B model with strong bilingual (English/Chinese) capabilities.
Input
$0.10/M
Output
$0.10/M
Context
4K
01.AI
01.AI's frontier closed-source model with top-tier multilingual performance.
Input
$3.00/M
Output
$9.00/M
Context
32K
Shanghai AI Lab
Open-source model with 1M context from Shanghai AI Lab. Strong coding and math skills.
Input
$0.06/M
Output
$0.06/M
Context
1.0M
Shanghai AI Lab
Open-source vision-language model with strong image understanding capabilities.
Input
$0.08/M
Output
$0.08/M
Context
8K
Stability AI
Stability AI's open-source language model with multilingual support.
Input
$0.04/M
Output
$0.04/M
Context
4K
Allen AI
Fully open-source model from Allen AI with open training data, code, and weights.
Input
$0.04/M
Output
$0.04/M
Context
4K
BigCode
Open-source code model from BigCode/HuggingFace trained on The Stack v2.
Input
$0.04/M
Output
$0.04/M
Context
16K
Snowflake
Snowflake's open-source enterprise MoE model optimized for SQL and business tasks.
Input
$0.30/M
Output
$0.30/M
Context
4K
Databricks
Databricks' open-source MoE model with strong code and reasoning capabilities.
Input
$0.75/M
Output
$0.75/M
Context
32K
Zhipu AI
Zhipu AI's flagship model with strong Chinese and English bilingual capabilities.
Input
$1.00/M
Output
$3.00/M
Context
128K
Zhipu AI
Zhipu AI's latest open-weight MoE model with interleaved thinking and state-of-the-art coding performance.
Input
$0.50/M
Output
$1.50/M
Context
200K
OpenAI
OpenAI's cost-efficient reasoning model with multimodal input, strong math and coding performance at a fraction of o3 pricing.
Input
$1.10/M
Output
$4.40/M
Context
200K
OpenAI
OpenAI's highest-quality reasoning model with extended compute for complex scientific and mathematical problems.
Input
$20.00/M
Output
$80.00/M
Context
200K
Google
Google's ultra-efficient model offering better performance than Gemini 1.5 Flash at the same cost point.
Input
$0.07/M
Output
$0.30/M
Context
1.0M
Mistral AI
Mistral's open-weight 675B MoE model with 41B active parameters, multimodal input, and 256K context.
Input
$0.50/M
Output
$1.50/M
Context
256K
Mistral AI
Mistral's mid-tier model offering 90% of Claude Sonnet quality at significantly lower cost.
Input
$0.40/M
Output
$2.00/M
Context
131K
Mistral AI
Compact 24B model with image understanding, 128K context, and Apache 2.0 license.
Input
$0.10/M
Output
$0.30/M
Context
128K
Mistral AI
Mistral's specialized code model supporting 80+ languages with 256K context and fill-in-the-middle capability.
Input
$0.30/M
Output
$0.90/M
Context
256K
xAI
xAI's efficient reasoning model with fast inference and competitive performance at budget pricing.
Input
$0.30/M
Output
$0.50/M
Context
131K
Microsoft
Microsoft's 3.8B parameter model with 128K context, strong reasoning capability for on-device deployment.
Input
$0.01/M
Output
$0.01/M
Context
128K
Microsoft
Microsoft's 5.6B compact model unifying text, vision, and speech in a single architecture.
Input
$0.02/M
Output
$0.02/M
Context
128K
Microsoft
Chain-of-thought reasoning variant of Phi-4, competitive with much larger models on math and logic tasks.
Input
$0.04/M
Output
$0.04/M
Context
32K
DeepSeek
Unified reasoning and non-reasoning model that merges DeepSeek-V3 and R1 capabilities into a single architecture.
Input
$0.28/M
Output
$0.42/M
Context
128K
DeepSeek
R1's reasoning capability distilled into a Llama 3.1 70B architecture for efficient deployment.
Input
$0.18/M
Output
$0.18/M
Context
128K
Cohere
Cohere's 111B parameter model supporting 23 languages with enterprise tool use and 256K context.
Input
$2.50/M
Output
$10.00/M
Context
256K
Alibaba/Qwen
Alibaba's dense 32B model with dual thinking/non-thinking modes and strong reasoning performance.
Input
$0.08/M
Output
$0.20/M
Context
131K
Alibaba/Qwen
Compact 8B model from the Qwen3 family with thinking mode support and strong efficiency for on-device use.
Input
$0.03/M
Output
$0.06/M
Context
131K
Alibaba/Qwen
Ultra-efficient MoE model with 128 experts and only 3.3B active parameters, ideal for cost-sensitive deployments.
Input
$0.02/M
Output
$0.04/M
Context
131K
xAI
xAI's four-agent parallel collaboration system with a rapid-learning architecture and medical document analysis. Beta release.
Input
$3.00/M
Output
$15.00/M
Context
131K
Cohere
Cohere's compact multilingual model supporting 70+ languages. Runs on consumer devices, including phones. Outperforms Gemma 3 4B in 46 of 61 languages.
Input
$0.01/M
Output
$0.01/M
Context
32K
TII
Compact Falcon model for resource-constrained deployments with strong reasoning.
Input
$0.04/M
Output
$0.08/M
Context
32K
TII
Smallest Falcon model for edge inference and mobile deployment.
Input
$0.02/M
Output
$0.04/M
Context
32K
Allen AI
Fully open model with all components public: data, code, weights, and checkpoints. Instruct, Think, and RL Zero variants.
Input
$0.25/M
Output
$0.75/M
Context
128K
Allen AI
Outperforms Llama 3.1 8B. Everything released: training data, weights, code, recipes, and checkpoints.
Input
$0.07/M
Output
$0.14/M
Context
128K
Allen AI
Open multimodal model for visual understanding, image captioning, and visual question answering.
Input
$0.40/M
Output
$1.20/M
Context
128K
Zhipu AI
Zhipu's largest text generation model at 754B parameters.
Input
$2.00/M
Output
$6.00/M
Context
256K
Zhipu AI
Vision-language MoE model with superior performance at lower inference cost.
Input
$0.15/M
Output
$0.30/M
Context
128K
Zhipu AI
Open-source video generation model creating 6-second clips at 720x480. Supports LoRA fine-tuning.
Input
$0.05/M
Output
$0.05/M
BAAI
State-of-the-art multimodal embedding model for visual search applications.
Input
$0.02/M
Output
$0.02/M
Context
8K
Google DeepMind
An enhanced iteration of Google DeepMind's Veo series that produces 8-second clips seamlessly extendable up to 148 seconds through iterative generation. Veo 3.1 improves temporal consistency over long sequences, delivers higher-resolution output, and refines audio synchronization for extended storytelling and commercial content production.
Input
$3.00/M
Output
$80.00/M
Runway
Runway's high-performance video generation model optimized for professional content creation at up to 4K resolution. Gen-4 Turbo maintains consistent characters and environments across shots, supports detailed camera control, and delivers studio-grade output with dramatically reduced generation times compared to its predecessors.
Input
$5.00/M
Output
$120.00/M
Kuaishou
Kuaishou's advanced video generation model capable of simultaneous audio-visual synthesis, producing clips with fully synchronized dialogue, music, and environmental sounds. Kling 2.6 excels at generating realistic human motion, facial expressions, and complex multi-object interactions while maintaining strong temporal consistency across extended sequences.
Input
$2.00/M
Output
$40.00/M
Pika Labs
Pika Labs' video generation model focused on ultra-realistic output with enhanced physics simulation. Pika 2.5 handles complex material interactions such as fluid dynamics, cloth draping, and particle effects with high fidelity. Its intuitive prompt interface and style controls make it accessible for creators seeking photorealistic short-form video content.
Input
$2.00/M
Output
$50.00/M
Luma AI
Luma AI's video generation model delivering native 1080p output at 4x faster inference speeds than previous versions, with optional 4K upscaling. Ray3.14 specializes in photorealistic 3D-aware video synthesis with strong spatial understanding, making it particularly effective for product visualization, architectural walkthroughs, and immersive content creation.
Input
$2.00/M
Output
$45.00/M
MiniMax
MiniMax's Hailuo 2.3 video model combines photorealistic rendering with versatile style support including anime, watercolor, and cinematic looks. It features advanced motion control, accurate lip-sync for dialogue scenes, and sophisticated lighting effects that adapt dynamically to scene content and camera movement.
Input
$2.00/M
Output
$50.00/M
Lightricks
Lightricks' open-source video generation model capable of producing native 4K video at 50 frames per second with clips up to 20 seconds in length. LTX-2 includes native audio synthesis and offers full model weights under a permissive license, making it a leading choice for researchers and developers building custom video generation pipelines.
Input
Free
Output
Free
Alibaba/Qwen
Alibaba's open-source video generation model that achieved the number one ranking on the VBench video quality benchmark upon release. With 14 billion parameters, Wan 2.1 demonstrates exceptional prompt adherence, temporal consistency, and visual quality across diverse content types, establishing a new baseline for open-weight video synthesis models.
Input
Free
Output
Free
Alibaba/Qwen
The successor to Wan 2.1, this open-source model introduces a Mixture-of-Experts flow-matching architecture with approximately 27 billion total parameters and 14 billion active during inference. Wan 2.2 delivers significantly improved motion quality, fine-grained detail, and extended generation lengths while maintaining the accessibility of fully open weights.
Input
Free
Output
Free
Tencent
Tencent's open-source video generation model with 8.3 billion parameters, featuring a novel Spatial-Temporal Self-Attention (SSTA) mechanism for improved temporal coherence. HunyuanVideo 1.5 supports diverse aspect ratios, variable frame rates, and extended clip durations, making it a versatile foundation model for the open-source video generation community.
Input
Free
Output
Free
Stability AI
Stability AI's specialized model for 4D novel-view video synthesis, generating temporally consistent multi-angle video from a single input clip or image. Stable Video 4D 2.0 enables creators to produce orbiting camera paths, bullet-time effects, and 3D-aware video transformations that maintain geometric and photometric coherence throughout the sequence.
Input
$2.00/M
Output
$40.00/M
OpenAI
An optimized successor to GPT Image 1 that delivers 20% lower cost and 4x faster generation while maintaining equivalent visual quality. GPT Image 1.5 introduces improved batch processing, enhanced style consistency for multi-image projects, and refined detail handling for professional design and marketing workflows.
Input
$8.00/M
Output
$32.00/M
OpenAI
A cost-efficient variant of OpenAI's image generation model offering 54-70% lower pricing while retaining strong prompt adherence and visual quality for standard use cases. GPT Image 1 Mini is optimized for high-volume applications such as e-commerce product imagery, social media content, and rapid prototyping where speed and cost matter more than maximum fidelity.
Input
$2.50/M
Output
$8.00/M
Google
A multimodal extension of Google's Gemini 2.5 Flash model that adds native image generation and editing capabilities alongside text understanding. This model enables conversational image creation, iterative visual refinement, and combined text-image output within a single unified interface, making it particularly effective for design iteration and creative brainstorming workflows.
Input
$0.15/M
Output
$30.00/M
Black Forest Labs
The open-weights development version of FLUX.2 with the same 32 billion parameter architecture as the Pro variant, released for non-commercial research and experimentation. FLUX.2 Dev provides researchers full access to model weights for fine-tuning, distillation, and architectural exploration while delivering near-Pro-level quality for academic and personal projects.
Input
Free
Output
Free
Adobe
Adobe's latest commercially safe image generation model trained exclusively on licensed and public domain content, delivering photorealistic output at native 4-megapixel resolution. Firefly Image 5 integrates deeply with Adobe Creative Cloud, offering advanced composition controls, style references, and seamless editing workflows within Photoshop and Illustrator.
Input
$4.00/M
Output
$35.00/M
Adobe
Adobe's fourth-generation Firefly image model offering improved quality, faster generation, and enhanced creative controls compared to its predecessors. Firefly Image 4 provides robust structure references, style transfer, and generative fill capabilities, all trained on Adobe's commercially licensed dataset to ensure IP safety for enterprise and professional use.
Input
$3.00/M
Output
$25.00/M
Stability AI
Stability AI's largest open-source image generation model built on the Multimodal Diffusion Transformer (MMDiT) architecture. SD 3.5 Large delivers high-quality results across photorealistic and artistic styles with strong prompt adherence, accurate text rendering, and diverse composition capabilities, available under an open license for both research and commercial use.
Input
$0.50/M
Output
$6.50/M
Ideogram
Ideogram's third-generation model combining exceptional photorealism with industry-leading text rendering accuracy within generated images. Ideogram 3.0 handles complex typography, logos, signs, and handwritten text with remarkable fidelity, making it the preferred choice for design professionals working on brand assets, marketing materials, and content requiring reliable in-image text.
Input
$2.00/M
Output
$20.00/M
NVIDIA
Input
Free
Output
Free
Recraft
Recraft's flagship image generation model that achieved the number one ranking on the Hugging Face text-to-image leaderboard, with native support for both raster and vector output formats. Recraft V3 excels at brand-consistent design, offering precise color palette control, style locking, and batch generation capabilities that make it uniquely suited for professional design systems.
Input
$2.00/M
Output
$20.00/M
Microsoft
Microsoft's first in-house image generation model developed by Microsoft AI, designed for integration across Microsoft's product ecosystem including Designer, Copilot, and Bing Image Creator. MAI-Image-1 focuses on safety, controllability, and consistent quality, with built-in content filtering and provenance metadata for responsible enterprise deployment.
Input
$2.00/M
Output
$15.00/M
MiniMax
World's first open-weight large-scale hybrid-attention reasoning model. Natively supports 1M token context.
Input
$0.30/M
Output
$0.90/M
Context
1.0M
Tencent
One of the largest open-source MoE models. Supports text sequences up to 256K tokens.
Input
$0.50/M
Output
$1.50/M
Context
256K
Tencent
World's largest open-source text-to-image model using MoE architecture with 64 experts.
Input
$0.03/M
Output
$0.03/M
Shanghai AI Lab
State-of-the-art open multimodal LLM scoring 72.2 on MMMU. New record among open MLLMs.
Input
$0.40/M
Output
$1.20/M
Context
128K
Shanghai AI Lab
Latest InternLM series model. Efficient for research and application development.
Input
$0.07/M
Output
$0.14/M
Context
128K
Shanghai AI Lab
Advanced vision-language model with improved document and chart understanding capabilities.
Input
$0.40/M
Output
$1.20/M
Context
128K
01.AI
Mid-size Yi model with enhanced inference speed for extended prompts.
Input
$0.10/M
Output
$0.20/M
Context
128K
01.AI
Compact Yi model offering strong reasoning at minimal resource requirements.
Input
$0.06/M
Output
$0.12/M
Context
128K
01.AI
Vision-language Yi model for image understanding and visual question answering.
Input
$0.30/M
Output
$0.60/M
Context
16K
BigCode
Compact code model trained on 4T+ tokens and 600+ languages from The Stack v2.
Input
$0.03/M
Output
$0.06/M
Context
16K
BigCode
Mid-size code model matching CodeLlama 13B quality at half the parameters.
Input
$0.07/M
Output
$0.14/M
Context
16K
Stability AI
Lightweight language model for on-device inference and resource-constrained environments.
Input
$0.02/M
Output
$0.04/M
Context
4K
Stability AI
Mid-size Stable Diffusion optimized for consumer GPUs and edge devices.
Input
$0.02/M
Output
$0.02/M
Black Forest Labs
Fastest FLUX model generating and editing images in under one second. Fully open under Apache 2.0.
Input
$0.01/M
Output
$0.01/M
Black Forest Labs
Fast open-source text-to-image model with 4-step generation. Apache 2.0 licensed.
Input
$0.02/M
Output
$0.02/M
Black Forest Labs
Premium text-to-image model with highest technical quality and 4.5-second generation.
Input
$0.05/M
Output
$0.05/M
Meta
Safety classification model for detecting unsafe content in LLM inputs and outputs.
Input
$0.05/M
Output
$0.05/M
Context
128K
Google
Smallest Gemma 2 model for efficient text processing on consumer hardware.
Input
$0.02/M
Output
$0.04/M
Context
8K
Baichuan
Premier Chinese LLM specializing in law, finance, medicine, and classical literature.
Input
$1.50/M
Output
$4.50/M
Context
128K
LG AI Research
Korean sovereign AI model using MoE with hybrid attention for reduced computation.
Input
$0.25/M
Output
$0.75/M
Context
128K
LG AI Research
Ultra-compact Korean AI model for on-device and mobile deployment.
Input
$0.02/M
Output
$0.04/M
Context
128K
Upstage
Agentic reasoning-focused model matching larger rivals. Strong multilingual capabilities.
Input
$0.20/M
Output
$0.60/M
Context
128K
BAAI
Most popular open embedding model. Multi-functionality, multi-linguality, multi-granularity in one model.
Input
$0.01/M
Output
$0.01/M
Context
8K
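Embedding models like this one produce vectors rather than text; retrieval then happens client-side, typically by cosine similarity between a query vector and stored document vectors. A minimal sketch (the vectors here are made up and tiny; real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

query = [0.1, 0.3, 0.5]
doc_a = [0.2, 0.6, 1.0]   # same direction as the query -> similarity 1.0
doc_b = [0.5, -0.1, 0.0]  # nearly orthogonal -> similarity close to 0
best = max([doc_a, doc_b], key=lambda d: cosine_similarity(query, d))
```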
OpenAI
Gold standard speech recognition model supporting 99+ languages. 1.55B parameter encoder-decoder architecture.
Input
$0.0060/M
Output
$0.0060/M
OpenAI
Speed-optimized Whisper variant with 6x faster inference at 809M parameters.
Input
$0.0030/M
Output
$0.0030/M
OpenAI
OpenAI's image generation model excelling at precision, complex prompts, and readable text rendering within images.
Input
$0.04/M
Output
$0.04/M
Anthropic
High-intelligence Sonnet model with 1M token context window. Strong balance of performance and cost.
Input
$3.00/M
Output
$15.00/M
Context
1.0M
Anthropic
Fastest and most cost-efficient Claude model designed for high-throughput, low-latency applications.
Input
$0.80/M
Output
$4.00/M
Context
200K
Specialized reasoning model designed for science, research, and complex engineering challenges.
Input
$5.00/M
Output
$15.00/M
Context
1.0M
Google
Smallest Gemma 3 model for edge and mobile deployment. Text-only with 128K context.
Input
$0.02/M
Output
$0.02/M
Context
128K
Google
Open vision-language model for image captioning, visual QA, and OCR tasks. Built on a Gemma 2 backbone.
Input
$0.30/M
Output
$0.60/M
Context
8K
Google
Mid-size PaliGemma for efficient vision-language tasks. Strong OCR and document understanding.
Input
$0.15/M
Output
$0.30/M
Context
8K
Alibaba/Qwen
Dense model with hybrid thinking/non-thinking modes. Seamless switching between complex reasoning and general dialogue.
Input
$0.20/M
Output
$0.60/M
Context
128K
Alibaba/Qwen
Compact Qwen3 model with hybrid reasoning for edge deployment and resource-constrained environments.
Input
$0.05/M
Output
$0.15/M
Context
128K
Alibaba/Qwen
Lightweight Qwen3 model for on-device AI applications with reasoning capability.
Input
$0.02/M
Output
$0.06/M
Context
128K
Alibaba/Qwen
Smallest Qwen3 model designed for ultra-lightweight deployment and edge inference.
Input
$0.01/M
Output
$0.03/M
Context
32K
Alibaba/Qwen
Compact vision-language model excelling at video and image analysis. Top small multimodal model on Hugging Face.
Input
$0.10/M
Output
$0.30/M
Context
128K
Alibaba/Qwen
Audio-language model for speech recognition, audio understanding, and music analysis.
Input
$0.10/M
Output
$0.30/M
Context
128K
Alibaba/Qwen
Smallest Qwen VL model for lightweight vision-language tasks on constrained hardware.
Input
$0.04/M
Output
$0.12/M
Context
128K
Alibaba/Qwen
Math-specialized model with step-by-step reasoning for complex mathematical problem solving.
Input
$0.40/M
Output
$1.20/M
Context
128K
DeepSeek
Vision-language model for image understanding, OCR, and visual reasoning tasks.
Input
$0.14/M
Output
$0.28/M
Context
128K
DeepSeek
Math-specialized model achieving gold-level scores in math competitions. Based on V3.2 architecture.
Input
$0.27/M
Output
$1.10/M
Context
128K
DeepSeek
Distilled R1 reasoning into compact Qwen-based model. Exceptional at math and programming.
Input
$0.07/M
Output
$0.14/M
Context
128K
DeepSeek
R1 reasoning distilled into Llama 3 architecture. Strong reasoning at minimal compute cost.
Input
$0.07/M
Output
$0.14/M
Context
128K
Mistral AI
Mistral's first reasoning model, with a 50% AIME-24 improvement via scalable RL. Reasons in 8+ languages.
Input
$2.00/M
Output
$6.00/M
Context
128K
Mistral AI
Open-source reasoning model built on Small 3.1 with SFT and RL training. Efficient multilingual reasoning.
Input
$0.20/M
Output
$0.60/M
Context
128K
Mistral AI
Coding-specialized model outperforming Qwen 3 Coder Flash despite its smaller size.
Input
$0.20/M
Output
$0.60/M
Context
128K
Mistral AI
Smallest Mistral model for edge computing and extremely resource-constrained deployments.
Input
$0.04/M
Output
$0.10/M
Context
128K
Mistral AI
Code model built on the Mamba SSM architecture for linear-time inference and theoretically unbounded context.
Input
$0.10/M
Output
$0.30/M
Context
256K
Microsoft
Enhanced reasoning model using 1.5x more tokens for higher accuracy on complex logical tasks.
Input
$0.07/M
Output
$0.14/M
Context
32K
NVIDIA
Hybrid Mamba-Transformer MoE with 4x higher throughput than its predecessor. Open weights and training data.
Input
$0.04/M
Output
$0.08/M
Context
1.0M
NVIDIA
Speed-optimized ASR model delivering 1000+ RTFx on the Open ASR Leaderboard. Exceptional accuracy.
Input
$0.0040/M
Output
$0.0040/M
Cohere
Multilingual model covering 23 languages for global enterprise deployment.
Input
$0.05/M
Output
$0.15/M
Context
128K
Cohere
State-of-the-art text embedding model for semantic search and RAG applications.
Input
$0.10/M
Output
$0.10/M
Context
8K
TII
Outperforms all models under 13B on the Hugging Face leaderboard. Trained on 14T tokens, with an innovative 1.58-bit quantized variant.
Input
$0.10/M
Output
$0.30/M
Context
32K
TII
Versatile 7B model with 30 checkpoint variants, including base, instruct, and quantized versions.
Input
$0.07/M
Output
$0.21/M
Context
32K
Nomic AI
First MoE embedding model. Trained on 1.6B pairs across ~100 languages with top-2 expert routing.
Input
$0.01/M
Output
$0.01/M
Context
8K
Jina AI
Universal multimodal embedding handling text, images, and documents in 30+ languages.
Input
$0.02/M
Output
$0.02/M
Context
8K
Amazon
Fast, cost-effective reasoning model with built-in code interpreter and web grounding.
Input
$0.80/M
Output
$2.40/M
Context
1.0M
Amazon
Speech-to-speech model for natural real-time conversations. Supports 7 languages.
Input
$0.50/M
Output
$0.50/M
Amazon
Image generation model with fine-grained control over composition, style, and content.
Input
$0.04/M
Output
$0.04/M
Apple
On-device model optimized for Apple silicon with 2-bit quantization. Powers Siri and Apple Intelligence.
Input
Free
Output
Free
Context
4K
Reka AI
Full multimodal model handling text, image, video, and audio inputs natively.
Input
$3.00/M
Output
$9.00/M
Context
128K
Reka AI
One of the few 21B models supporting full interleaved multimodal inputs. Handles videos up to 5 minutes.
Input
$0.80/M
Output
$2.40/M
Context
128K
Genmo
High-performance open text-to-video model excelling in text consistency.
Input
$0.05/M
Output
$0.05/M
IBM
Enterprise-grade model with strong instruction following for business applications.
Input
$0.10/M
Output
$0.20/M
Context
128K
IBM
Compact enterprise model for edge deployment and lightweight business tasks.
Input
$0.03/M
Output
$0.06/M
Context
128K
IBM
Updated Granite with enhanced coding and tool-use capabilities for enterprise automation.
Input
$0.10/M
Output
$0.20/M
Context
128K
IBM
Small enterprise model with coding support for lightweight automation workflows.
Input
$0.03/M
Output
$0.06/M
Context
128K
OpenBMB
Efficient vision-language model rivaling GPT-4V quality at a fraction of the size.
Input
$0.10/M
Output
$0.20/M
Context
128K
Hugging Face
Compact LLM designed for on-device AI. Surprisingly capable for its tiny size.
Input
$0.01/M
Output
$0.02/M
Context
8K
Hugging Face
Tiny but functional language model for extreme resource constraints and research.
Input
$0.0050/M
Output
$0.01/M
Context
8K
Naver
Korean sovereign AI with omnimodal capabilities. Specialized for Korean language and culture.
Input
$1.00/M
Output
$3.00/M
Context
128K
Mistral AI
Mid-size Mistral model bridging the gap between 8B edge models and large frontier offerings.
Input
$0.15/M
Output
$0.45/M
Context
128K
Alibaba/Qwen
Compact math-specialized model with chain-of-thought reasoning for mathematical problem solving.
Input
$0.07/M
Output
$0.14/M
Context
128K
Microsoft
Lightweight multimodal model for on-device and edge visual understanding.
Input
$0.05/M
Output
$0.10/M
Context
128K
Amazon
Amazon's video generation model producing high-quality short clips for advertising and social media.
Input
$0.04/M
Output
$0.04/M
NVIDIA
NVIDIA-tuned Llama 3.1 with reward-model-guided alignment. Excels at instruction following and helpful responses.
Input
$0.35/M
Output
$1.05/M
Context
128K
Cohere
Cohere's smallest Command model optimized for RAG, tool use, and multilingual enterprise applications.
Input
$0.04/M
Output
$0.08/M
Context
128K
Zhipu AI
Zhipu's latest generation model with improved reasoning, coding, and multilingual capabilities.
Input
$1.50/M
Output
$4.50/M
Context
128K
Google
Experimental Gemini model with extended chain-of-thought reasoning. Transparent thinking process with strong performance on math and science.
Input
$0.15/M
Output
$0.60/M
Context
1.0M
Cohere
Cohere's open multimodal model for visual understanding across 23 languages. Strong image captioning and visual QA.
Input
$0.25/M
Output
$0.50/M
Context
128K
OpenAI
A smaller, faster, and more affordable version of GPT-4o. Great for lightweight tasks.
Input
$0.15/M
Output
$0.60/M
Context
128K
Anthropic
Anthropic's fastest and most affordable model. Great for high-volume, low-latency tasks.
Input
$0.80/M
Output
$4.00/M
Context
200K
Google
Google's previous-gen flagship model with the longest context window in production.
Input
$1.25/M
Output
$5.00/M
Context
2.1M
Meta
Meta's largest open-source model. Competitive with frontier closed-source models.
Input
$0.80/M
Output
$0.80/M
Context
128K
Mistral AI
Mistral's flagship model with strong multilingual and code generation capabilities.
Input
$2.00/M
Output
$6.00/M
Context
128K
Mistral AI
Mistral's efficient model for everyday tasks. Fast and cost-effective.
Input
$0.10/M
Output
$0.30/M
Context
32K
xAI
xAI's large language model with real-time X (Twitter) data access and strong reasoning.
Input
$2.00/M
Output
$10.00/M
Context
131K
Cohere
Cohere's enterprise-grade model optimized for RAG, tool use, and business workflows.
Input
$2.50/M
Output
$10.00/M
Context
128K
OpenAI
A fast, affordable variant of GPT-4.1 for high-volume workloads.
Input
$0.40/M
Output
$1.60/M
Context
1.0M
Moonshot AI
Moonshot AI's reasoning-focused MoE model with chain-of-thought capabilities. 1T total params, 32B active.
Input
$0.47/M
Output
$2.00/M
Context
131K
Alibaba/Qwen
Hosted version of Qwen3.5 397B with 1M context window and adaptive thinking for complex tasks.
Input
$0.40/M
Output
$2.40/M
Context
1.0M
Alibaba/Qwen
Alibaba's large-scale reasoning model with ~1T parameters and chain-of-thought capabilities.
Input
$1.20/M
Output
$6.00/M
Context
256K
Alibaba/Qwen
Alibaba's efficient code-focused MoE model. 80B total params, 3B active, Apache 2.0 licensed.
Input
$0.12/M
Output
$0.75/M
Context
256K
OpenAI
OpenAI's fastest and cheapest model. Ideal for classification, autocomplete, and high-throughput tasks.
Input
$0.10/M
Output
$0.40/M
Context
1.0M
OpenAI
OpenAI's research preview with improved emotional intelligence and reduced hallucinations.
Input
$75.00/M
Output
$150.00/M
Context
128K
Anthropic
Upgraded Claude 3.5 Sonnet with major coding and tool-use improvements, plus computer use capability.
Input
$3.00/M
Output
$15.00/M
Context
200K
Google
Google's open-source multimodal model. Strong performance for its size with vision capabilities.
Input
$0.10/M
Output
$0.10/M
Context
128K
Google
Efficient open-source model from Google with multimodal capabilities at 12B parameters.
Input
$0.05/M
Output
$0.05/M
Context
128K
Google
Ultra-efficient open-source model from Google. Runs on mobile and edge devices.
Input
$0.02/M
Output
$0.02/M
Context
128K
Google
Google's previous-gen open-source model with strong general capabilities.
Input
$0.07/M
Output
$0.07/M
Context
8K
Google
Efficient open-source model from Google. Great performance-to-size ratio.
Input
$0.03/M
Output
$0.03/M
Context
8K
Google
Google's open-source code-focused model based on the Gemma architecture.
Input
$0.03/M
Output
$0.03/M
Context
8K
Meta
Meta's largest multimodal Llama model with image understanding capabilities.
Input
$0.35/M
Output
$0.40/M
Context
128K
Meta
Efficient multimodal Llama model for image + text tasks at 11B parameters.
Input
$0.06/M
Output
$0.06/M
Context
128K
Meta
Ultra-lightweight Llama model for edge deployment and mobile applications.
Input
$0.01/M
Output
$0.01/M
Context
128K
Meta
The smallest Llama model for on-device inference and constrained environments.
Input
$0.01/M
Output
$0.01/M
Context
128K
Meta
Meta's efficient open-source base model. Excellent for fine-tuning and custom deployments.
Input
$0.05/M
Output
$0.05/M
Context
128K
Meta
Meta's strong mid-range open-source model, predecessor to 3.3 with broad community support.
Input
$0.18/M
Output
$0.18/M
Context
128K
Meta
Meta's largest code-focused open-source model. Specialized for code generation and understanding.
Input
$0.18/M
Output
$0.18/M
Context
16K
Mistral AI
Mistral's first code-focused model with 32K context. Supports 80+ programming languages.
Input
$0.30/M
Output
$0.90/M
Context
32K
Mistral AI
Mistral's 12B open-source model co-developed with NVIDIA. Replaces Mistral 7B.
Input
$0.04/M
Output
$0.04/M
Context
128K
Mistral AI
Mistral's open-source multimodal model. Processes images natively alongside text.
Input
$0.10/M
Output
$0.10/M
Context
128K
Mistral AI
Mistral's flagship multimodal model. Built on Mistral Large with vision capabilities.
Input
$2.00/M
Output
$6.00/M
Context
128K
Mistral AI
Mistral's large open-source MoE model with 176B total params. Strong coding and reasoning.
Input
$0.65/M
Output
$0.65/M
Context
66K
Mistral AI
The original open-source MoE model that kicked off the trend. Fast and efficient.
Input
$0.24/M
Output
$0.24/M
Context
32K
Mistral AI
Mistral's edge-optimized model with a knowledge-dense 8B parameter design.
Input
$0.10/M
Output
$0.10/M
Context
128K
Mistral AI
The model that launched Mistral. Open-source, fast, and surprisingly capable for its 7B size.
Input
$0.06/M
Output
$0.06/M
Context
32K
DeepSeek
DeepSeek's open-source code-focused MoE model. Competitive with GPT-4 Turbo on coding.
Input
$0.14/M
Output
$0.28/M
Context
128K
DeepSeek
Merged general and coder capabilities from V2 into a unified model.
Input
$0.14/M
Output
$0.28/M
Context
128K
DeepSeek
R1 reasoning capabilities distilled into a compact Qwen-based 32B model.
Input
$0.12/M
Output
$0.18/M
Context
128K
Cohere
Cohere's open-weight model optimized for RAG and tool use. Strong multilingual support.
Input
$0.15/M
Output
$0.60/M
Context
128K
Cohere
Cohere's open-source multilingual model covering 23 languages with strong performance.
Input
$0.50/M
Output
$1.50/M
Context
128K
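The Input and Output figures above are quoted in US dollars per million tokens ("$/M"), billed separately for prompt and completion. A minimal sketch of that arithmetic — the `request_cost` helper is illustrative only and not part of any listed provider's API:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD for one request under $/M (per-million-token) pricing."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)

# Example: 10K prompt tokens + 2K completion tokens
# at $0.20/M input and $0.60/M output.
cost = request_cost(10_000, 2_000, 0.20, 0.60)
print(f"${cost:.4f}")  # → $0.0032
```

Note that output tokens are typically priced 3x or more above input tokens, so long completions — especially from reasoning models that emit chain-of-thought tokens — dominate the bill.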