Model Directory

OpenAI's reasoning model with chain-of-thought capabilities for complex problem solving.

Input

$15.00/M

Output

$60.00/M

Context

200K

Claude Opus 4

Anthropic

Anthropic's most powerful model. Top-tier performance on coding, analysis, and complex reasoning tasks.

Input

$15.00/M

Output

$75.00/M

Context

200K
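
The Input and Output figures on each card can be turned into a per-request estimate. The sketch below assumes "/M" means US dollars per one million tokens, which this listing does not state explicitly. The function name `request_cost` and the token counts are hypothetical, chosen only for illustration; the prices are the Claude Opus 4 figures listed above.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate the USD cost of one request.

    Assumes prices are quoted per one million tokens (the "/M" suffix).
    """
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Claude Opus 4 as listed: $15.00/M input, $75.00/M output.
# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
cost = request_cost(10_000, 2_000, 15.00, 75.00)
print(f"${cost:.2f}")  # prints "$0.30"
```

The same function applies to any text-model card in this directory by substituting its Input and Output prices.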

Gemini 2.5 Pro

Google

Google's most capable thinking model with breakthrough performance on reasoning and coding.

text, image, audio, video, code

Input

$1.25/M

Output

$10.00/M

Context

1.0M

GPT-4.1

OpenAI

OpenAI's latest GPT-4 series model with improved coding, instruction following, and long context.

Input

$2.00/M

Output

$8.00/M

Context

1.0M

Kimi K2.5

Moonshot AI

Moonshot AI's frontier multimodal MoE model with 1T total parameters (32B active). Tops SWE-bench and AIME 2025 benchmarks.

Input

$0.45/M

Output

$2.20/M

Context

256K

Qwen3.5 397B

Alibaba/Qwen

Alibaba's open-weight hybrid MoE model with 512 experts and 17B active parameters. Natively multimodal, with support for 201 languages. Top scores on GPQA and SWE-bench.

text, image, video, code

Input

$0.15/M

Output

$1.00/M

Context

256K

o3

OpenAI

OpenAI's most powerful reasoning model with breakthrough performance on math and coding benchmarks.

Input

$10.00/M

Output

$40.00/M

Context

200K

Llama 4 Maverick

Grok-3

xAI

xAI's frontier model, trained on the Colossus supercluster. Real-time data access and strong reasoning.

Input

$3.00/M

Output

$15.00/M

Context

131K

Sora 2

OpenAI

OpenAI's second-generation video synthesis model capable of producing cinematic-quality videos up to 60 seconds long with synchronized audio. Built on an advanced Diffusion Transformer (DiT) architecture, Sora 2 excels at complex scene composition, realistic physics simulation, and coherent multi-character narratives with natural dialogue and ambient sound.

Input

$5.00/M

Output

$100.00/M

Veo 3

Google

Google DeepMind's flagship video generation model that natively produces joint audio-visual output in a single pass. Veo 3 leverages a Latent Diffusion Transformer to generate high-fidelity clips with synchronized dialogue, sound effects, and ambient audio without requiring a separate audio model. It demonstrates strong physical understanding and prompt adherence across diverse cinematic styles.

Input

$5.00/M

Output

$150.00/M

Runway Gen-4.5

Runway

Runway's latest flagship model built on a novel Autoregressive-to-Diffusion (A2D) hybrid architecture that first plans scene structure autoregressively and then renders frames through diffusion. Gen-4.5 achieves state-of-the-art temporal coherence, photorealistic detail, and nuanced control over motion dynamics, lighting, and artistic style.

video

Input

$5.00/M

Output

$130.00/M

Kling 3.0

Kuaishou

The latest entry in Kuaishou's Kling series, introducing multi-shot sequence generation and an AI Director mode that automatically plans camera angles, transitions, and pacing. Kling 3.0 produces broadcast-quality video with native audio synthesis, making it suitable for short-form content creation, advertising, and virtual production workflows.

Input

$3.00/M

Output

$60.00/M

Seedance 2.0

ByteDance

ByteDance's unified multimodal generation model that handles video, audio, and image synthesis within a single architecture. Seedance 2.0 produces highly coherent audiovisual content with strong temporal consistency, supporting diverse creative workflows from music video generation to product advertisement creation with synchronized narration and effects.

video, audio, image

Input

$3.00/M

Output

$70.00/M

MiniMax M2.5

MiniMax

Achieves 80.2% on SWE-Bench Verified, matching Opus 4.6 at 1/20th the cost. Ranks first on Multi-SWE-Bench at 51.3%.

Input

$0.25/M

Output

$0.75/M

Context

128K

GPT Image 1

OpenAI

OpenAI's native image generation capability integrated directly into GPT-4o, enabling conversational image creation and iterative editing through natural language. GPT Image 1 excels at accurate text rendering within images, complex multi-element compositions, and faithful adherence to detailed prompts across photorealistic, illustrative, and artistic styles.

Input

$10.00/M

Output

$40.00/M

Imagen 4

Google

Google DeepMind's fourth-generation image synthesis model capable of producing images up to 2K resolution with exceptional photorealism and compositional accuracy. Imagen 4 includes SynthID watermarking by default for responsible AI deployment, supports advanced inpainting and outpainting, and demonstrates industry-leading performance on text rendering and spatial reasoning tasks.

Input

$4.00/M

Output

$20.00/M

FLUX.2 Pro

Black Forest Labs

Black Forest Labs' flagship commercial image generation model with 32 billion parameters, delivering up to 4-megapixel resolution output with exceptional detail and prompt fidelity. FLUX.2 Pro achieves state-of-the-art results in photorealism, typography rendering, and complex scene composition, making it a top choice for professional creative applications.

Input

$3.00/M

Output

$30.00/M

Most capable open VLM rivaling GPT-5 across multimodal benchmarks. Strong reasoning and agentic capabilities.

text, image, video

Input

$0.30/M

Output

$0.60/M

Context

128K

DeepSeek-V3.1

DeepSeek

Hybrid model combining V3 and R1 strengths. Improved reasoning with RL techniques from R1.

Input

$0.27/M

Output

$1.10/M

Context

128K

GPT-5.2-Codex

OpenAI

OpenAI's agentic coding model with context compaction and long-horizon task completion. First model in the GPT-5 Codex series.

Input

$1.75/M

Output

$14.00/M

Context

400K

GPT-5.3-Codex

OpenAI

OpenAI's most capable coding model combining Codex and GPT-5 training stacks. Agentic coding, research, and tool use with 77.3% on Terminal-Bench 2.0.

Input

$2.00/M

Output

$16.00/M

Context

400K

Claude Opus 4.6

Anthropic

Anthropic's strongest reasoning and coding model. 80.8% on SWE-bench Verified, 1M context (beta), agent teams, and extended thinking.

Input

$5.00/M

Output

$25.00/M

Context

1.0M

Claude Sonnet 4.6

Anthropic

Matches Opus 4.6 on most benchmarks at 1/5 the cost. 79.6% on SWE-bench, 1M context, computer use, and design capabilities.

Input

$3.00/M

Output

$15.00/M

Context

1.0M

Gemini 3.1 Pro

Google

Google's most capable model. 94.3% on GPQA Diamond, 80.6% on SWE-bench, 77.1% on ARC-AGI-2. #1 on 12 of 18 tracked benchmarks.

text, image, audio, video, code

Input

$2.00/M

Output

$12.00/M

Context

1.0M

DeepSeek V4

DeepSeek

DeepSeek's 1T-parameter coding-focused model with 1M+ context. Three architectural innovations: Manifold-Constrained Hyper-Connections, Engram memory, and Sparse Attention.

Input

$0.10/M

Output

$0.40/M

Context

1.0M

Amazon Nova 2 Pro

Amazon

Amazon's most intelligent model for complex multi-step reasoning and agentic workflows.

text, image, video, audio

Input

$4.00/M

Output

$12.00/M

Context

1.0M

DALL-E 3

OpenAI

OpenAI's image generation model excelling at precision, complex prompts, and readable text rendering within images.

Input

$0.04/M

Output

$0.04/M

Llama 3.1 Nemotron 70B

NVIDIA

NVIDIA-tuned Llama 3.1 with reward-model-guided alignment. Excels at instruction following and helpful responses.

Input

$0.35/M

Output

$1.05/M

Context

Grok-2

xAI

xAI's large language model with real-time X (Twitter) data access and strong reasoning.

Input

$2.00/M

Output

$10.00/M

Context

131K

GLM-4.6

Zhipu AI

Zhipu's latest generation model with improved reasoning, coding, and multilingual capabilities.

Input

$1.50/M

Output

$4.50/M

Context

128K

Alibaba's large-scale reasoning model with ~1T parameters and chain-of-thought capabilities.

Input

$1.20/M

Output

$6.00/M

Context

256K

Hunyuan-Large

Tencent

One of the largest open-source MoE models. Supports text sequences up to 256K tokens.

Input

$0.50/M

Output

$1.50/M

Context

256K

GPT-4.5 Preview

OpenAI

OpenAI's research preview with improved emotional intelligence and reduced hallucinations.

Input

$75.00/M

Output

$150.00/M

Context

128K

Hunyuan Image 3.0

Tencent

World's largest open-source text-to-image model using MoE architecture with 64 experts.

Input

$0.03/M

Output

$0.03/M

InternVL3 78B

Shanghai AI Lab

State-of-the-art open multimodal LLM scoring 72.2 on MMMU. New record among open MLLMs.

Input

$0.40/M

Output

$1.20/M

Context

128K

Qwen3 235B

Alibaba/Qwen

Alibaba's large-scale open-source MoE model with thinking mode support.

Input

$0.20/M

Output

$0.60/M

Context

128K

Nemotron-4 340B

NVIDIA

NVIDIA's large open-source model trained for synthetic data generation.

Input

$1.20/M

Output

$1.20/M

Context

Yi-Large

01.AI

01.AI's frontier closed-source model with top-tier multilingual performance.

Input

$3.00/M

Output

$9.00/M

Context

32K

GLM-4.7

Zhipu AI

Zhipu AI's latest open-weight MoE model with interleaved thinking and state-of-the-art coding performance.

Input

$0.50/M

Output

$1.50/M

Context

200K

o3-pro

OpenAI

OpenAI's highest-quality reasoning model with extended compute for complex scientific and mathematical problems.

Input

$20.00/M

Output

$80.00/M

Context

200K

Mistral Large 3

Mistral AI

Mistral's open-weight 675B MoE model with 41B active parameters, multimodal input, and 256K context.

Input

$0.50/M

Output

$1.50/M

Context

256K

DeepSeek-V3.2

DeepSeek

Unified reasoning and non-reasoning model that merges DeepSeek-V3 and R1 capabilities into a single architecture.

Input

$0.28/M

Output

$0.42/M

Context

128K

Grok 4.20

xAI

xAI's 4-agent parallel collaboration system with rapid learning architecture and medical document analysis. Beta release.

Input

$3.00/M

Output

$15.00/M

Context

131K

Molmo 72B

Allen AI

Open multimodal model for visual understanding, image captioning, and visual question answering.

Input

$0.40/M

Output

$1.20/M

Context

128K

GLM-5

Zhipu AI

Zhipu's largest text generation model at 754B parameters.

Input

$2.00/M

Output

$6.00/M

Context

256K

InternVL2.5 78B

Shanghai AI Lab

Advanced vision-language model with improved document and chart understanding capabilities.

Input

$0.40/M

Output

$1.20/M

Context

128K

FLUX.1 Pro

Black Forest Labs

Premium text-to-image model with the highest technical quality and 4.5-second generation times.

Input

$0.05/M

Output

$0.05/M

Veo 3.1

Google

An enhanced iteration of Google DeepMind's Veo series, producing 8-second clips that can be seamlessly extended up to 148 seconds through iterative generation. Veo 3.1 improves temporal consistency over long sequences, delivers higher resolution output, and refines audio synchronization for extended storytelling and commercial content production.

Input

$3.00/M

Output

$80.00/M

Runway Gen-4 Turbo

Runway

Runway's high-performance video generation model optimized for professional content creation at up to 4K resolution. Gen-4 Turbo maintains consistent characters and environments across shots, supports detailed camera control, and delivers studio-grade output with dramatically reduced generation times compared to its predecessors.

video

Input

$5.00/M

Output

$120.00/M

Baichuan 4

Baichuan

Premier Chinese LLM specializing in law, finance, medicine, and classical literature.

Input

$1.50/M

Output

$4.50/M

Context

128K

Kling 2.6

Kuaishou

Kuaishou's advanced video generation model capable of simultaneous audio-visual synthesis, producing clips with fully synchronized dialogue, music, and environmental sounds. Kling 2.6 excels at generating realistic human motion, facial expressions, and complex multi-object interactions while maintaining strong temporal consistency across extended sequences.