Explore 156+ AI models from 49 providers. Filter by capability, tier, and pricing to find the right model.
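
All prices in this catalog are quoted per million tokens ("/M"), billed separately for input (prompt) and output (completion) tokens. As a minimal sketch of how to turn those rates into a per-request cost (the $0.27/$1.10 rates below are DeepSeek's listed prices; the token counts are made up for illustration):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Cost in dollars for one request, given per-million-token rates."""
    return (input_tokens / 1_000_000) * input_per_m + \
           (output_tokens / 1_000_000) * output_per_m

# Example: 8,000 input tokens and 2,000 output tokens at
# $0.27/M input and $1.10/M output (DeepSeek's listed rates).
cost = request_cost(8_000, 2_000, 0.27, 1.10)
print(f"${cost:.4f}")  # → $0.0044
```

Note that output tokens are usually several times more expensive than input tokens, so generation-heavy workloads should be compared on output price first.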
DeepSeek
DeepSeek's open-source MoE model rivaling frontier models at a fraction of the cost.
Input $0.27/M | Output $1.10/M | Context 128K

Alibaba/Qwen
Specialized code model trained on 7.5T tokens (70% code). Supports 100+ programming languages and agentic workflows.
Input $0.30/M | Output $0.60/M | Context 262K

DeepSeek
DeepSeek's 1T-parameter coding-focused model with 1M+ context. Three architectural innovations: Manifold-Constrained Hyper-Connections, Engram memory, and Sparse Attention.
Input $0.10/M | Output $0.40/M | Context 1.0M

Alibaba/Qwen
Most capable open VLM, rivaling GPT-5 across multimodal benchmarks. Strong reasoning and agentic capabilities.
Input $0.30/M | Output $0.60/M | Context 128K

Meta
Meta's latest open-source MoE model with 17B active parameters and an industry-leading 10M-token context.
Input $0.15/M | Output $0.60/M | Context 10M

Meta
Meta's powerful open-source MoE model with 400B total params and a 1M context window.
Input $0.50/M | Output $2.00/M | Context 1.0M

MiniMax
Achieves 80.2% on SWE-Bench Verified, matching Opus 4.6 at 1/20th the cost. Ranks first on Multi-SWE-Bench at 51.3%.
Input $0.25/M | Output $0.75/M | Context 128K

DeepSeek
DeepSeek's reasoning model with transparent chain-of-thought. Open-source and highly competitive.
Input $0.55/M | Output $2.19/M | Context 128K

DeepSeek
Hybrid model combining V3 and R1 strengths. Improved reasoning with RL techniques from R1.
Input $0.27/M | Output $1.10/M | Context 128K

Alibaba/Qwen
Alibaba's open-weight hybrid MoE model with 512 experts and 17B active parameters. Natively multimodal, with support for 201 languages. Top scores on GPQA and SWE-bench.
Input $0.15/M | Output $1.00/M | Context 256K

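
The "Context" figure is the maximum number of tokens a model can attend to in one request, input and output combined. A rough way to check whether a document fits, using the common ~4 characters-per-token heuristic for English text (the heuristic itself, not any particular tokenizer, is the assumption here):

```python
def fits_in_context(text: str, context_tokens: int,
                    chars_per_token: float = 4.0) -> bool:
    """Rough check: does this text fit within a model's context window?

    Uses the ~4 characters/token rule of thumb for English; real
    tokenizers vary, so leave headroom for the model's response.
    """
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= context_tokens

doc = "word " * 200_000                  # ~1M characters ≈ 250K tokens
print(fits_in_context(doc, 128_000))     # 128K window → False
print(fits_in_context(doc, 1_000_000))   # 1M window → True
```

For precise counts, use the model's own tokenizer; this estimate is only for quick triage between, say, a 32K and a 1M context tier.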
Meta
Meta's open-source model matching GPT-4 class performance at 70B parameters.
Input $0.18/M | Output $0.18/M | Context 128K

Meta
Ultra-lightweight Llama model for edge deployment and mobile applications.
Input $0.01/M | Output $0.01/M | Context 128K

Meta
The smallest Llama model for on-device inference and constrained environments.
Input $0.01/M | Output $0.01/M | Context 128K

Meta
Meta's efficient open-source base model. Excellent for fine-tuning and custom deployments.
Input $0.05/M | Output $0.05/M | Context 128K

Meta
Meta's strong mid-range open-source model, predecessor to 3.3 with broad community support.
Input $0.18/M | Output $0.18/M | Context 128K

Meta
Meta's largest code-focused open-source model. Specialized for code generation and understanding.
Input $0.18/M | Output $0.18/M | Context 16K

Mistral AI
Mistral's first code-focused model with 32K context. Supports 80+ programming languages.
Input $0.30/M | Output $0.90/M | Context 32K

Mistral AI
Mistral's 12B open-source model co-developed with NVIDIA. Replaces Mistral 7B.
Input $0.04/M | Output $0.04/M | Context 128K

Mistral AI
Mistral's open-source multimodal model. Processes images natively alongside text.
Input $0.10/M | Output $0.10/M | Context 128K

Mistral AI
Mistral's large open-source MoE model with 176B total params. Strong coding and reasoning.
Input $0.65/M | Output $0.65/M | Context 66K

Mistral AI
The original open-source MoE model that started the MoE trend. Fast and efficient.
Input $0.24/M | Output $0.24/M | Context 32K

Mistral AI
The model that launched Mistral. Open-source, fast, and surprisingly capable for 7B.
Input $0.06/M | Output $0.06/M | Context 32K

DeepSeek
DeepSeek's open-source code-focused MoE model. Competitive with GPT-4 Turbo on coding.
Input $0.14/M | Output $0.28/M | Context 128K

DeepSeek
Merged general and coder capabilities from V2 into a unified model.
Input $0.14/M | Output $0.28/M | Context 128K

DeepSeek
R1 reasoning capabilities distilled into a compact Qwen-based 32B model.
Input $0.12/M | Output $0.18/M | Context 128K

Cohere
Cohere's open-weight model optimized for RAG and tool use. Strong multilingual support.
Input $0.15/M | Output $0.60/M | Context 128K

Cohere
Cohere's open-source multilingual model covering 23 languages with strong performance.
Input $0.50/M | Output $1.50/M | Context 128K

Alibaba/Qwen
Alibaba's flagship open-source model. Competitive with Llama 3.1 405B at a fraction of the size.
Input $0.30/M | Output $0.30/M | Context 128K

Alibaba/Qwen
Strong mid-range open-source model from Alibaba with broad capabilities.
Input $0.08/M | Output $0.08/M | Context 128K

Alibaba/Qwen
Efficient open-source model balancing capability and speed.
Input $0.05/M | Output $0.05/M | Context 128K

Alibaba/Qwen
Compact open-source model for edge deployment and fine-tuning.
Input $0.03/M | Output $0.03/M | Context 32K

Alibaba/Qwen
Alibaba's open-source coding specialist. Matches GPT-4o on code benchmarks.
Input $0.08/M | Output $0.08/M | Context 128K

Alibaba/Qwen
Compact open-source coding model with impressive code generation capabilities.
Input $0.03/M | Output $0.03/M | Context 128K

Alibaba/Qwen
Alibaba's open-source vision-language model with video understanding capabilities.
Input $0.40/M | Output $0.40/M | Context 32K

Alibaba/Qwen
Alibaba's open-source reasoning model with transparent chain-of-thought. Competitive with o1-mini.
Input $0.10/M | Output $0.30/M | Context 32K

Alibaba/Qwen
Alibaba's large-scale open-source MoE model with thinking mode support.
Input $0.20/M | Output $0.60/M | Context 128K

Microsoft
Microsoft's 14B open-source model whose training innovations let it punch above its weight class.
Input $0.04/M | Output $0.04/M | Context 16K

Microsoft
Microsoft's compact open-source model with 128K context. Great for on-device inference.
Input $0.01/M | Output $0.01/M | Context 128K

Microsoft
Microsoft's open-source MoE model with 42B total params and only 6.6B active.
Input $0.06/M | Output $0.06/M | Context 128K

Microsoft
Microsoft's 14B open-source model with 128K context and strong reasoning capabilities.
Input $0.04/M | Output $0.04/M | Context 128K

Microsoft
Microsoft's instruction-tuned MoE model based on Mixtral. Strong on complex reasoning tasks.
Input $0.65/M | Output $0.65/M | Context 66K

NVIDIA
NVIDIA's optimized Llama 3.1 variant with custom reward model training.
Input $0.18/M | Output $0.18/M | Context 128K

NVIDIA
NVIDIA's large open-source model trained for synthetic data generation.
Input $1.20/M | Output $1.20/M | Context 4K

AI21 Labs
AI21's hybrid SSM-Transformer model with 256K context. Novel Mamba architecture.
Input $2.00/M | Output $8.00/M | Context 256K

AI21 Labs
Compact version of Jamba with hybrid SSM-Transformer architecture.
Input $0.20/M | Output $0.40/M | Context 256K

TII
TII's largest open-source model. One of the first truly open 180B-parameter models.
Input $0.80/M | Output $0.80/M | Context 2K

TII
TII's efficient open-source model with multimodal capabilities.
Input $0.04/M | Output $0.04/M | Context 8K

01.AI
01.AI's open-source 34B model with strong bilingual (English/Chinese) capabilities.
Input $0.10/M | Output $0.10/M | Context 4K

Shanghai AI Lab
Open-source model with 1M context from Shanghai AI Lab. Strong coding and math skills.
Input $0.06/M | Output $0.06/M | Context 1.0M

Shanghai AI Lab
Open-source vision-language model with strong image understanding capabilities.
Input $0.08/M | Output $0.08/M | Context 8K

Stability AI
Stability AI's open-source language model with multilingual support.
Input $0.04/M | Output $0.04/M | Context 4K

Allen AI
Fully open-source model from Allen AI with open training data, code, and weights.
Input $0.04/M | Output $0.04/M | Context 4K

BigCode
Open-source code model from BigCode/Hugging Face trained on The Stack v2.
Input $0.04/M | Output $0.04/M | Context 16K

Snowflake
Snowflake's open-source enterprise MoE model optimized for SQL and business tasks.
Input $0.30/M | Output $0.30/M | Context 4K

Databricks
Databricks' open-source MoE model with strong code and reasoning capabilities.
Input $0.75/M | Output $0.75/M | Context 32K

Zhipu AI
Zhipu AI's latest open-weight MoE model with interleaved thinking and state-of-the-art coding performance.
Input $0.50/M | Output $1.50/M | Context 200K

Mistral AI
Mistral's open-weight 675B MoE model with 41B active parameters, multimodal input, and 256K context.
Input $0.50/M | Output $1.50/M | Context 256K

Mistral AI
Compact 24B model with image understanding, 128K context, and an Apache 2.0 license.
Input $0.10/M | Output $0.30/M | Context 128K

Microsoft
Microsoft's 3.8B-parameter model with 128K context and strong reasoning capability for on-device deployment.
Input $0.01/M | Output $0.01/M | Context 128K

Microsoft
Microsoft's 5.6B compact model unifying text, vision, and speech in a single architecture.
Input $0.02/M | Output $0.02/M | Context 128K

Microsoft
Chain-of-thought reasoning variant of Phi-4, competitive with much larger models on math and logic tasks.
Input $0.04/M | Output $0.04/M | Context 32K

DeepSeek
Unified reasoning and non-reasoning model that merges DeepSeek-V3 and R1 capabilities into a single architecture.
Input $0.28/M | Output $0.42/M | Context 128K

DeepSeek
R1's reasoning capability distilled into a Llama 3.1 70B architecture for efficient deployment.
Input $0.18/M | Output $0.18/M | Context 128K

Cohere
Cohere's 111B-parameter model supporting 23 languages with enterprise tool use and 256K context.
Input $2.50/M | Output $10.00/M | Context 256K

Alibaba/Qwen
Alibaba's dense 32B model with dual thinking/non-thinking modes and strong reasoning performance.
Input $0.08/M | Output $0.20/M | Context 131K

Alibaba/Qwen
Compact 8B model from the Qwen3 family with thinking mode support and strong efficiency for on-device use.
Input $0.03/M | Output $0.06/M | Context 131K

Alibaba/Qwen
Ultra-efficient MoE model with 128 experts and only 3.3B active parameters, ideal for cost-sensitive deployments.
Input $0.02/M | Output $0.04/M | Context 131K

Cohere
Cohere's compact multilingual model supporting 70+ languages. Runs on consumer devices, including phones. Outperforms Gemma 3 4B in 46 of 61 languages.
Input $0.01/M | Output $0.01/M | Context 32K

TII
Compact Falcon model for resource-constrained deployments with strong reasoning.
Input $0.04/M | Output $0.08/M | Context 32K

TII
Smallest Falcon model for edge inference and mobile deployment.
Input $0.02/M | Output $0.04/M | Context 32K

Allen AI
Fully open model with all components public: data, code, weights, and checkpoints. Instruct, Think, and RL Zero variants.
Input $0.25/M | Output $0.75/M | Context 128K

Allen AI
Outperforms Llama 3.1 8B. Everything released: training data, weights, code, recipes, and checkpoints.
Input $0.07/M | Output $0.14/M | Context 128K

Allen AI
Open multimodal model for visual understanding, image captioning, and visual question answering.
Input $0.40/M | Output $1.20/M | Context 128K

Zhipu AI
Vision-language MoE model with superior performance at lower inference cost.
Input $0.15/M | Output $0.30/M | Context 128K

Zhipu AI
Open-source video generation model creating 6-second clips at 720x480. Supports LoRA fine-tuning.
Input $0.05/M | Output $0.05/M

BAAI
State-of-the-art multimodal embedding model for visual search applications.
Input $0.02/M | Output $0.02/M | Context 8K

Lightricks
Lightricks' open-source video generation model capable of producing native 4K video at 50 frames per second, with clips up to 20 seconds long. LTX-2 includes native audio synthesis and offers full model weights under a permissive license, making it a leading choice for researchers and developers building custom video generation pipelines.
Input Free | Output Free

Alibaba/Qwen
Alibaba's open-source video generation model, which ranked first on the VBench video quality benchmark at release. With 14 billion parameters, Wan 2.1 demonstrates exceptional prompt adherence, temporal consistency, and visual quality across diverse content types, establishing a new baseline for open-weight video synthesis models.
Input Free | Output Free

Alibaba/Qwen
The successor to Wan 2.1, this open-source model introduces a Mixture-of-Experts flow-matching architecture with approximately 27 billion total parameters and 14 billion active during inference. Wan 2.2 delivers significantly improved motion quality, fine-grained detail, and extended generation lengths while keeping the weights fully open.
Input Free | Output Free

Tencent
Tencent's open-source video generation model with 8.3 billion parameters, featuring a novel Spatial-Temporal Self-Attention (SSTA) mechanism for improved temporal coherence. HunyuanVideo 1.5 supports diverse aspect ratios, variable frame rates, and extended clip durations, making it a versatile foundation model for the open-source video generation community.
Input Free | Output Free

Black Forest Labs
The open-weights development version of FLUX.2, with the same 32-billion-parameter architecture as the Pro variant, released for non-commercial research and experimentation. FLUX.2 Dev gives researchers full access to the model weights for fine-tuning, distillation, and architectural exploration while delivering near-Pro-level quality for academic and personal projects.
Input Free | Output Free

Stability AI
Stability AI's largest open-source image generation model, built on the Multimodal Diffusion Transformer (MMDiT) architecture. SD 3.5 Large delivers high-quality results across photorealistic and artistic styles with strong prompt adherence, accurate text rendering, and diverse composition capabilities, available under an open license for both research and commercial use.
Input $0.50/M | Output $6.50/M

MiniMax
World's first open-weight large-scale hybrid-attention reasoning model. Natively supports 1M-token context.
Input $0.30/M | Output $0.90/M | Context 1.0M

Tencent
One of the largest open-source MoE models. Supports text sequences up to 256K tokens.
Input $0.50/M | Output $1.50/M | Context 256K

Tencent
World's largest open-source text-to-image model, using an MoE architecture with 64 experts.
Input $0.03/M | Output $0.03/M

Shanghai AI Lab
State-of-the-art open multimodal LLM scoring 72.2 on MMMU, a new record among open MLLMs.
Input $0.40/M | Output $1.20/M | Context 128K

Shanghai AI Lab
Latest InternLM-series model. Efficient for research and application development.
Input $0.07/M | Output $0.14/M | Context 128K

Shanghai AI Lab
Advanced vision-language model with improved document and chart understanding capabilities.
Input $0.40/M | Output $1.20/M | Context 128K

01.AI
Mid-size Yi model with enhanced inference speed for extended prompts.
Input $0.10/M | Output $0.20/M | Context 128K

01.AI
Compact Yi model offering strong reasoning with minimal resource requirements.
Input $0.06/M | Output $0.12/M | Context 128K

01.AI
Vision-language Yi model for image understanding and visual question answering.
Input $0.30/M | Output $0.60/M | Context 16K

BigCode
Compact code model trained on 4T+ tokens and 600+ languages from The Stack v2.
Input $0.03/M | Output $0.06/M | Context 16K

BigCode
Mid-size code model matching CodeLlama 13B quality at half the parameters.
Input $0.07/M | Output $0.14/M | Context 16K

Stability AI
Lightweight language model for on-device inference and resource-constrained environments.
Input $0.02/M | Output $0.04/M | Context 4K

Stability AI
Mid-size Stable Diffusion optimized for consumer GPUs and edge devices.
Input $0.02/M | Output $0.02/M

Black Forest Labs
Fastest FLUX model, generating and editing images in under one second. Fully open under Apache 2.0.
Input $0.01/M | Output $0.01/M

Black Forest Labs
Fast open-source text-to-image model with 4-step generation. Apache 2.0 licensed.
Input $0.02/M | Output $0.02/M

Meta
Safety classification model for detecting unsafe content in LLM inputs and outputs.
Input $0.05/M | Output $0.05/M | Context 128K

Google
Smallest Gemma 2 model for efficient text processing on consumer hardware.
Input $0.02/M | Output $0.04/M | Context 8K

LG AI Research
Korean sovereign AI model using MoE with hybrid attention for reduced computation.
Input $0.25/M | Output $0.75/M | Context 128K

LG AI Research
Ultra-compact Korean AI model for on-device and mobile deployment.
Input $0.02/M | Output $0.04/M | Context 128K

Upstage
Agentic reasoning-focused model matching larger rivals. Strong multilingual capabilities.
Input $0.20/M | Output $0.60/M | Context 128K

BAAI
Most popular open embedding model. Multi-functionality, multi-linguality, and multi-granularity in one model.
Input $0.01/M | Output $0.01/M | Context 8K

OpenAI
Gold-standard speech recognition model supporting 99+ languages. 1.55B-parameter encoder-decoder architecture.
Input $0.0060/M | Output $0.0060/M

OpenAI
Speed-optimized Whisper variant with 6x faster inference at 809M parameters.
Input $0.0030/M | Output $0.0030/M

Google
Smallest Gemma 3 model for edge and mobile deployment. Text-only with 128K context.
Input $0.02/M | Output $0.02/M | Context 128K

Google
Open vision-language model for image captioning, visual QA, and OCR tasks. Built on a Gemma 2 backbone.
Input $0.30/M | Output $0.60/M | Context 8K

Google
Mid-size PaliGemma for efficient vision-language tasks. Strong OCR and document understanding.
Input $0.15/M | Output $0.30/M | Context 8K

Alibaba/Qwen
Dense model with hybrid thinking/non-thinking modes. Switches seamlessly between complex reasoning and general dialogue.
Input $0.20/M | Output $0.60/M | Context 128K

Alibaba/Qwen
Compact Qwen3 model with hybrid reasoning for edge deployment and resource-constrained environments.
Input $0.05/M | Output $0.15/M | Context 128K

Alibaba/Qwen
Lightweight Qwen3 model for on-device AI applications with reasoning capability.
Input $0.02/M | Output $0.06/M | Context 128K

Alibaba/Qwen
Smallest Qwen3 model, designed for ultra-lightweight deployment and edge inference.
Input $0.01/M | Output $0.03/M | Context 32K

Alibaba/Qwen
Compact vision-language model excelling at video and image analysis. Top small multimodal model on Hugging Face.
Input $0.10/M | Output $0.30/M | Context 128K

Alibaba/Qwen
Audio-language model for speech recognition, audio understanding, and music analysis.
Input $0.10/M | Output $0.30/M | Context 128K

Alibaba/Qwen
Smallest Qwen VL model for lightweight vision-language tasks on constrained hardware.
Input $0.04/M | Output $0.12/M | Context 128K

Alibaba/Qwen
Math-specialized model with step-by-step reasoning for complex mathematical problem solving.
Input $0.40/M | Output $1.20/M | Context 128K

DeepSeek
Vision-language model for image understanding, OCR, and visual reasoning tasks.
Input $0.14/M | Output $0.28/M | Context 128K

DeepSeek
Math-specialized model achieving gold-level scores in math competitions. Based on the V3.2 architecture.
Input $0.27/M | Output $1.10/M | Context 128K

DeepSeek
R1 reasoning distilled into a compact Qwen-based model. Exceptional at math and programming.
Input $0.07/M | Output $0.14/M | Context 128K

DeepSeek
R1 reasoning distilled into the Llama 3 architecture. Strong reasoning at minimal compute cost.
Input $0.07/M | Output $0.14/M | Context 128K

Mistral AI
Open-source reasoning model built on Small 3.1 with SFT and RL training. Efficient multilingual reasoning.
Input $0.20/M | Output $0.60/M | Context 128K

Mistral AI
Coding-specialized model outperforming Qwen 3 Coder Flash despite its smaller size.
Input $0.20/M | Output $0.60/M | Context 128K

Mistral AI
Smallest Mistral model for edge computing and extremely resource-constrained deployments.
Input $0.04/M | Output $0.10/M | Context 128K

Mistral AI
Code model using the Mamba SSM architecture for linear-time inference and theoretically unlimited context.
Input $0.10/M | Output $0.30/M | Context 256K

Microsoft
Enhanced reasoning model using 1.5x more tokens for higher accuracy on complex logical tasks.
Input $0.07/M | Output $0.14/M | Context 32K

NVIDIA
Hybrid Mamba-Transformer MoE with 4x higher throughput than its predecessor. Open weights and training data.
Input $0.04/M | Output $0.08/M | Context 1.0M

NVIDIA
Speed-optimized ASR model delivering 1000+ RTFx on the Open ASR Leaderboard. Exceptional accuracy.
Input $0.0040/M | Output $0.0040/M

Cohere
Multilingual model covering 23 languages for global enterprise deployment.
Input $0.05/M | Output $0.15/M | Context 128K

TII
Outperforms all models under 13B on the Hugging Face leaderboard. Trained on 14T tokens, with an innovative 1.58-bit quantized variant.
Input $0.10/M | Output $0.30/M | Context 32K

TII
Versatile 7B model with 30 checkpoint variants, including base, instruct, and quantized.
Input $0.07/M | Output $0.21/M | Context 32K

Nomic AI
First MoE embedding model. Trained on 1.6B pairs across ~100 languages with top-2 expert routing.
Input $0.01/M | Output $0.01/M | Context 8K

Jina AI
Universal multimodal embedding handling text, images, and documents in 30+ languages.
Input $0.02/M | Output $0.02/M | Context 8K

Genmo
High-performance open text-to-video model excelling in text consistency.
Input $0.05/M | Output $0.05/M

IBM
Enterprise-grade model with strong instruction following for business applications.
Input $0.10/M | Output $0.20/M | Context 128K

IBM
Compact enterprise model for edge deployment and lightweight business tasks.
Input $0.03/M | Output $0.06/M | Context 128K

IBM
Updated Granite with enhanced coding and tool-use capabilities for enterprise automation.
Input $0.10/M | Output $0.20/M | Context 128K

IBM
Small enterprise model with coding support for lightweight automation workflows.
Input $0.03/M | Output $0.06/M | Context 128K

OpenBMB
Efficient vision-language model rivaling GPT-4V quality at a fraction of the size.
Input $0.10/M | Output $0.20/M | Context 128K

Hugging Face
Compact LLM designed for on-device AI. Surprisingly capable for its tiny size.
Input $0.01/M | Output $0.02/M | Context 8K

Hugging Face
Tiny but functional language model for extreme resource constraints and research.
Input $0.0050/M | Output $0.01/M | Context 8K

Mistral AI
Mid-size Mistral model bridging the gap between 8B edge models and large frontier offerings.
Input $0.15/M | Output $0.45/M | Context 128K

Alibaba/Qwen
Compact math-specialized model with chain-of-thought reasoning for mathematical problem solving.
Input $0.07/M | Output $0.14/M | Context 128K

Microsoft
Lightweight multimodal model with vision capabilities for on-device and edge visual understanding.
Input $0.05/M | Output $0.10/M | Context 128K

NVIDIA
NVIDIA-tuned Llama 3.1 with reward-model-guided alignment. Excels at instruction following and helpful responses.
Input $0.35/M | Output $1.05/M | Context 128K

Cohere
Cohere's smallest Command model optimized for RAG, tool use, and multilingual enterprise applications.
Input $0.04/M | Output $0.08/M | Context 128K

Cohere
Cohere's open multimodal model for visual understanding across 23 languages. Strong image captioning and visual QA.
Input $0.25/M | Output $0.50/M | Context 128K

Meta
Meta's largest open-source model. Competitive with frontier closed-source models.
Input $0.80/M | Output $0.80/M | Context 128K

Alibaba/Qwen
Alibaba's efficient code-focused MoE model. 80B total params, 3B active, Apache 2.0 licensed.
Input $0.12/M | Output $0.75/M | Context 256K

Google
Google's open-source multimodal model. Strong performance for its size, with vision capabilities.
Input $0.10/M | Output $0.10/M | Context 128K

Google
Efficient open-source model from Google with multimodal capabilities at 12B parameters.
Input $0.05/M | Output $0.05/M | Context 128K

Google
Ultra-efficient open-source model from Google. Runs on mobile and edge devices.
Input $0.02/M | Output $0.02/M | Context 128K

Google
Google's previous-gen open-source model with strong general capabilities.
Input $0.07/M | Output $0.07/M | Context 8K

Google
Efficient open-source model from Google. Great performance-to-size ratio.
Input $0.03/M | Output $0.03/M | Context 8K

Google
Google's open-source code-focused model based on the Gemma architecture.
Input $0.03/M | Output $0.03/M | Context 8K

Meta
Meta's largest multimodal Llama model with image understanding capabilities.
Input $0.35/M | Output $0.40/M | Context 128K

Meta
Efficient multimodal Llama model for image + text tasks at 11B parameters.
Input $0.06/M | Output $0.06/M | Context 128K
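
Because input and output rates usually differ, comparing models on price requires fixing a workload mix first. A small sketch, using four text-model entries from this catalog (identified by provider only, since that is how the listing above is keyed; rates as listed) and an assumed 3:1 input-to-output split per million tokens:

```python
# (provider, $/M input, $/M output) — rates taken from the listing above.
catalog = [
    ("DeepSeek",     0.27, 1.10),
    ("Alibaba/Qwen", 0.30, 0.60),
    ("Meta",         0.15, 0.60),
    ("Mistral AI",   0.04, 0.04),
]

def blended_cost(entry, input_tokens=750_000, output_tokens=250_000):
    """Dollar cost of a 1M-token workload split 3:1 input:output."""
    _, in_rate, out_rate = entry
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Cheapest first for this workload mix.
for provider, *_ in sorted(catalog, key=blended_cost):
    print(provider)
```

Changing the assumed mix changes the ranking: a chat workload that generates far more than it reads will favor models with cheap output tokens, so rerun the comparison with your own ratio before choosing.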