Explore 66+ AI models from 49 providers. Filter by capability, tier, and pricing to find the right model.
66 results
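All prices on this page are quoted per million tokens, so "Input $0.15/M" means $0.15 per million input tokens. A minimal sketch of turning these rates into a per-request cost estimate (the function name and token counts are illustrative, not part of any listed API):

```python
def request_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Estimate the dollar cost of one request given per-million-token rates."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Example: 10K input + 2K output tokens at $0.15/M in, $0.60/M out
cost = request_cost(10_000, 2_000, 0.15, 0.60)
# → 0.0015 + 0.0012 = $0.0027
```

Output tokens are typically billed at a higher rate than input tokens, so generation-heavy workloads should weight the output column more when comparing models.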
OpenAI
A smaller, faster, and more affordable version of GPT-4o. Great for lightweight tasks.
Input $0.15/M · Output $0.60/M · Context 128K

Anthropic
Anthropic's fastest and most affordable model. Great for high-volume, low-latency tasks.
Input $0.80/M · Output $4.00/M · Context 200K

Mistral AI
Mistral's efficient model for everyday tasks. Fast and cost-effective.
Input $0.10/M · Output $0.30/M · Context 32K

OpenAI
A fast, affordable variant of GPT-4.1 for high-volume workloads.
Input $0.40/M · Output $1.60/M · Context 1.0M

OpenAI
OpenAI's fastest and cheapest model. Ideal for classification, autocomplete, and high-throughput tasks.
Input $0.10/M · Output $0.40/M · Context 1.0M
Google
Efficient open-source model from Google with multimodal capabilities at 12B parameters.
Input $0.05/M · Output $0.05/M · Context 128K

Google
Ultra-efficient open-source model from Google. Runs on mobile and edge devices.
Input $0.02/M · Output $0.02/M · Context 128K

Google
Efficient open-source model from Google. Great performance-to-size ratio.
Input $0.03/M · Output $0.03/M · Context 8K

Google
Google's open-source code-focused model based on the Gemma architecture.
Input $0.03/M · Output $0.03/M · Context 8K
Meta
Efficient multimodal Llama model for image + text tasks at 11B parameters.
Input $0.06/M · Output $0.06/M · Context 128K

Meta
Ultra-lightweight Llama model for edge deployment and mobile applications.
Input $0.01/M · Output $0.01/M · Context 128K

Meta
The smallest Llama model for on-device inference and constrained environments.
Input $0.01/M · Output $0.01/M · Context 128K

Meta
Meta's efficient open-source base model. Excellent for fine-tuning and custom deployments.
Input $0.05/M · Output $0.05/M · Context 128K
Mistral AI
Mistral's 12B open-source model co-developed with NVIDIA. Replaces Mistral 7B.
Input $0.04/M · Output $0.04/M · Context 128K

Mistral AI
Mistral's open-source multimodal model. Processes images natively alongside text.
Input $0.10/M · Output $0.10/M · Context 128K
Mistral AI
The open-source MoE model that popularized the sparse mixture-of-experts approach. Fast and efficient.
Input $0.24/M · Output $0.24/M · Context 32K
Mistral AI
Mistral's edge-optimized model with a knowledge-dense 8B parameter design.
Input $0.10/M · Output $0.10/M · Context 128K

Mistral AI
The model that launched Mistral. Open-source, fast, and surprisingly capable for 7B.
Input $0.06/M · Output $0.06/M · Context 32K
Alibaba/Qwen
Efficient open-source model balancing capability and speed.
Input $0.05/M · Output $0.05/M · Context 128K

Alibaba/Qwen
Compact open-source model for edge deployment and fine-tuning.
Input $0.03/M · Output $0.03/M · Context 32K

Alibaba/Qwen
Compact open-source coding model with impressive code generation capabilities.
Input $0.03/M · Output $0.03/M · Context 128K
Microsoft
Microsoft's 14B open-source model whose training innovations let it punch above its weight class.
Input $0.04/M · Output $0.04/M · Context 16K
Microsoft
Microsoft's compact open-source model with 128K context. Great for on-device inference.
Input $0.01/M · Output $0.01/M · Context 128K

Microsoft
Microsoft's 14B open-source model with 128K context and strong reasoning capabilities.
Input $0.04/M · Output $0.04/M · Context 128K
AI21 Labs
Compact version of Jamba with hybrid SSM-Transformer architecture.
Input $0.20/M · Output $0.40/M · Context 256K

TII
TII's efficient open-source model with multimodal capabilities.
Input $0.04/M · Output $0.04/M · Context 8K

Stability AI
Stability AI's open-source language model with multilingual support.
Input $0.04/M · Output $0.04/M · Context 4K

Allen AI
Fully open-source model from Allen AI with open training data, code, and weights.
Input $0.04/M · Output $0.04/M · Context 4K

BigCode
Open-source code model from BigCode/HuggingFace trained on The Stack v2.
Input $0.04/M · Output $0.04/M · Context 16K
Google
Google's ultra-efficient model offering better performance than Gemini 1.5 Flash at the same cost point.
Input $0.07/M · Output $0.30/M · Context 1.0M
Mistral AI
Compact 24B model with image understanding, 128K context, and Apache 2.0 license.
Input $0.10/M · Output $0.30/M · Context 128K

Microsoft
Microsoft's 3.8B parameter model with 128K context and strong reasoning capability for on-device deployment.
Input $0.01/M · Output $0.01/M · Context 128K

Microsoft
Microsoft's 5.6B compact model unifying text, vision, and speech in a single architecture.
Input $0.02/M · Output $0.02/M · Context 128K
Alibaba/Qwen
Compact 8B model from the Qwen3 family with thinking mode support and strong efficiency for on-device use.
Input $0.03/M · Output $0.06/M · Context 131K

Alibaba/Qwen
Ultra-efficient MoE model with 128 experts and only 3.3B active parameters, ideal for cost-sensitive deployments.
Input $0.02/M · Output $0.04/M · Context 131K

Cohere
Cohere's compact multilingual model supporting 70+ languages. Runs on consumer devices including phones. Outperforms Gemma3-4B in 46/61 languages.
Input $0.01/M · Output $0.01/M · Context 32K
TII
Compact Falcon model for resource-constrained deployments with strong reasoning.
Input $0.04/M · Output $0.08/M · Context 32K

TII
Smallest Falcon model for edge inference and mobile deployment.
Input $0.02/M · Output $0.04/M · Context 32K

Allen AI
Outperforms Llama 3.1 8B. Everything released: training data, weights, code, recipes, and checkpoints.
Input $0.07/M · Output $0.14/M · Context 128K
OpenAI
A cost-efficient variant of OpenAI's image generation model offering 54-70% lower pricing while retaining strong prompt adherence and visual quality for standard use cases. GPT Image 1 Mini is optimized for high-volume applications such as e-commerce product imagery, social media content, and rapid prototyping where speed and cost matter more than maximum fidelity.
Input $2.50/M · Output $8.00/M
Shanghai AI Lab
Latest InternLM series model. Efficient for research and application development.
Input $0.07/M · Output $0.14/M · Context 128K

01.AI
Compact Yi model offering strong reasoning at minimal resource requirements.
Input $0.06/M · Output $0.12/M · Context 128K

BigCode
Compact code model trained on 4T+ tokens and 600+ languages from The Stack v2.
Input $0.03/M · Output $0.06/M · Context 16K

Stability AI
Lightweight language model for on-device inference and resource-constrained environments.
Input $0.02/M · Output $0.04/M · Context 4K
Black Forest Labs
Fastest FLUX model generating and editing images in under one second. Fully open under Apache 2.0.
Input $0.01/M · Output $0.01/M

Google
Smallest Gemma 2 model for efficient text processing on consumer hardware.
Input $0.02/M · Output $0.04/M · Context 8K
LG AI Research
Ultra-compact Korean AI model for on-device and mobile deployment.
Input $0.02/M · Output $0.04/M · Context 128K

OpenAI
Speed-optimized Whisper variant with 6x faster inference at 809M parameters.
Input $0.0030/M · Output $0.0030/M
Anthropic
Fastest and most cost-efficient Claude model designed for high-throughput, low-latency applications.
Input $0.80/M · Output $4.00/M · Context 200K

Google
Smallest Gemma 3 model for edge and mobile deployment. Text-only with 128K context.
Input $0.02/M · Output $0.02/M · Context 128K
Alibaba/Qwen
Compact Qwen3 model with hybrid reasoning for edge deployment and resource-constrained environments.
Input $0.05/M · Output $0.15/M · Context 128K

Alibaba/Qwen
Lightweight Qwen3 model for on-device AI applications with reasoning capability.
Input $0.02/M · Output $0.06/M · Context 128K

Alibaba/Qwen
Smallest Qwen3 model designed for ultra-lightweight deployment and edge inference.
Input $0.01/M · Output $0.03/M · Context 32K

Alibaba/Qwen
Smallest Qwen VL model for lightweight vision-language tasks on constrained hardware.
Input $0.04/M · Output $0.12/M · Context 128K
DeepSeek
R1 reasoning distilled into a compact Qwen-based model. Exceptional at math and programming.
Input $0.07/M · Output $0.14/M · Context 128K

DeepSeek
R1 reasoning distilled into the Llama 3 architecture. Strong reasoning at minimal compute cost.
Input $0.07/M · Output $0.14/M · Context 128K
Mistral AI
Smallest Mistral model for edge computing and extremely resource-constrained deployments.
Input $0.04/M · Output $0.10/M · Context 128K

Microsoft
Enhanced reasoning model using 1.5x more tokens for higher accuracy on complex logical tasks.
Input $0.07/M · Output $0.14/M · Context 32K
NVIDIA
Hybrid Mamba-Transformer MoE with 4x higher throughput than its predecessor. Open weights and training data.
Input $0.04/M · Output $0.08/M · Context 1.0M

NVIDIA
Speed-optimized ASR model delivering 1000+ RTFx on the Open ASR Leaderboard. Exceptional accuracy.
Input $0.0040/M · Output $0.0040/M
Apple
On-device model optimized for Apple silicon with 2-bit quantization. Powers Siri and Apple Intelligence.
Input Free · Output Free · Context 4K
IBM
Compact enterprise model for edge deployment and lightweight business tasks.
Input $0.03/M · Output $0.06/M · Context 128K

IBM
Small enterprise model with coding support for lightweight automation workflows.
Input $0.03/M · Output $0.06/M · Context 128K

Hugging Face
Compact LLM designed for on-device AI. Surprisingly capable for its tiny size.
Input $0.01/M · Output $0.02/M · Context 8K

Hugging Face
Tiny but functional language model for extreme resource constraints and research.
Input $0.0050/M · Output $0.01/M · Context 8K

Cohere
Cohere's smallest Command model optimized for RAG, tool use, and multilingual enterprise applications.
Input $0.04/M · Output $0.08/M · Context 128K
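The catalog above can be filtered by price and context window, as the page's filters do. A minimal sketch of that kind of filtering over entries like the ones listed here (the field names and the three sample rows are illustrative, with rates in dollars per million tokens):

```python
# Sample rows mirroring the card format on this page: provider,
# input/output rates ($ per million tokens), and context window in K tokens.
models = [
    {"provider": "OpenAI",     "input": 0.15, "output": 0.60, "context_k": 128},
    {"provider": "Mistral AI", "input": 0.10, "output": 0.30, "context_k": 32},
    {"provider": "Meta",       "input": 0.01, "output": 0.01, "context_k": 128},
]

def cheap_long_context(models, max_output_rate, min_context_k):
    """Keep models at or under an output rate with at least the given context window."""
    return [m for m in models
            if m["output"] <= max_output_rate and m["context_k"] >= min_context_k]

picks = cheap_long_context(models, max_output_rate=0.60, min_context_k=128)
# → the OpenAI and Meta entries; Mistral AI is excluded by its 32K context
```

Filtering on output rate first tends to be the stricter cut for generation-heavy workloads, since output tokens are usually the more expensive side of the bill.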