Model Hub 24 models

Fast AI Models

Explore faster AI models for quick drafts, lightweight coding help, everyday chat, and high-volume workflows.

Search intent

Users who care about speed, latency, and lightweight everyday work.

Tool

AI Email Writer

Use fast models for short replies and quick edits.

Bot

AI Meeting Summarizer Bot

Use faster models for quick transcript cleanup.

Comparison

GPT-5 vs Gemini

Compare model behavior for speed-sensitive work.

Recommended Models

All Models

Gemini 2.5 Pro Preview 06-05

Google

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

textimagefile 1049K ctx

View profile

Gemini 2.5 Flash

Google

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...

textimagefile 1049K ctx

View profile

Gemini 3.5 Flash

Google

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...

textimagefile 1049K ctx

View profile

GPT-5 Nano

OpenAI

Free

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger...

textimagefile 400K ctx

View profile

GPT-5.4 Mini

OpenAI

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding,...

textimagefile 400K ctx

View profile

Gemini 2.5 Flash Lite

Google

Free

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

textimagefile 1049K ctx

View profile

Gemini 3 Flash Preview

Google

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro level reasoning and tool...

textimagefile 1049K ctx

View profile

Gemini 3.1 Pro Preview

Google

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

textimagefile 1049K ctx

View profile

Claude 3.5 Haiku

Anthropic

Claude 3.5 Haiku features offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic...

textimage 200K ctx

View profile

Claude Opus 4.6 (Fast)

Anthropic

Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

textimagefile 1000K ctx

View profile

Claude Opus 4.7 (Fast)

Anthropic

Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

textimagefile 1000K ctx

View profile

Claude Haiku 4.5

Anthropic

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s performance...

textimagefile 200K ctx

View profile

Gemini 3.1 Flash Lite

Google

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...

textimagefile 1049K ctx

View profile

Gemini 3.1 Flash Lite Preview

Google

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across...

textimagefile 1049K ctx

View profile

Gemini 3.1 Pro Preview Custom Tools

Google

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party...

textimagefile 1049K ctx

View profile

GPT-5 Image Mini

OpenAI

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini](https://openrouter.ai/openai/gpt-5-mini), with GPT Image 1 Mini for efficient image generation. This natively multimodal model features superior instruction following, text...

textimagefile 400K ctx

View profile

GPT-5 Mini

OpenAI

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost....

textimagefile 400K ctx

View profile

GPT-5.4 Nano

OpenAI

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs and is designed for low-latency...

textimagefile 400K ctx

View profile

DeepSeek V4 Flash

DeepSeek

Free

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

text 1049K ctx

View profile

o4 Mini Deep Research

OpenAI

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.

textimagefile 200K ctx

View profile

Qwen3.5-Flash

Alibaba

Free

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

textimagevideo 1000K ctx

View profile

GPT-4o-mini

OpenAI

Free

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...

textimagefile 128K ctx

View profile

GPT Audio Mini

OpenAI

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million...

textaudio 128K ctx

View profile

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Google

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines...

textimage 131K ctx

View profile