GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger...

text, image, file | 400K ctx

Gemini 2.5 Flash Lite

Google

Free

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

text, image, file | 1049K ctx

Qwen3.5-9B

Alibaba

Free

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...

text, image, video | 262K ctx

Gemma 4 31B

Google

Free

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. It features a 256K-token context window, configurable thinking/reasoning mode, native function...

text, image, video | 262K ctx

DeepSeek V4 Flash

DeepSeek

Free

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

text | 1049K ctx

Gemma 3 27B

Google

Free

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities,...

text, image | 131K ctx

Qwen3.5-Flash

Alibaba

Free

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

text, image, video | 1000K ctx

Gemma 3 4B

Google

Free

text, image | 131K ctx

Gemma 4 26B A4B

Google

Free

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at...

text, image, video | 262K ctx

GPT-4o-mini

OpenAI

Free

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...

text, image, file | 128K ctx

GPT-4.1 Nano

OpenAI

Free

For tasks that demand low latency, GPT-4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million...

text, image, file | 1048K ctx

Gemma 3n 4B

Google

Free

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks...

text | 33K ctx

GLM 4.7 Flash

Z-ai

Free

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...

text | 203K ctx

Llama 4 Scout

Voxtral Small 24B 2507

Mistral AI

Free

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio...

text, file, audio | 32K ctx
