OtherArena #167Jun 26, 2025

Inception: Mercury

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed optimized models like GPT-4.1 Nano and Claude 3.5 Haiku while matching their performance. Mercury's speed enables developers to provide responsive user experiences, including with voice agents, search interfaces, and chatbots. Read more in the [blog post] (https://www.inceptionlabs.ai/blog/introducing-mercury) here.

Context Window
128K
tokens
Max Output
16K
tokens
Released
Jun 26, 2025
Arena Rank
#167
of 305 models

Capabilities

👁Vision
🧠Reasoning
🔧Tool Calling
Prompt Caching
🖥Computer Use
🎨Image Generation

Supported Parameters

Frequency Penalty
Reduce repetition
Max Tokens
Output length limit
Presence Penalty
Encourage new topics
Response Format
JSON mode / structured output
Stop Sequences
Custom stop tokens
structured_outputs
Temperature
Controls randomness
Tool Choice
Control tool usage
Tool Calling
Function calling support
Top K
Top-K sampling
Top P
Nucleus sampling

Pricing Comparison

RouterInput / 1MOutput / 1MCached Input / 1M
OpenRouter$0.25$1.00
Martian$0.25$1.00

Model IDs

OpenRouterinception/mercury

Tags

tool-calling
Compare with another model

Related Models