OtherArena #167Jun 26, 2025
Inception: Mercury
Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed optimized models like GPT-4.1 Nano and Claude 3.5 Haiku while matching their performance. Mercury's speed enables developers to provide responsive user experiences, including with voice agents, search interfaces, and chatbots. Read more in the [blog post] (https://www.inceptionlabs.ai/blog/introducing-mercury) here.
Context Window
128K
tokens
Max Output
16K
tokens
Released
Jun 26, 2025
Arena Rank
#167
of 305 models
Capabilities
👁Vision
🧠Reasoning
🔧Tool Calling
⚡Prompt Caching
🖥Computer Use
🎨Image Generation
Supported Parameters
Frequency Penalty
Reduce repetition
Max Tokens
Output length limit
Presence Penalty
Encourage new topics
Response Format
JSON mode / structured output
Stop Sequences
Custom stop tokens
structured_outputs
Temperature
Controls randomness
Tool Choice
Control tool usage
Tool Calling
Function calling support
Top K
Top-K sampling
Top P
Nucleus sampling
Pricing Comparison
| Router | Input / 1M | Output / 1M | Cached Input / 1M |
|---|---|---|---|
| OpenRouter | $0.25 | $1.00 | — |
| Martian | $0.25 | $1.00 | — |
Model IDs
OpenRouter
inception/mercuryTags
tool-calling
Related Models
Other#66
Meituan: LongCat Flash Chat
131K ctxFree/1M in
Other#72
Xiaomi: MiMo-V2-Flash
262K ctx$0.09/1M in
Other#106
Prime Intellect: INTELLECT-3
131K ctx$0.20/1M in
OpenAI#108
OpenAI: gpt-oss-120b (free)
131K ctxFree/1M in
Other#139
AllenAI: Olmo 3.1 32B Instruct
66K ctx$0.20/1M in
OpenAI#159
OpenAI: gpt-oss-20b (free)
131K ctxFree/1M in