LLM Router
LLM Router

Independent comparison platform for LLM routing infrastructure.

Data from public sources. May not reflect real-time pricing.


MiniMax


MiniMax is a Chinese AI company developing large language models and multimodal AI systems. Their models offer strong performance with competitive pricing, making advanced AI accessible for diverse applications.

Pricing available from Requesty, OpenRouter, Vercel AI, Martian, DeepInfra.

  • Total Models: 7
  • Arena Ranked: 3 of 7
  • Open Source: 7 of 7
  • Cheapest Input: $0.20 per 1M tokens

$ Pricing Summary (per 1M tokens)

Metric           Input    Output
Cheapest         $0.20    $0.95
Average          $0.30    $1.31
Most Expensive   $0.40    $2.40
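Per-1M-token rates translate to request cost by simple proportion. A minimal sketch, where the helper name and token counts are illustrative and the rates are the cheapest MiniMax rates from the summary above:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Return request cost in USD; rates are USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Cheapest MiniMax rates above: $0.20 input / $0.95 output.
# A 50K-token prompt with a 4K-token completion:
print(f"${estimate_cost(50_000, 4_000, 0.20, 0.95):.4f}")  # $0.0138
```

Note that output tokens usually dominate cost at these rates: they are priced roughly 5x the input tokens here.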

⚙ Capabilities

  • 👁 Vision: 2 of 7 models
  • 🧠 Reasoning: 4 of 7 models
  • 🔧 Tool Calling: 5 of 7 models
  • ⚡ Prompt Caching: 3 of 7 models
  • 🖥 Computer Use: 0 of 7 models
  • 🎨 Image Generation: 0 of 7 models

🤖 All MiniMax Models (7)

MiniMax · OSS · #82

MiniMax: MiniMax M2.1

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency. Compared to its predecessor, M2.1 delivers cleaner, more concise outputs and faster perceived response times. It shows leading multilingual coding performance across major systems and application languages, achieving 49.4% on Multi-SWE-Bench and 72.5% on SWE-Bench Multilingual, and serves as a versatile agent “brain” for IDEs, coding tools, and general-purpose assistance. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in OpenRouter's [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).

Context: 197K · Max Output: — · Input/1M: $0.27
Capabilities: 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens):
  • OpenRouter: $0.27 / $0.95
  • Vercel AI: $0.30 / $1.20
  • Martian: $0.27 / $0.95
  • DeepInfra: $0.27 / $0.95
Released: 2025-12-23
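The "preserving reasoning between turns" guidance above can be sketched as a small helper. This is a hypothetical sketch assuming an OpenRouter-style message format in which assistant responses carry a `reasoning_details` field; see the linked docs for the authoritative shape, and note the helper name is my own:

```python
def with_preserved_reasoning(history: list[dict], assistant_msg: dict,
                             next_user_prompt: str) -> list[dict]:
    """Append the assistant turn, keeping reasoning_details so the model
    retains its prior reasoning on the next request, then add the next user turn."""
    turn = {"role": "assistant", "content": assistant_msg["content"]}
    if assistant_msg.get("reasoning_details") is not None:
        # Echo the reasoning blocks back verbatim rather than dropping them.
        turn["reasoning_details"] = assistant_msg["reasoning_details"]
    return history + [turn, {"role": "user", "content": next_user_prompt}]

# Example: carry one assistant turn (with its reasoning) into the next request.
history = [{"role": "user", "content": "Refactor this loop."}]
assistant_msg = {"content": "Done.", "reasoning_details": [{"type": "reasoning.text", "text": "..."}]}
history = with_preserved_reasoning(history, assistant_msg, "Now add type hints.")
```

The point is simply that the reasoning blocks are replayed in the request payload instead of being discarded between turns.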
MiniMax · OSS · #97

MiniMax: MiniMax M1

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (MoE) architecture paired with a custom "lightning attention" mechanism, allowing it to process long sequences—up to 1 million tokens—while maintaining competitive FLOP efficiency. With 456 billion total parameters and 45.9B active per token, this variant is optimized for complex, multi-step reasoning tasks. Trained via a custom reinforcement learning pipeline (CISPO), M1 excels in long-context understanding, software engineering, agentic tool use, and mathematical reasoning. Benchmarks show strong performance across FullStackBench, SWE-bench, MATH, GPQA, and TAU-Bench, often outperforming other open models like DeepSeek R1 and Qwen3-235B.

Context: 1.0M · Max Output: 40K · Input/1M: $0.40
Capabilities: 🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens):
  • OpenRouter: $0.40 / $2.20
  • Martian: $0.40 / $2.20
Released: 2025-06-17
MiniMax · OSS · #118

MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning, tool use, and multi-step task execution while maintaining low latency and deployment efficiency. The model excels in code generation, multi-file editing, compile-run-fix loops, and test-validated repair, showing strong results on SWE-Bench Verified, Multi-SWE-Bench, and Terminal-Bench. It also performs competitively in agentic evaluations such as BrowseComp and GAIA, effectively handling long-horizon planning, retrieval, and recovery from execution errors. Benchmarked by [Artificial Analysis](https://artificialanalysis.ai/models/minimax-m2), MiniMax-M2 ranks among the top open-source models for composite intelligence, spanning mathematics, science, and instruction-following. Its small activation footprint enables fast inference, high concurrency, and improved unit economics, making it well-suited for large-scale agents, developer assistants, and reasoning-driven applications that require responsiveness and cost efficiency. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in OpenRouter's [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).

Context: 200K · Max Output: 128K · Input/1M: $0.26
Capabilities: 🔧 Tools
Pricing (per 1M tokens):
  • Requesty ★: $0.30 / $1.20
  • OpenRouter: $0.26 / $1.00
  • Vercel AI: $0.30 / $1.20
  • Martian: $0.26 / $1.00
Released: 2025-10-23
MiniMax · OSS

MiniMax M2.5

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds on the coding expertise of M2.1 and extends into general office work: it is fluent in generating and operating Word, Excel, and PowerPoint files, switching context between diverse software environments, and working across different agent and human teams. Scoring 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp, M2.5 is also more token-efficient than previous generations, having been trained to optimize its actions and output through planning.

Context: 205K · Max Output: 131K · Input/1M: $0.30
Capabilities: 👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens):
  • Requesty ★: $0.30 / $1.20
  • OpenRouter: $0.30 / $1.20
  • Vercel AI: $0.30 / $1.20
Released: 2026-02-12
MiniMax · OSS

MiniMax: MiniMax M2-her

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message roles (user_system, group, sample_message_user, sample_message_ai) and can learn from example dialogue to better match the style and pacing of your scenario. That makes it a strong choice for storytelling, companions, and conversational experiences where natural flow and vivid interaction matter most.

Context: 66K · Max Output: 2K · Input/1M: $0.30
Capabilities: ⚡ Cache
Pricing (per 1M tokens):
  • OpenRouter: $0.30 / $1.20
  • Martian: $0.30 / $1.20
Released: 2026-01-23
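The role names listed in the M2-her description suggest a conversation payload along these lines. A hedged sketch: only the role names come from the description above, and the exact payload shape is an assumption based on a chat-completions-style API.

```python
# Hypothetical M2-her message list: persona setup, style exemplars, then live chat.
messages = [
    # Persona / scenario setup:
    {"role": "user_system", "content": "You are Captain Mira, a gruff starship pilot."},
    # Example dialogue the model can mimic for style and pacing:
    {"role": "sample_message_user", "content": "Can we outrun them?"},
    {"role": "sample_message_ai", "content": "*grips the throttle* Watch me."},
    # The live conversation:
    {"role": "user", "content": "Mira, the engines are failing!"},
]
roles = [m["role"] for m in messages]
```

The sample_message_* pair is what distinguishes this from an ordinary system-prompt setup: the model is given concrete exemplar turns to imitate rather than a prose style description alone.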
MiniMax · OSS

MiniMax M2.1 Lightning

MiniMax-M2.1-lightning is a faster variant of MiniMax-M2.1, offering the same output quality at significantly higher throughput (~100 TPS output speed, versus ~60 TPS for MiniMax-M2).

Context: 205K · Max Output: 131K · Input/1M: $0.30
Capabilities: 🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens):
  • Vercel AI: $0.30 / $2.40
Released: 2025-10-27
MiniMax · OSS

MiniMax: MiniMax-01

MiniMax-01 combines MiniMax-Text-01 for text generation with MiniMax-VL-01 for image understanding. It has 456 billion parameters, with 45.9 billion activated per inference, and can handle a context of up to 4 million tokens. The text model adopts a hybrid architecture that combines Lightning Attention, Softmax Attention, and Mixture-of-Experts (MoE). The image model adopts the “ViT-MLP-LLM” framework and is trained on top of the text model. To read more about the release, see: https://www.minimaxi.com/en/news/minimax-01-series-2

Context: 1.0M · Max Output: 1.0M · Input/1M: $0.20
Capabilities: 👁 Vision
Pricing (per 1M tokens):
  • OpenRouter: $0.20 / $1.10
Released: 2025-01-15