AnthropicClaude 4Arena #53May 22, 2025
Claude Opus 4
Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in software engineering, achieving leading results on SWE-bench (72.5%) and Terminal-bench (43.2%). Opus 4 supports extended, agentic workflows, handling thousands of task steps continuously for hours without degradation. Read more at the blog post here
Context Window
200K
tokens
Max Output
32K
tokens
Released
May 22, 2025
Arena Rank
#53
of 305 models
Output Speed
68
tokens/sec
Time to First Token
1.8s
TTFT
Capabilities
👁Vision
🧠Reasoning
🔧Tool Calling
⚡Prompt Caching
🖥Computer Use
🎨Image Generation
Supported Parameters
Include Reasoning
Show reasoning tokens
Max Tokens
Output length limit
Reasoning
Extended thinking
Stop Sequences
Custom stop tokens
Temperature
Controls randomness
Tool Choice
Control tool usage
Tool Calling
Function calling support
Top K
Top-K sampling
Top P
Nucleus sampling
Pricing Comparison
| Router | Input / 1M | Output / 1M | Cached Input / 1M |
|---|---|---|---|
| Requesty★ | $15.00 | $75.00 | $1.50 |
| OpenRouter | $15.00 | $75.00 | $1.50 |
| Vercel AI | $15.00 | $75.00 | — |
| Martian | $15.00 | $75.00 | $1.50 |
Benchmarks
Artificial Analysis
Intelligence IndexArtificial Analysis
72/100Coding IndexArtificial Analysis
68/100Math IndexArtificial Analysis
80/100MMLU-PROArtificial Analysis
0.795/1GPQA DiamondArtificial Analysis
0.715/1MATH-500Artificial Analysis
0.905/1AIME 2024Artificial Analysis
0.55/1Humanity's Last ExamArtificial Analysis
0.112/1LiveCodeBenchArtificial Analysis
0.685/1SciCodeArtificial Analysis
0.395/1Model IDs
Requesty
bedrock/claude-opus-4@us-east-1OpenRouter
anthropic/claude-opus-4Tags
visionreasoningtool-callingcachingcomputer-use
Available Regions
EUUS