OpenAIo3Arena #115Jan 31, 2025

o3 Mini

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. This model supports the `reasoning_effort` parameter, which can be set to "high", "medium", or "low" to control the thinking time of the model. The default is "medium". OpenRouter also offers the model slug `openai/o3-mini-high` to default the parameter to "high". The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities. The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost.

Context Window
200K
tokens
Max Output
100K
tokens
Released
Jan 31, 2025
Arena Rank
#115
of 305 models
Output Speed
154
tokens/sec
Time to First Token
14.9s
TTFT

Capabilities

👁Vision
🧠Reasoning
🔧Tool Calling
Prompt Caching
🖥Computer Use
🎨Image Generation

Supported Parameters

Max Tokens
Output length limit
Response Format
JSON mode / structured output
Seed
Deterministic outputs
structured_outputs
Tool Choice
Control tool usage
Tool Calling
Function calling support

Pricing Comparison

RouterInput / 1MOutput / 1MCached Input / 1M
Requesty$1.10$4.40$0.55
OpenRouter$1.10$4.40$0.55
Vercel AI$1.10$4.40
Martian$1.10$4.40$0.55

Benchmarks

Artificial Analysis
Intelligence IndexArtificial Analysis
62.9/100
Coding IndexArtificial Analysis
55.8/100
Math IndexArtificial Analysis
87.2/100
MMLU-PROArtificial Analysis
0.791/1
GPQA DiamondArtificial Analysis
0.748/1
MATH-500Artificial Analysis
0.973/1
AIME 2024Artificial Analysis
0.77/1
Humanity's Last ExamArtificial Analysis
0.087/1
LiveCodeBenchArtificial Analysis
0.717/1
SciCodeArtificial Analysis
0.399/1

Model IDs

Requestyopenai/o3-mini:low
OpenRouteropenai/o3-mini

Tags

reasoningtool-callingcaching
Compare with another model

Related Models