OpenAIGPT-4oAug 15, 2025
OpenAI: GPT-4o Audio
The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input and $80 per million output audio tokens.
Context Window
128K
tokens
Max Output
16K
tokens
Released
Aug 15, 2025
Arena Rank
—
Output Speed
92
tokens/sec
Time to First Token
420ms
TTFT
Capabilities
👁Vision
🧠Reasoning
🔧Tool Calling
⚡Prompt Caching
🖥Computer Use
🎨Image Generation
Supported Parameters
Frequency Penalty
Reduce repetition
Logit Bias
Adjust token weights
Log Probs
Token probabilities
Max Tokens
Output length limit
Presence Penalty
Encourage new topics
Response Format
JSON mode / structured output
Seed
Deterministic outputs
Stop Sequences
Custom stop tokens
structured_outputs
Temperature
Controls randomness
Tool Choice
Control tool usage
Tool Calling
Function calling support
top_logprobs
Top P
Nucleus sampling
Pricing Comparison
| Router | Input / 1M | Output / 1M | Cached Input / 1M |
|---|---|---|---|
| OpenRouter | $2.50 | $10.00 | — |
Benchmarks
Artificial Analysis
Intelligence IndexArtificial Analysis
54.4/100Coding IndexArtificial Analysis
42.8/100Math IndexArtificial Analysis
60.8/100MMLU-PROArtificial Analysis
0.724/1GPQA DiamondArtificial Analysis
0.538/1MATH-500Artificial Analysis
0.763/1AIME 2024Artificial Analysis
0.097/1Humanity's Last ExamArtificial Analysis
0.013/1LiveCodeBenchArtificial Analysis
0.395/1SciCodeArtificial Analysis
0.158/1Model IDs
OpenRouter
openai/gpt-4o-audio-previewTags
tool-calling