LLM Router
Independent comparison platform for LLM routing infrastructure.


Data from public sources. May not reflect real-time pricing.


OpenAI


OpenAI is the creator of GPT-4, GPT-4o, o1, and the DALL·E image generation models. As one of the most influential AI research labs, OpenAI offers some of the most capable and widely used language models available through their API and ChatGPT products.

Pricing available from Requesty, OpenRouter, Vercel AI, Martian, and DeepInfra.

Total Models: 57
Arena Ranked: 21 of 57
Open Source: 0
Cheapest Input: $0.02 per 1M tokens

$ Pricing Summary (per 1M tokens)

Metric          Input      Output
Cheapest        $0.02      $0.14
Average         $7.25      $32.63
Most Expensive  $150.00    $600.00
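
Per-request cost at these rates is just a proportional share of the per-1M-token price. A minimal sketch in Python, using the cheapest and most expensive rows above with illustrative token counts:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_1m: float, output_per_1m: float) -> float:
    """USD cost of one request at per-1M-token prices."""
    return (input_tokens / 1_000_000) * input_per_1m \
         + (output_tokens / 1_000_000) * output_per_1m

# Cheapest OpenAI rates listed above: $0.02 in / $0.14 out per 1M tokens
print(request_cost(2_000, 500, 0.02, 0.14))      # ~0.00011 USD (about a hundredth of a cent)
# Most expensive rates listed above: $150 in / $600 out per 1M tokens
print(request_cost(2_000, 500, 150.00, 600.00))  # 0.60 USD for the same request
```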

⚙ Capabilities

👁 Vision: 32 of 57 models
🧠 Reasoning: 30 of 57 models
🔧 Tool Calling: 49 of 57 models
⚡ Prompt Caching: 31 of 57 models
🖥 Computer Use: 0 of 57 models
🎨 Image Generation: 2 of 57 models
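
A capability matrix like this is easiest to work with as per-model boolean flags that can be filtered. A minimal sketch; the field names and the three sample records are assumptions for illustration, not this site's actual schema:

```python
# Hypothetical capability records; the flags mirror the badges used on this page.
catalog = [
    {"model": "gpt-4o",       "vision": True,  "reasoning": False, "tools": True, "caching": True},
    {"model": "o3-mini",      "vision": False, "reasoning": True,  "tools": True, "caching": True},
    {"model": "gpt-oss-120b", "vision": False, "reasoning": True,  "tools": True, "caching": False},
]

# e.g. models suitable for a vision + tool-calling workload
candidates = [m["model"] for m in catalog if m["vision"] and m["tools"]]
print(candidates)  # ['gpt-4o']
```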

🤖 All OpenAI Models (57)

OpenAI · GPT-4o
#21

ChatGPT-4o

OpenAI ChatGPT 4o is continually updated by OpenAI to point to the current version of GPT-4o used by ChatGPT. It therefore differs slightly from the API version of [GPT-4o](/models/openai/gpt-4o) in that it has additional RLHF. It is intended for research and evaluation. OpenAI notes that this model is not suited for production use-cases as it may be removed or redirected to another model in the future.

Context
128K
Max Output
16K
Input/1M
$5.00
👁 Vision · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $5.00 / $15.00
OpenRouter  $5.00 / $15.00
Martian  $5.00 / $15.00
2024-08-14
OpenAI · GPT-5.1
#23

GPT-5.1 Chat

GPT-5.1 Chat points to the GPT-5.1 snapshot currently used in ChatGPT. We recommend GPT-5.1 for most API usage, but feel free to use this GPT-5.1 Chat model to test our latest improvements for chat use cases.

Context
128K
Max Output
16K
Input/1M
$1.25
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $1.25 / $10.00
Martian  $1.25 / $10.00
OpenAI · GPT-5.2
#25

GPT-5.2 Chat

GPT‑5.2 sets a new state of the art across many benchmarks, including GDPval, where it outperforms industry professionals at well-specified knowledge work tasks spanning 44 occupations.

Context
128K
Max Output
16K
Input/1M
$1.75
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
US
Pricing (per 1M tokens, input / output):
Requesty ★  $1.75 / $14.00
Vercel AI  $1.75 / $14.00
Martian  $1.75 / $14.00
2025-12-11
OpenAI · o3
#29

o3

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. Use it to think through multi-step problems that involve analysis across text, code, and images.

Context
200K
Max Output
100K
Input/1M
$1.00
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $1.00 / $4.00  (cheapest)
OpenRouter  $2.00 / $8.00
Vercel AI  $2.00 / $8.00
Martian  $2.00 / $8.00
2025-04-16
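
Router choice can change the bill even when the underlying model is identical. As an illustrative workload of 10M input and 2M output tokens at the o3 rates listed above: 10 × $1.00 + 2 × $4.00 = $18.00 via the cheapest listed router, versus 10 × $2.00 + 2 × $8.00 = $36.00 at the $2.00 / $8.00 rate shown for the others, a 2× difference. The token volumes are assumptions chosen for easy arithmetic.
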
OpenAI · GPT-5
#32

GPT-5 Chat

GPT-5 is OpenAI's flagship model for coding, reasoning, and agentic tasks across domains.

Context
128K
Max Output
16K
Input/1M
$1.25
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $1.25 / $10.00
Vercel AI  $1.25 / $10.00
Martian  $1.25 / $10.00
2025-08-01
OpenAI · GPT-4.1
#52

GPT-4.1

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Context
1.0M
Max Output
33K
Input/1M
$2.00
👁 Vision · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $2.00 / $8.00
OpenRouter  $2.00 / $8.00
Vercel AI  $2.00 / $8.00
Martian  $2.00 / $8.00
2025-04-14
OpenAI · o4
#75

o4 Mini

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning and coding performance across benchmarks like AIME (99.5% with Python) and SWE-bench, outperforming its predecessor o3-mini and even approaching o3 in some domains. Despite its smaller size, o4-mini exhibits high accuracy in STEM tasks, visual problem solving (e.g., MathVista, MMMU), and code editing. It is especially well-suited for high-throughput scenarios where latency or cost is critical. Thanks to its efficient architecture and refined reinforcement learning training, o4-mini can chain tools, generate structured outputs, and solve multi-step tasks with minimal delay—often in under a minute.

Context
200K
Max Output
100K
Input/1M
$1.10
🧠 Reasoning · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $1.10 / $4.40
OpenRouter  $1.10 / $4.40
Vercel AI  $1.10 / $4.40
Martian  $1.10 / $4.40
2025-04-16
OpenAI · o1
#79

o1

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. The o1 models are optimized for math, science, programming, and other STEM-related tasks. They consistently exhibit PhD-level accuracy on benchmarks in physics, chemistry, and biology. Learn more in the [launch announcement](https://openai.com/o1).

Context
200K
Max Output
100K
Input/1M
$15.00
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $15.00 / $60.00
OpenRouter  $15.00 / $60.00
Vercel AI  $15.00 / $60.00
Martian  $15.00 / $60.00
2024-12-17
OpenAI · GPT-4.1
#87

GPT-4.1 Mini

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

Context
1.0M
Max Output
33K
Input/1M
$0.40
👁 Vision · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $0.40 / $1.60
OpenRouter  $0.40 / $1.60
Vercel AI  $0.40 / $1.60
Martian  $0.40 / $1.60
2025-04-14
OpenAI · o3
#101

OpenAI: o3 Mini High

OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high. o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities. The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost.

Context
200K
Max Output
100K
Input/1M
$1.10
🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
OpenRouter  $1.10 / $4.40
2025-02-12
OpenAI
#108

OpenAI: gpt-oss-120b (free)

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.

Context
131K
Max Output
131K
Input/1M
Free
🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  Free / Free
Vercel AI  $0.10 / $0.50
DeepInfra  $0.15 / $0.60
2025-08-05
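
The gpt-oss-120b entry above notes native tool use, including function calling. Below is a minimal sketch of an OpenAI-style tool-calling request sent through an OpenAI-compatible router endpoint; the base URL, key handling, model slug, and the get_weather tool are assumptions for illustration.

```python
from openai import OpenAI

# Assumed OpenAI-compatible gateway; substitute your router's base URL, key, and slug.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, defined only for this example
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="openai/gpt-oss-120b",  # assumed slug; naming varies by router
    messages=[{"role": "user", "content": "What's the weather in Oslo right now?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)  # expected: a get_weather call with {"city": "Oslo"}
```
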
OpenAI · o3
#115

o3 Mini

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. This model supports the `reasoning_effort` parameter, which can be set to "high", "medium", or "low" to control the thinking time of the model. The default is "medium". OpenRouter also offers the model slug `openai/o3-mini-high` to default the parameter to "high". The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities. The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost.

Context
200K
Max Output
100K
Input/1M
$1.10
🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $1.10 / $4.40
OpenRouter  $1.10 / $4.40
Vercel AI  $1.10 / $4.40
Martian  $1.10 / $4.40
2025-01-31
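
The o3 Mini entry above describes the `reasoning_effort` parameter and the `openai/o3-mini-high` convenience slug. A minimal sketch of setting the effort level explicitly through an OpenAI-compatible client; the gateway URL and key are placeholders, and some routers expose this as a nested `reasoning.effort` field instead:

```python
from openai import OpenAI

# Placeholder gateway and key; any OpenAI-compatible endpoint that serves o3-mini works the same way.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

resp = client.chat.completions.create(
    model="openai/o3-mini",       # or the o3-mini-high slug mentioned above
    reasoning_effort="high",      # "low" | "medium" (default) | "high"
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
print(resp.choices[0].message.content)
```
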
OpenAI · GPT-4o
#134

GPT-4o

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improve relevance & readability. It’s also better at working with uploaded files, providing deeper insights & more thorough responses. GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities.

Context
128K
Max Output
16K
Input/1M
$2.50
👁 Vision · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $2.50 / $10.00
Vercel AI  $2.50 / $10.00
Martian  $2.50 / $10.00
2024-05-13
OpenAI · GPT-4
#145

OpenAI: GPT-4 Turbo

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Training data: up to December 2023.

Context
128K
Max Output
4K
Input/1M
$10.00
👁 Vision · 🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  $10.00 / $30.00
Vercel AI  $10.00 / $30.00
Martian  $30.00 / $60.00
2024-04-09
OpenAI · GPT-4
#145

OpenAI: GPT-4 Turbo Preview

The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023. **Note:** heavily rate limited by OpenAI while in preview.

Context
128K
Max Output
4K
Input/1M
$10.00
🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  $10.00 / $30.00
Martian  $10.00 / $30.00
2024-01-25
OpenAI · GPT-4.1
#151

GPT-4.1 Nano

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

Context
1.0M
Max Output
33K
Input/1M
$0.10
👁 Vision · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $0.10 / $0.40
OpenRouter  $0.10 / $0.40
Vercel AI  $0.10 / $0.40
Martian  $0.10 / $0.40
2025-04-14
OpenAI · GPT-4o
#158

GPT-4o-mini (2024-07-18)

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable than other recent frontier models, and more than 60% cheaper than [GPT-3.5 Turbo](/models/openai/gpt-3.5-turbo). It maintains SOTA intelligence, while being significantly more cost-effective. GPT-4o mini achieves an 82% score on MMLU and presently ranks higher than GPT-4 on chat preferences [common leaderboards](https://arena.lmsys.org/). Check out the [launch announcement](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/) to learn more. #multimodal

Context
128K
Max Output
16K
Input/1M
$0.15
👁 Vision · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $0.15 / $0.60
OpenRouter  $0.15 / $0.60
Vercel AI  $0.15 / $0.60
Martian  $0.15 / $0.60
2024-07-18
OpenAI
#159

OpenAI: gpt-oss-20b (free)

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI’s Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.

Context
131K
Max Output
131K
Input/1M
Free
🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  Free / Free
Vercel AI  $0.07 / $0.30
DeepInfra  $0.03 / $0.14
2025-08-05
OpenAI · GPT-4
#165

OpenAI: GPT-4 Turbo (older v1106)

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Training data: up to April 2023.

Context
128K
Max Output
4K
Input/1M
$10.00
🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  $10.00 / $30.00
Martian  $10.00 / $30.00
2023-11-06
OpenAI · GPT-4
#186

OpenAI: GPT-4 (older v0314)

GPT-4-0314 is the first version of GPT-4 released, with a context length of 8,192 tokens, and was supported until June 14. Training data: up to Sep 2021.

Context
8K
Max Output
4K
Input/1M
$30.00
🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  $30.00 / $60.00
2023-05-28
OpenAI · GPT-3.5
#224

OpenAI: GPT-3.5 Turbo Instruct

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021.

Context
4K
Max Output
4K
Input/1M
$0.50
Pricing (per 1M tokens, input / output):
OpenRouter  $1.50 / $2.00
Vercel AI  $0.50 / $1.50
Martian  $0.50 / $1.50
2023-09-28
OpenAI

OpenAI: GPT Audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced at $32 per million input tokens and $64 per million output tokens.

Context
128K
Max Output
16K
Input/1M
$2.50
Pricing (per 1M tokens, input / output):
OpenRouter  $2.50 / $10.00
2026-01-19
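
Since audio and text tokens are billed at different rates, the cost of a mixed request is the sum of both meters. An illustrative example at the rates above (the token counts are assumptions): 10,000 audio input tokens + 1,000 text input tokens + 2,000 audio output tokens ≈ 0.01 × $32 + 0.001 × $2.50 + 0.002 × $64 ≈ $0.32 + $0.003 + $0.13 ≈ $0.45.
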
OpenAI

OpenAI: GPT Audio Mini

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million tokens and output is priced at $2.40 per million tokens.

Context
128K
Max Output
16K
Input/1M
$0.60
Pricing (per 1M tokens, input / output):
OpenRouter  $0.60 / $2.40
2026-01-19
OpenAI · GPT-5.2

GPT-5.2-Codex

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1-Codex, GPT-5.2-Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level). Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically, providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.

Context
400K
Max Output
128K
Input/1M
$1.75
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $1.75 / $14.00
OpenRouter  $1.75 / $14.00
Vercel AI  $1.75 / $14.00
Martian  $1.75 / $14.00
2026-01-14
OpenAI · GPT-5.2

OpenAI: GPT-5.2 Pro

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reductions in hallucination, sycophancy, and better performance in coding, writing, and health-related tasks.

Context
400K
Max Output
128K
Input/1M
$21.00
👁 Vision · 🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  $21.00 / $168.00
Vercel AI  $21.00 / $168.00
Martian  $21.00 / $168.00
2025-12-10
OpenAI · GPT-5.1

OpenAI: GPT-5.1-Codex-Max

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated version of the 5.1 reasoning stack and trained on agentic workflows spanning software engineering, mathematics, and research. GPT-5.1-Codex-Max delivers faster performance, improved reasoning, and higher token efficiency across the development lifecycle.

Context
400K
Max Output
128K
Input/1M
$1.25
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
OpenRouter  $1.25 / $10.00
Vercel AI  $1.25 / $10.00
Martian  $1.25 / $10.00
2025-12-04
OpenAI · GPT-5.1

GPT-5.1-Codex

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level); a minimal request sketch follows this entry. Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically, providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.

Context
400K
Max Output
128K
Input/1M
$1.25
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $1.25 / $10.00
OpenRouter  $1.25 / $10.00
Vercel AI  $1.25 / $10.00
Martian  $1.25 / $10.00
2025-11-13
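
The GPT-5.1-Codex entry above notes that reasoning effort is adjusted with the nested `reasoning.effort` parameter. A minimal sketch of passing it in a raw OpenAI-compatible chat request; the endpoint, key, and model slug are placeholders to be replaced with whatever your router actually exposes:

```python
import requests

payload = {
    "model": "openai/gpt-5.1-codex",   # assumed slug; check your router's catalog
    "messages": [{"role": "user", "content": "Review this diff for correctness: ..."}],
    "reasoning": {"effort": "high"},   # per the reasoning-effort docs linked above
}
r = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",  # placeholder OpenAI-compatible endpoint
    headers={"Authorization": "Bearer YOUR_KEY"},
    json=payload,
    timeout=300,
)
print(r.json()["choices"][0]["message"]["content"])
```
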
OpenAI · GPT-5.1

OpenAI: GPT-5.1-Codex-Mini

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex.

Context
400K
Max Output
100K
Input/1M
$0.25
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
OpenRouter  $0.25 / $2.00
Vercel AI  $0.25 / $2.00
Martian  $0.25 / $2.00
2025-11-13
OpenAI · GPT-5.1

GPT-5.1 Instant

GPT-5.1 Instant (or GPT-5.1 chat) is a warmer and more conversational version of GPT-5-chat, with improved instruction following and adaptive reasoning for deciding when to think before responding.

Context
128K
Max Output
16K
Input/1M
$1.25
🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
Vercel AI  $1.25 / $10.00
2025-11-12
OpenAI · GPT-5.1

GPT-5.1 Thinking

An upgraded version of GPT-5 that adapts thinking time more precisely to the question to spend more time on complex questions and respond more quickly to simpler tasks.

Context
400K
Max Output
128K
Input/1M
$1.25
🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
Vercel AI  $1.25 / $10.00
2025-11-12
OpenAI

OpenAI: gpt-oss-safeguard-20b

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust & safety labeling. Learn more about this model in OpenAI's gpt-oss-safeguard [user guide](https://cookbook.openai.com/articles/gpt-oss-safeguard-guide).

Context
131K
Max Output
66K
Input/1M
$0.07
🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
OpenRouter  $0.07 / $0.30
Vercel AI  $0.07 / $0.30
2025-10-29
OpenAI · GPT-5

OpenAI: GPT-5 Image Mini

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini](https://openrouter.ai/openai/gpt-5-mini), with GPT Image 1 Mini for efficient image generation. This natively multimodal model features superior instruction following, text rendering, and detailed image editing with reduced latency and cost. It excels at high-quality visual creation while maintaining strong text understanding, making it ideal for applications that require both efficient image generation and text processing at scale.

Context
400K
Max Output
128K
Input/1M
$2.50
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
OpenRouter  $2.50 / $2.00
2025-10-16
OpenAI · GPT-5

OpenAI: GPT-5 Image

[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with state-of-the-art image generation capabilities. It offers major improvements in reasoning, code quality, and user experience while incorporating GPT Image 1's superior instruction following, text rendering, and detailed image editing.

Context
400K
Max Output
128K
Input/1M
$10.00
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
OpenRouter  $10.00 / $10.00
2025-10-14
OpenAI · o3

o3 Deep Research

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.

Context
200K
Max Output
200K
Input/1M
$10.00
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $10.00 / $40.00
OpenRouter  $10.00 / $40.00
Vercel AI  $10.00 / $40.00
2025-10-10
OpenAI · o4

o4 Mini Deep Research

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.

Context
200K
Max Output
200K
Input/1M
$2.00
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $2.00 / $8.00
OpenRouter  $2.00 / $8.00
Martian  $2.00 / $8.00
2025-10-10
OpenAI · GPT-5

GPT-5 Pro

GPT-5 Pro is OpenAI’s extended-reasoning tier of GPT-5, built to push reliability on hard problems, long tool chains, and agentic workflows. It keeps GPT-5’s multimodal skills and very large context (API page lists up to 400K tokens) while allocating more compute to think longer and plan better, improving code generation, math, and complex writing beyond standard GPT-5/“Thinking.” OpenAI positions Pro as the version that “uses extended reasoning for even more comprehensive and accurate answers,” targeting high-stakes tasks and enterprise use.

Context
400K
Max Output
272K
Input/1M
$15.00
👁 Vision · 🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
Requesty ★  $15.00 / $120.00
OpenRouter  $15.00 / $120.00
Vercel AI  $15.00 / $120.00
Martian  $15.00 / $120.00
2025-10-06
OpenAI · GPT-5

GPT-5-Codex

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level). Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically, providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.

Context
400K
Max Output
128K
Input/1M
$1.25
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
Requesty ★  $1.25 / $10.00
OpenRouter  $1.25 / $10.00
Vercel AI  $1.25 / $10.00
Martian  $1.25 / $10.00
2025-09-23
OpenAI · GPT-4o

OpenAI: GPT-4o Audio

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input and $80 per million output audio tokens.

Context
128K
Max Output
16K
Input/1M
$2.50
🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  $2.50 / $10.00
2025-08-15
OpenAI · GPT-5

GPT-5 Mini

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost. GPT-5 Mini is the successor to OpenAI's o4-mini model.

Context
400K
Max Output
128K
Input/1M
$0.25
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $0.25 / $2.00
OpenRouter  $0.25 / $2.00
Vercel AI  $0.25 / $2.00
Martian  $0.25 / $2.00
2025-08-07
OpenAI · GPT-5

GPT-5 Nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger counterparts, it retains key instruction-following and safety features. It is the successor to GPT-4.1-nano and offers a lightweight option for cost-sensitive or real-time applications.

Context
400K
Max Output
128K
Input/1M
$0.05
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $0.05 / $0.40
OpenRouter  $0.05 / $0.40
Vercel AI  $0.05 / $0.40
Martian  $0.05 / $0.40
2025-08-07
OpenAI · o3

o3 Pro

The o3 series of models are trained with reinforcement learning to perform complex reasoning. o3 models think before they answer, producing a long internal chain of thought before responding to the user. o3 Pro is a version of o3 that uses additional compute to think longer, and is designed to solve hard problems across domains.

Context
200K
Max Output
100K
Input/1M
$20.00
👁 Vision · 🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
Requesty ★  $20.00 / $80.00
OpenRouter  $20.00 / $80.00
Vercel AI  $20.00 / $80.00
Martian  $20.00 / $80.00
2025-06-10
OpenAI

Codex Mini

Codex Mini is a fine-tuned version of o4-mini specifically for use in Codex CLI.

Context
200K
Max Output
100K
Input/1M
$1.50
🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
Vercel AI  $1.50 / $6.00
2025-05-16
OpenAI · o4

OpenAI: o4 Mini High

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning and coding performance across benchmarks like AIME (99.5% with Python) and SWE-bench, outperforming its predecessor o3-mini and even approaching o3 in some domains. Despite its smaller size, o4-mini exhibits high accuracy in STEM tasks, visual problem solving (e.g., MathVista, MMMU), and code editing. It is especially well-suited for high-throughput scenarios where latency or cost is critical. Thanks to its efficient architecture and refined reinforcement learning training, o4-mini can chain tools, generate structured outputs, and solve multi-step tasks with minimal delay—often in under a minute.

Context
200K
Max Output
100K
Input/1M
$1.10
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
Pricing (per 1M tokens, input / output):
OpenRouter  $1.10 / $4.40
2025-04-16
OpenAI · o1

o1 Pro

The o1 series of models are trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. The o1 reasoning model is designed to solve hard problems across domains. The knowledge cutoff for o1 and o1-mini models is October, 2023.

Context
200K
Max Output
100K
Input/1M
$150.00
👁 Vision · 🧠 Reasoning · 🔧 Tools
Pricing (per 1M tokens, input / output):
Requesty ★  $150.00 / $600.00
OpenRouter  $150.00 / $600.00
Martian  $150.00 / $600.00
2025-03-19
OpenAI · GPT-4o

OpenAI: GPT-4o-mini Search Preview

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.

Context
128K
Max Output
16K
Input/1M
$0.15
Pricing (per 1M tokens, input / output):
OpenRouter  $0.15 / $0.60
Vercel AI  $0.15 / $0.60
Martian  $0.15 / $0.60
2025-03-12
OpenAI · GPT-4o

OpenAI: GPT-4o Search Preview

GPT-4o Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.

Context
128K
Max Output
16K
Input/1M
$2.50
Pricing (per 1M tokens, input / output):
OpenRouter  $2.50 / $10.00
Martian  $2.50 / $10.00
2025-03-12
OpenAI · GPT-3.5

OpenAI: GPT-3.5 Turbo (older v0613)

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data up to Sep 2021.

Context
4K
Max Output
4K
Input/1M
$1.00
🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  $1.00 / $2.00
2024-01-25
OpenAI

text-embedding-3-large

OpenAI's most capable embedding model for both English and non-English tasks.

Context
0
Max Output
—
Input/1M
$0.13
Pricing (per 1M tokens, input / output):
Vercel AI  $0.13 / Free
2024-01-25
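
text-embedding-3-large is called through the embeddings endpoint rather than chat completions. A minimal sketch using the OpenAI Python SDK; the API key is a placeholder, and availability through the routers listed on this page may vary:

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY")  # or point base_url at an OpenAI-compatible gateway

resp = client.embeddings.create(
    model="text-embedding-3-large",
    input="LLM routers compare per-token pricing across providers.",
)
vector = resp.data[0].embedding
print(len(vector))  # 3072 dimensions by default for this model
```
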
OpenAI

text-embedding-3-small

OpenAI's improved, more performant version of their ada embedding model.

Context
0
Max Output
—
Input/1M
$0.02
Pricing (per 1M tokens, input / output):
Vercel AI  $0.02 / Free
2024-01-25
OpenAI · GPT-3.5

OpenAI: GPT-3.5 Turbo 16k

This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up to Sep 2021.

Context
16K
Max Output
4K
Input/1M
$3.00
🔧 Tools
Pricing (per 1M tokens, input / output):
OpenRouter  $3.00 / $4.00
2023-08-28
OpenAI

text-embedding-ada-002

OpenAI's legacy text embedding model.

Context
0
Max Output
—
Input/1M
$0.10
Pricing (per 1M tokens, input / output):
Vercel AI  $0.10 / Free
2022-12-15
OpenAI

gpt-oss-20b

 

Context
131K
Max Output
33K
Input/1M
$0.03
🔧 Tools
Pricing (per 1M tokens, input / output):
Requesty ★  $0.10 / $0.50
Martian  $0.03 / $0.14
OpenAI

gpt-oss-120b

 

Context
131K
Max Output
33K
Input/1M
$0.04
🔧 Tools
Pricing (per 1M tokens, input / output):
Requesty ★  $0.15 / $0.75
Martian  $0.04 / $0.19
OpenAI · GPT-4.1

GPT-4.1 Nano

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

Context
1.0M
Max Output
33K
Input/1M
$0.10
👁 Vision · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $0.10 / $0.40
OpenAI · GPT-4.1

GPT-4.1

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Context
1.0M
Max Output
33K
Input/1M
$2.00
👁 Vision · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $2.00 / $8.00
OpenAI · GPT-5.2

GPT-5.2-Codex

OpenAI's most intelligent coding model optimized for long-horizon, agentic coding tasks.

Context
400K
Max Output
128K
Input/1M
$1.75
👁 Vision · 🧠 Reasoning · 🔧 Tools · ⚡ Cache
US
Pricing (per 1M tokens, input / output):
Requesty ★  $1.75 / $14.00
OpenAI · GPT-4.1

GPT-4.1 Mini

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

Context
1.0M
Max Output
33K
Input/1M
$0.40
👁 Vision · 🔧 Tools · ⚡ Cache
EU · US
Pricing (per 1M tokens, input / output):
Requesty ★  $0.40 / $1.60