GoogleGemini 2.5Sep 25, 2025

Google: Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the Reasoning API parameter to selectively trade off cost for intelligence.

Context Window
1.0M
tokens
Max Output
66K
tokens
Released
Sep 25, 2025
Arena Rank
Output Speed
312
tokens/sec
Time to First Token
450ms
TTFT

Capabilities

👁Vision
🧠Reasoning
🔧Tool Calling
Prompt Caching
🖥Computer Use
🎨Image Generation

Supported Parameters

Include Reasoning
Show reasoning tokens
Max Tokens
Output length limit
Reasoning
Extended thinking
Response Format
JSON mode / structured output
Seed
Deterministic outputs
Stop Sequences
Custom stop tokens
structured_outputs
Temperature
Controls randomness
Tool Choice
Control tool usage
Tool Calling
Function calling support
Top P
Nucleus sampling

Pricing Comparison

RouterInput / 1MOutput / 1MCached Input / 1M
OpenRouter$0.10$0.40$0.01
Vercel AI$0.10$0.40
Martian$0.10$0.40

Benchmarks

Artificial Analysis
Intelligence IndexArtificial Analysis
65/100
Coding IndexArtificial Analysis
58/100
Math IndexArtificial Analysis
82/100
MMLU-PROArtificial Analysis
0.775/1
GPQA DiamondArtificial Analysis
0.658/1
MATH-500Artificial Analysis
0.928/1
AIME 2024Artificial Analysis
0.62/1
Humanity's Last ExamArtificial Analysis
0.065/1
LiveCodeBenchArtificial Analysis
0.598/1
SciCodeArtificial Analysis
0.315/1

Model IDs

OpenRoutergoogle/gemini-2.5-flash-lite-preview-09-2025

Tags

visionreasoningtool-calling
Compare with another model

Related Models