MetaLlama 4Open SourceJan 30, 2025

Llama 4 Maverick 17b 128e Instruct Fp8

A lightweight and ultra-fast variant of Llama 3.3 70B, for use when quick response times are needed most.

Context Window
1.0M
tokens
Max Output
1.0M
tokens
Released
Arena Rank
Output Speed
165
tokens/sec
Time to First Token
480ms
TTFT

Capabilities

👁Vision
🧠Reasoning
🔧Tool Calling
Prompt Caching
🖥Computer Use
🎨Image Generation

Pricing Comparison

RouterInput / 1MOutput / 1MCached Input / 1M
Requesty$0.20$0.85$0.20

Benchmarks

Artificial Analysis
Intelligence IndexArtificial Analysis
58/100
Coding IndexArtificial Analysis
52/100
Math IndexArtificial Analysis
62/100
MMLU-PROArtificial Analysis
0.748/1
GPQA DiamondArtificial Analysis
0.585/1
MATH-500Artificial Analysis
0.802/1
AIME 2024Artificial Analysis
0.18/1
LiveCodeBenchArtificial Analysis
0.478/1

Model IDs

Requestynovita/meta-llama/llama-4-maverick-17b-128e-instruct-fp8
Compare with another model

Related Models