MetaLlama 3.1Open SourceFeb 6, 2025

Meta Llama 3.1 405B Instruct

A lightweight and ultra-fast variant of Llama 3.3 70B, for use when quick response times are needed most.

Context Window
131K
tokens
Max Output
tokens
Released
Arena Rank
Output Speed
48
tokens/sec
Time to First Token
1.5s
TTFT

Capabilities

👁Vision
🧠Reasoning
🔧Tool Calling
Prompt Caching
🖥Computer Use
🎨Image Generation

Pricing Comparison

RouterInput / 1MOutput / 1MCached Input / 1M
Requesty$0.80$0.80$0.80

Benchmarks

Artificial Analysis
Intelligence IndexArtificial Analysis
51/100
Coding IndexArtificial Analysis
46/100
Math IndexArtificial Analysis
54/100
MMLU-PROArtificial Analysis
0.682/1
GPQA DiamondArtificial Analysis
0.488/1
MATH-500Artificial Analysis
0.738/1
AIME 2024Artificial Analysis
0.097/1
LiveCodeBenchArtificial Analysis
0.398/1
SciCodeArtificial Analysis
0.162/1

Model IDs

Requestydeepinfra/meta-llama/Meta-Llama-3.1-405B-Instruct
Compare with another model

Related Models