Zhipu · GLM 4 · Sep 30, 2025

GLM-4.6V-Flash

For local deployment and low-latency applications. The GLM-4.6V series is Z.ai’s iteration of its multimodal large language models. GLM-4.6V scales its context window to 128K tokens in training and achieves SoTA performance in visual understanding among models of similar parameter scale.

Context Window: 128K tokens

Max Output: 24K tokens

Released: Sep 30, 2025

Arena Rank: —
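
The context window and max output figures above imply a simple budget for prompt size. The sketch below is illustrative only: it assumes "K" means 1,000 tokens and that generated output counts against the same 128K window, neither of which the card states. The function names are hypothetical and not part of any Z.ai or router API.

```python
# Figures from the card; reading "K" as 1,000 is an assumption.
CONTEXT_WINDOW = 128_000  # total tokens the model can attend to
MAX_OUTPUT = 24_000       # largest completion the model may return


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Prompt tokens left after reserving room for the completion.

    Assumes output tokens share the context window with the prompt.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be between 0 and {MAX_OUTPUT}")
    return CONTEXT_WINDOW - reserved_output


def fits(prompt_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """True if a prompt of this size leaves the reserved output budget free."""
    return 0 <= prompt_tokens <= max_prompt_tokens(reserved_output)
```

Reserving less than the full 24K output leaves more room for the prompt, for example `fits(110_000, reserved_output=8_000)`. Token counts must come from the model's own tokenizer, which this sketch does not cover.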

Capabilities

👁 Vision

🧠 Reasoning

🔧 Tool Calling

⚡ Prompt Caching

🖥 Computer Use

🎨 Image Generation

Pricing Comparison

Router	Input / 1M	Output / 1M	Cached Input / 1M
Vercel AI	Free	Free	—

Model IDs

Tags

vision, reasoning, file-input, tool-use, implicit-caching

Similar Models

Ranked by provider, pricing, capabilities, and arena performance

GLM 4.7 FlashX

200K ctx · $0.06/1M in

Same family · Both support reasoning & tools

GLM 4.5

131K ctx · $0.60/1M in

Same family · Both support tools

GLM 4.6

205K ctx · $0.60/1M in

Same family · Both support tools

GLM 4.5 Air

131K ctx · $0.20/1M in

Same family · Both support tools

Z.ai: GLM 4 32B

128K ctx · $0.10/1M in

Same family · Both support tools

GLM 4.5

131K ctx · $0.35/1M in

Same family · Both support reasoning & tools
