Built for local deployment and low-latency applications. The GLM-4.6V series is Z.ai’s iteration of its multimodal large language models. GLM-4.6V extends its context window to 128k tokens during training and achieves state-of-the-art (SoTA) performance in visual understanding among models of similar parameter scale.
| Router | Input / 1M | Output / 1M | Cached Input / 1M |
|---|---|---|---|
| Vercel AI | Free | Free | — |