GLM 4.6
MoE, GLM
Zhipu AI's 357B-parameter mixture-of-experts (MoE) model. API access costs $0.6 per million tokens; local deployment needs 8×H200 (or an equivalent multi-GPU setup) with vLLM v0.19+. Not a target for consumer GPUs.
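As a rough illustration of the API pricing quoted above, the following sketch estimates the cost of a workload from its token count. It assumes a single flat rate of $0.6 per million tokens for all tokens; the summary does not say whether input and output tokens are billed differently, so treat the result as an approximation. The function name `api_cost_usd` is hypothetical.

```python
# Price quoted in the summary above. Assumption: one flat rate for every token.
PRICE_PER_MILLION_TOKENS_USD = 0.6


def api_cost_usd(tokens: int, price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Estimate API cost in USD for a given number of tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Example workload: 2.5 million tokens in total.
    print(f"${api_cost_usd(2_500_000):.2f}")
```

If the provider bills input and output separately, the same function can be called once per token type with the corresponding rate.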
Provider: Zhipu AI
Parameters: 357B
Context: 128K
Released: 2026-04-08
VRAM Requirements by Quantization
| Method | Disk Size | VRAM Required | Fits GPUs |
|---|---|---|---|
| Q4_K_M | 180 GB | 195 GB | 0 GPUs |
| Q2_K | 95 GB | 105 GB | 0 GPUs |
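The VRAM figures in the table can be turned into a lower bound on GPU count by dividing by per-card memory and rounding up. The sketch below does only that arithmetic. The per-card capacities (24 GB for a high-end consumer card, 141 GB for an H200) are assumptions added for illustration, and the helper `min_gpus` is hypothetical. Real deployments typically need headroom beyond this bound for KV cache, activations, and parallelism overhead.

```python
import math

# "VRAM Required" values copied from the table above, in GB.
VRAM_REQUIRED_GB = {"Q4_K_M": 195, "Q2_K": 105}

# Assumed per-card memory in GB (not from the source page).
CONSUMER_GPU_GB = 24
H200_GB = 141


def min_gpus(vram_required_gb: float, gpu_memory_gb: float) -> int:
    """Smallest number of identical GPUs whose pooled memory covers the requirement."""
    if gpu_memory_gb <= 0:
        raise ValueError("gpu_memory_gb must be positive")
    return math.ceil(vram_required_gb / gpu_memory_gb)


if __name__ == "__main__":
    for method, need in VRAM_REQUIRED_GB.items():
        print(
            f"{method}: at least {min_gpus(need, CONSUMER_GPU_GB)} x {CONSUMER_GPU_GB} GB cards, "
            f"or {min_gpus(need, H200_GB)} x {H200_GB} GB cards"
        )
```

Even the Q2_K build exceeds a single 24 GB card several times over, which is consistent with the table listing no single GPU as a fit.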
Benchmark Scores
MMLU: 85.5%
HumanEval: 86%
Scores are approximate and may vary by quantization level.
Hugging Face: zai-org/GLM-4.6