Llama 4 Scout
Meta's efficient MoE model with an unprecedented 10M token context window. 17B active parameters from a 109B total pool.
Provider
Meta
Parameters
17B active (109B total MoE)
Context
10M tokens (10,485,760)
Released
2025-04-05
VRAM Requirements by Quantization
| Method | Disk Size | VRAM Required | Fits GPUs |
|---|---|---|---|
| Q4_K_M | 53 GB | 58 GB | 0 GPUs |
| Q4_0 | 50 GB | 55 GB | 0 GPUs |
| Q2_K | 32 GB | 35 GB | 1 GPU |
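The table's figures can be sanity-checked with a back-of-envelope formula: quantized size ≈ total parameters × bits-per-weight ÷ 8, plus a few GB of load-time overhead. A minimal sketch, assuming rough average bits-per-weight values (real GGUF files mix quant types across tensors, so these estimates land somewhat above the measured file sizes in the table):

```python
# Assumed average bits-per-weight per quant method (approximate, not exact
# llama.cpp figures; actual GGUF files mix quant types across tensors).
APPROX_BPW = {"Q2_K": 2.6, "Q4_0": 4.5, "Q4_K_M": 4.8}

def estimate_gb(total_params: float, bpw: float, overhead_gb: float = 5.0):
    """Return (disk_gb, vram_gb) in decimal gigabytes.

    disk  = params * bits-per-weight / 8 bytes
    vram  = disk + a flat overhead for KV cache and activations
            (the table's VRAM column sits 3-5 GB above disk size).
    """
    disk = total_params * bpw / 8 / 1e9
    return disk, disk + overhead_gb

if __name__ == "__main__":
    for method, bpw in APPROX_BPW.items():
        disk, vram = estimate_gb(109e9, bpw)
        print(f"{method}: ~{disk:.0f} GB disk, ~{vram:.0f} GB VRAM")
```

Note the MoE architecture does not reduce memory needs: all 109B parameters must be resident even though only 17B are active per token.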
Install with Ollama
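A likely install command, assuming the model is published under a `llama4:scout` tag in the Ollama registry (the exact tag is an assumption; verify it on the Ollama model library page):

```shell
# Pull the quantized model (tag name is an assumption; check ollama.com/library)
ollama pull llama4:scout

# Start an interactive session with the pulled model
ollama run llama4:scout
```

Ollama serves a default quantization; per the table above, only Q2_K fits on a single high-VRAM GPU, so lower-VRAM setups will spill layers to CPU.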
Benchmark Scores
MMLU: 79.8%
HumanEval: 75.3%
Scores are approximate and may vary by quantization level.
Compatible GPUs (1)
HuggingFace
meta-llama/Llama-4-Scout-17B-16E