Gemma 4 E2B
MoE · Apache 2.0
Google's ultra-compact multimodal MoE model. Only 2.3B active parameters (5B total), with text, image, and audio support. Lowest-VRAM entry point in the Gemma 4 family.
Provider: Google
Parameters: 2.3B active / 5B total
Context: 131,072 tokens (128K)
Released: 2026-04-08
VRAM Requirements by Quantization
| Quantization | Disk Size | VRAM Required | Compatible GPUs |
|---|---|---|---|
| Q8_0 | 5 GB | 6 GB | 15 GPUs |
| Q4_K_M | 2.8 GB | 3.5 GB | 15 GPUs |
| Q4_0 | 2.6 GB | 3.2 GB | 15 GPUs |
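
The table's figures can be cross-checked with a little arithmetic. The sketch below is illustrative only: it copies the disk and VRAM numbers from the table, takes the "5B total" parameter count from this page, and assumes 1 GB = 10^9 bytes (the page does not say which convention it uses). The function names are hypothetical helpers, not part of any tool.

```python
# Figures copied from the table above; all sizes in GB.
QUANTS = {
    "Q8_0": {"disk_gb": 5.0, "vram_gb": 6.0},
    "Q4_K_M": {"disk_gb": 2.8, "vram_gb": 3.5},
    "Q4_0": {"disk_gb": 2.6, "vram_gb": 3.2},
}

TOTAL_PARAMS = 5e9  # "5B total" from this page


def implied_bits_per_weight(disk_gb, total_params=TOTAL_PARAMS):
    """Average bits stored per parameter, ASSUMING 1 GB = 1e9 bytes."""
    return disk_gb * 1e9 * 8 / total_params


def runtime_overhead_gb(disk_gb, vram_gb):
    """VRAM the table lists beyond the size of the weights on disk."""
    return vram_gb - disk_gb


for name, q in QUANTS.items():
    bpw = implied_bits_per_weight(q["disk_gb"])
    extra = runtime_overhead_gb(q["disk_gb"], q["vram_gb"])
    print(f"{name}: ~{bpw:.2f} bits/weight, ~{extra:.1f} GB beyond the weights")
```

Under these assumptions the listed VRAM requirement sits roughly 0.6 to 1 GB above the file size in every row. The table does not break that margin down; memory use in practice also depends on context length and runtime settings.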
Install with Ollama
Benchmark Scores
MMLU: 72%
HumanEval: 52%
Scores are approximate and may vary by quantization level.
Compatible GPUs (15)
- AMD RX 7900 GRE (16GB)
- AMD RX 7900 XTX (24GB)
- Apple M4 Pro (24GB)
- Apple M3 Max (36GB)
- Apple M4 Max (48GB)
- NVIDIA RTX 4060 (8GB)
- NVIDIA RTX 4070 SUPER (12GB)
- NVIDIA RTX 3080 12GB (12GB)
- NVIDIA RTX 4080 SUPER (16GB)
- NVIDIA RTX 4060 Ti 16GB (16GB)
- NVIDIA RTX 4070 Ti SUPER (16GB)
- NVIDIA RTX 5080 (16GB)
- NVIDIA RTX 3090 (24GB)
- NVIDIA RTX 4090 (24GB)
- NVIDIA RTX 5090 (32GB)
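
For a GPU that is not on this list, the same fit check can be approximated by comparing its memory against the VRAM figures from the quantization table. This is a minimal sketch, not the site's actual compatibility logic: `quantizations_that_fit` and its `reserve_gb` parameter are hypothetical names, and only a few GPUs from the list are included as sample data.

```python
# VRAM requirements (GB) copied from the quantization table on this page.
VRAM_REQUIRED_GB = {"Q8_0": 6.0, "Q4_K_M": 3.5, "Q4_0": 3.2}

# A few entries from the compatible-GPU list, with memory in GB.
GPU_MEMORY_GB = {
    "NVIDIA RTX 4060": 8,
    "AMD RX 7900 GRE": 16,
    "NVIDIA RTX 5090": 32,
}


def quantizations_that_fit(gpu_memory_gb, reserve_gb=0.0):
    """Return the quantizations whose listed VRAM requirement fits.

    reserve_gb is a hypothetical safety margin for memory already in use
    (display output, other processes); it is not a figure from this page.
    """
    return [
        name
        for name, needed in VRAM_REQUIRED_GB.items()
        if needed + reserve_gb <= gpu_memory_gb
    ]


for gpu, memory in GPU_MEMORY_GB.items():
    print(gpu, "->", quantizations_that_fit(memory))

# A card with 4 GB, which is not on the list, would only fit the 4-bit variants.
print("4 GB card ->", quantizations_that_fit(4))
```

Because the smallest listed GPU has 8 GB and the largest requirement is 6 GB, every GPU on the list fits all three quantizations, which matches the "15 GPUs" shown in each table row.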
HuggingFace
google/gemma-4-e2b-it