
Gemma 4 E4B


Google's mixture-of-experts model with 4B active parameters. Strong performance per unit of compute; runs on modest consumer hardware.

Provider: Google
Parameters: 4B active (MoE)
Context: 128K
Released: 2026-04-08

VRAM Requirements by Quantization

Method    Disk Size    VRAM Required    Compatible GPUs
Q8_0      4.5 GB       5.5 GB           15 GPUs
Q4_K_M    2.6 GB       3.5 GB           15 GPUs
Q4_0      2.5 GB       3.2 GB           15 GPUs
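As a rough guide to how the VRAM column relates to the disk sizes, here is a minimal sketch. The fixed 1 GB overhead term is an assumption standing in for KV cache and runtime buffers, not runlocal.dev's actual formula; the site's figures vary slightly by quantization, so this only approximates them.

```python
def estimate_vram_gb(disk_size_gb: float, overhead_gb: float = 1.0) -> float:
    """Hypothetical rule of thumb: weights load at roughly their file size,
    plus a flat allowance for KV cache and runtime overhead."""
    return round(disk_size_gb + overhead_gb, 1)

# Disk sizes from the table above
quants = {"Q8_0": 4.5, "Q4_K_M": 2.6, "Q4_0": 2.5}
for name, disk in quants.items():
    print(f"{name}: ~{estimate_vram_gb(disk)} GB VRAM")
```

Longer contexts grow the KV cache, so the overhead term would need to scale with context length for a more faithful estimate.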

Install with Ollama

Run in terminal:

ollama pull gemma4:e4b

Requires at least 3.2 GB of VRAM (Q4_0 quantization). Install Ollama from ollama.com
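Once pulled, the model can also be queried programmatically. A minimal sketch, assuming a locally running Ollama server on its default port 11434 and using Ollama's documented `/api/generate` endpoint; the `build_generate_request` helper is hypothetical, introduced here only to show the request shape.

```python
import json

# Ollama's default local endpoint (assumes a server started with `ollama serve`)
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str, stream: bool = False) -> bytes:
    """Hypothetical helper: serialize a request body for Ollama's /api/generate."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream}).encode()

body = build_generate_request("gemma4:e4b", "Explain mixture-of-experts in one sentence.")

# To actually send it (requires the Ollama server to be running):
#   import urllib.request
#   req = urllib.request.Request(OLLAMA_URL, data=body,
#                                headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())

print(json.loads(body)["model"])  # → gemma4:e4b
```

With `stream` set to false the server returns one JSON object containing the full response; the default streaming mode returns newline-delimited JSON chunks instead.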

Benchmark Scores

MMLU: 74.2%
HumanEval: 68.5%

Scores are approximate and may vary by quantization level.

Compatible GPUs (15)

HuggingFace

google/gemma-4-e4b-it