
Gemma 4 E4B


Google's mixture-of-experts model with 4B active parameters. Strong performance per unit of compute; runs on modest consumer hardware.

Provider: Google
Parameters: 4B active (MoE)
Context: 128K
Released: 2026-04-08

VRAM Requirements by Quantization

Method    Disk Size    VRAM Required    Compatible GPUs
Q8_0      4.5 GB       5.5 GB           15 GPUs
Q4_K_M    2.6 GB       3.5 GB           15 GPUs
Q4_0      2.5 GB       3.2 GB           15 GPUs
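As a rough guide to how the VRAM column relates to the disk sizes, here is a minimal sketch. The fixed 1 GB overhead term is an assumption standing in for KV cache and runtime buffers, not runlocal.dev's actual formula; the site's figures vary slightly by quantization, so this only approximates them.

```python
def estimate_vram_gb(disk_size_gb: float, overhead_gb: float = 1.0) -> float:
    """Hypothetical rule of thumb: weights load at roughly their file size,
    plus a flat allowance for KV cache and runtime overhead."""
    return round(disk_size_gb + overhead_gb, 1)

# Disk sizes from the table above
quants = {"Q8_0": 4.5, "Q4_K_M": 2.6, "Q4_0": 2.5}
for name, disk in quants.items():
    print(f"{name}: ~{estimate_vram_gb(disk)} GB VRAM")
```

Longer contexts grow the KV cache, so the overhead term would need to scale with context length for a more faithful estimate.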

Install with Ollama

Run in terminal:

ollama pull gemma4:e4b

Requires at least 3.2 GB of VRAM (Q4_0 quantization). Install Ollama from ollama.com
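Once pulled, the model can also be queried programmatically. A minimal sketch, assuming a locally running Ollama server on its default port 11434 and using Ollama's documented `/api/generate` endpoint; the `build_generate_request` helper is hypothetical, introduced here only to show the request shape.

```python
import json

# Ollama's default local endpoint (assumes a server started with `ollama serve`)
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str, stream: bool = False) -> bytes:
    """Hypothetical helper: serialize a request body for Ollama's /api/generate."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream}).encode()

body = build_generate_request("gemma4:e4b", "Explain mixture-of-experts in one sentence.")

# To actually send it (requires the Ollama server to be running):
#   import urllib.request
#   req = urllib.request.Request(OLLAMA_URL, data=body,
#                                headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())

print(json.loads(body)["model"])  # → gemma4:e4b
```

With `stream` set to false the server returns one JSON object containing the full response; the default streaming mode returns newline-delimited JSON chunks instead.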

Benchmark Scores

MMLU: 74.2%
HumanEval: 68.5%

Scores are approximate and may vary by quantization level.

Compatible GPUs (15)

HuggingFace

google/gemma-4-e4b-it