Qwen 3.6 35B-A3B
MoE · Qwen
Alibaba's sparse Mixture-of-Experts (MoE) model with 35B total parameters, of which only 3B are active per token. A major jump on coding benchmarks over Qwen 3.5, at an inference cost closer to that of a 3B dense model. An abliterated variant is also available on HuggingFace.
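For readers unfamiliar with sparse MoE, the sketch below shows the top-k routing mechanism that makes "3B active / 35B total" possible: each token is sent through only a few experts, so active parameters (and FLOPs) stay small while total capacity stays large. The expert count and k below are illustrative; the card does not state this model's configuration.

```python
# Illustrative top-k expert routing for a single token vector.
# num_experts and k are hypothetical values, not this model's actual config.
import numpy as np

num_experts = 64   # hypothetical total expert count
k = 4              # hypothetical experts activated per token
d_model = 512      # toy hidden size

rng = np.random.default_rng(0)
router = rng.standard_normal((d_model, num_experts))                  # router weights
experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]

def moe_layer(x):
    """Route token vector x through only its top-k experts."""
    logits = x @ router
    topk = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    weights = np.exp(logits[topk] - logits[topk].max())
    weights /= weights.sum()                  # softmax over the selected experts only
    # Only k of num_experts expert matrices are touched per token, which is
    # why the active-parameter count stays near that of a small dense model.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, topk))

out = moe_layer(rng.standard_normal(d_model))
print(out.shape)  # (512,)
```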
Provider
Alibaba
Parameters
3B active / 35B total (MoE)
Context
131,072 tokens (128K)
Released
2026-04-17
VRAM Requirements by Quantization
| Method | Disk Size | VRAM Required | Compatible GPUs |
|---|---|---|---|
| Q8_0 | 36 GB | 38 GB | 1 |
| Q4_K_M | 19.5 GB | 21 GB | 7 |
| Q4_0 | 18.5 GB | 20 GB | 7 |
| Q2_K | 11.5 GB | 13 GB | 12 |
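The VRAM column roughly tracks the quantized weight size plus a small runtime margin for the KV cache, activations, and buffers. A back-of-envelope estimator is sketched below; the effective bits per weight are derived from the disk sizes above, and the flat ~1.5 GB overhead is an assumption, not a published formula.

```python
# Rough VRAM estimate: effective bits per weight times total parameter count,
# plus a flat overhead for KV cache, activations, and runtime buffers.
# EFFECTIVE_BITS values are back-calculated from the disk sizes in the table.
EFFECTIVE_BITS = {"Q8_0": 8.2, "Q4_K_M": 4.5, "Q4_0": 4.2, "Q2_K": 2.6}

def estimate_vram_gb(total_params_b: float, quant: str, overhead_gb: float = 1.5) -> float:
    weights_gb = total_params_b * EFFECTIVE_BITS[quant] / 8  # billions of params * bits -> GB
    return round(weights_gb + overhead_gb, 1)

for quant in EFFECTIVE_BITS:
    print(quant, estimate_vram_gb(35.0, quant), "GB")  # approximates the table above
```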
Install with Ollama
Run in terminal:

```
ollama pull qwen3.6:35b-a3b
```

Minimum 13 GB VRAM required (the Q2_K quantization). Install Ollama from ollama.com.
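Once the pull finishes, the model can be queried through Ollama's local REST API (default port 11434). A minimal stdlib-only sketch; the prompt is only an example:

```python
# Query the locally running Ollama server via its /api/generate endpoint.
import json
import urllib.request

payload = {
    "model": "qwen3.6:35b-a3b",   # tag from the pull command above
    "prompt": "Write a Python function that reverses a linked list.",
    "stream": False,              # return one JSON object instead of a token stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```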
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU | 83.5% |
| HumanEval | 88.2% |
Scores are approximate and may vary by quantization level.
HuggingFace
Qwen/Qwen3.6-35B-A3B-Instruct
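For the HuggingFace checkpoint, a minimal transformers loading sketch. It assumes a transformers release with native support for this architecture and enough GPU memory for the unquantized weights (roughly 70 GB in bf16 for 35B parameters); `device_map="auto"` shards across available GPUs.

```python
# Load the Instruct checkpoint with Hugging Face transformers and run one chat turn.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3.6-35B-A3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Explain mixture-of-experts in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```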