
Qwen 3.6 35B-A3B

MoE · Qwen

Alibaba's sparse mixture-of-experts model with 35B total parameters, of which only 3B are active per token. A major jump on coding benchmarks over Qwen 3.5, with inference cost closer to that of a 3B dense model. An abliterated variant is also available on HuggingFace.
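The cost claim above follows directly from the parameter counts on this card: all 35B parameters must sit in memory, but per-token compute scales with the 3B that are actually routed. A quick sketch of the ratio:

```python
# Sparse MoE trade-off, using the parameter counts stated on this card.
# Memory footprint scales with total params; per-token FLOPs scale with
# active params only.
TOTAL_PARAMS_B = 35.0   # all experts loaded in memory
ACTIVE_PARAMS_B = 3.0   # params activated per token

compute_ratio = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Per-token compute vs a dense 35B model: {compute_ratio:.0%}")  # → 9%
```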

Provider

Alibaba

Parameters

3B active / 35B total (MoE)

Context

131,072 tokens (128K)

Released

2026-04-17

VRAM Requirements by Quantization

Method    Disk Size    VRAM Required    Fits GPUs
Q8_0      36 GB        38 GB            1 GPU
Q4_K_M    19.5 GB      21 GB            7 GPUs
Q4_0      18.5 GB      20 GB            7 GPUs
Q2_K      11.5 GB      13 GB            12 GPUs
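The table's VRAM figures are roughly the quantized weight size plus a fixed runtime overhead for KV cache and activations. A minimal estimator along those lines (the helper name, the ~1.5 GB overhead constant, and treating bits-per-weight as nominal are assumptions for illustration; real quant formats carry extra per-block metadata, so effective bits run slightly higher):

```python
# Rough VRAM estimate for a quantized model: weight bytes plus a flat
# runtime overhead (KV cache, activations). Hypothetical helper; the
# 1.5 GB overhead default is an assumption, not a measured value.
def estimate_vram_gb(total_params_b: float, bits_per_weight: float,
                     overhead_gb: float = 1.5) -> float:
    weights_gb = total_params_b * bits_per_weight / 8  # params(B) * bits -> GB
    return round(weights_gb + overhead_gb, 1)

# 35B params at a nominal 8 bits/weight lands near the Q8_0 row above.
print(estimate_vram_gb(35, 8))  # → 36.5
```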

Install with Ollama

Run in terminal:

ollama pull qwen3.6:35b-a3b

Minimum 13 GB VRAM required (Q2_K). Install Ollama from ollama.com.
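Once the model is pulled, Ollama serves it over a local HTTP API. A minimal sketch of a non-streaming request against Ollama's default `/api/generate` endpoint (the prompt text is an arbitrary example):

```python
import json
from urllib import request

# Non-streaming generate request to the local Ollama server.
# Model tag matches the pull command above; port 11434 is Ollama's default.
payload = json.dumps({
    "model": "qwen3.6:35b-a3b",
    "prompt": "Write a Python function that reverses a string.",
    "stream": False,
}).encode()

req = request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
# With Ollama running, uncomment to send the request:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```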

Benchmark Scores

MMLU: 83.5%
HumanEval: 88.2%

Scores are approximate and may vary by quantization level.

Compatible GPUs (12)

HuggingFace

Qwen/Qwen3.6-35B-A3B-Instruct
