Ternary Bonsai 8B
Apache 2.0
1.58-bit ternary quantization: every weight takes one of three values, {-1, 0, +1}. The memory footprint is roughly 1/9 that of FP16 at the same parameter count. An MLX 2-bit packed format is available today; other backends are coming soon.
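As a rough sketch of how ternary quantization can work: the snippet below uses the absmean rounding recipe popularized by BitNet b1.58 (scale by the mean absolute weight, round, clip to {-1, 0, +1}). This is an illustration of the general technique, not necessarily the exact scheme this checkpoint uses.

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-6):
    """Quantize a weight tensor to {-1, 0, +1} with a per-tensor scale.

    Absmean scheme (illustrative, not this model's confirmed recipe):
    scale by mean(|w|), round to the nearest integer, clip to [-1, 1].
    Dequantize as q * scale.
    """
    scale = float(np.mean(np.abs(w))) + eps      # per-tensor scaling factor
    q = np.clip(np.round(w / scale), -1, 1)      # ternary codes in {-1, 0, +1}
    return q.astype(np.int8), scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = ternary_quantize(w)
```

Each code needs only log2(3) ≈ 1.58 bits of information, which is where the "1.58-bit" name comes from; in practice codes are packed 2 to 4 per byte.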
Provider: PrismML
Parameters: 8B
Context: 32,768 tokens (32K)
Released: 2026-04-17
VRAM Requirements by Quantization
| Method | Disk Size | VRAM Required | Fits GPUs |
|---|---|---|---|
| 1.58-bit (MLX) | 1.6 GB | 2 GB | 15 GPUs |
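The disk figure in the table follows directly from bits-per-weight arithmetic. A small helper makes the calculation explicit (the helper name is ours; actual VRAM needs add overhead for the KV cache and activations, which is why the VRAM column is higher than the disk size):

```python
def packed_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GB of a model's weights when packed
    at a given number of bits per weight (ignores metadata/scales)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB

# 8B parameters at 1.58 bits/weight matches the ~1.6 GB in the table;
# a plain 2-bit packing would come to 2.0 GB.
print(round(packed_size_gb(8, 1.58), 2))  # → 1.58
print(round(packed_size_gb(8, 2.0), 2))   # → 2.0
```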
Benchmark Scores
MMLU: 68%
HumanEval: 58.5%
Scores are approximate and may vary by quantization level.
Compatible GPUs (15)
AMD RX 7900 GRE (16GB), AMD RX 7900 XTX (24GB), Apple M4 Pro (24GB), Apple M3 Max (36GB), Apple M4 Max (48GB), NVIDIA RTX 4060 (8GB), NVIDIA RTX 4070 SUPER (12GB), NVIDIA RTX 3080 12GB (12GB), NVIDIA RTX 4080 SUPER (16GB), NVIDIA RTX 4060 Ti 16GB (16GB), NVIDIA RTX 4070 Ti SUPER (16GB), NVIDIA RTX 5080 (16GB), NVIDIA RTX 3090 (24GB), NVIDIA RTX 4090 (24GB), NVIDIA RTX 5090 (32GB)
HuggingFace
PrismML/ternary-bonsai-8b