Ternary Bonsai 8B
Apache 2.0
1.58-bit ternary quantization: every weight takes one of three values, {-1, 0, +1}. The memory footprint is roughly 1/9 that of FP16 at the same parameter count. An MLX 2-bit packed format is available today; other backends are coming soon.
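As a rough sketch of how ternary quantization can work: the snippet below uses the absmean rounding recipe popularized by BitNet b1.58 (scale by the mean absolute weight, round, clip to {-1, 0, +1}). This is an illustration of the general technique, not necessarily the exact scheme this checkpoint uses.

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-6):
    """Quantize a weight tensor to {-1, 0, +1} with a per-tensor scale.

    Absmean scheme (illustrative, not this model's confirmed recipe):
    scale by mean(|w|), round to the nearest integer, clip to [-1, 1].
    Dequantize as q * scale.
    """
    scale = float(np.mean(np.abs(w))) + eps      # per-tensor scaling factor
    q = np.clip(np.round(w / scale), -1, 1)      # ternary codes in {-1, 0, +1}
    return q.astype(np.int8), scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = ternary_quantize(w)
```

Each code needs only log2(3) ≈ 1.58 bits of information, which is where the "1.58-bit" name comes from; in practice codes are packed 2 to 4 per byte.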
Provider: PrismML
Parameters: 8B
Context: 32,768 tokens (32K)
Released: 2026-04-17
VRAM Requirements by Quantization
| Method | Disk Size | VRAM Required | Fits GPUs |
|---|---|---|---|
| 1.58-bit (MLX) | 1.6 GB | 2 GB | 15 GPUs |
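The disk figure in the table follows directly from bits-per-weight arithmetic. A small helper makes the calculation explicit (the helper name is ours; actual VRAM needs add overhead for the KV cache and activations, which is why the VRAM column is higher than the disk size):

```python
def packed_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GB of a model's weights when packed
    at a given number of bits per weight (ignores metadata/scales)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB

# 8B parameters at 1.58 bits/weight matches the ~1.6 GB in the table;
# a plain 2-bit packing would come to 2.0 GB.
print(round(packed_size_gb(8, 1.58), 2))  # → 1.58
print(round(packed_size_gb(8, 2.0), 2))   # → 2.0
```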
Benchmark Scores
MMLU: 68%
HumanEval: 58.5%
Scores are approximate and may vary by quantization level.
Compatible GPUs (15)
AMD RX 7900 GRE (16GB), AMD RX 7900 XTX (24GB), Apple M4 Pro (24GB), Apple M3 Max (36GB), Apple M4 Max (48GB), NVIDIA RTX 4060 (8GB), NVIDIA RTX 4070 SUPER (12GB), NVIDIA RTX 3080 12GB (12GB), NVIDIA RTX 4080 SUPER (16GB), NVIDIA RTX 4060 Ti 16GB (16GB), NVIDIA RTX 4070 Ti SUPER (16GB), NVIDIA RTX 5080 (16GB), NVIDIA RTX 3090 (24GB), NVIDIA RTX 4090 (24GB), NVIDIA RTX 5090 (32GB)
HuggingFace
PrismML/ternary-bonsai-8b