⚡ Weekly · Free · No spam

runlocal weekly

Every week: new model releases, benchmark comparisons, hardware tips, and Ollama guides. For developers who run AI locally.

Unsubscribe any time. No ads. Just signal.

🚀

New releases

Qwen 3.6, Gemma 4, GLM-5

📊

Benchmarks

88% accuracy at 17 GB VRAM

🔧

Setup guides

Claude Code + Ollama, Pi 5 RAG

Latest issues

Issue #12 · May 27, 2026

A FLUX-class 4B image model, squeezed to 1.21 GB — and yes, it runs in your browser

PrismML's Bonsai Image quantizes a FLUX.2-derived diffusion transformer down to sub-2-bit weights, with MLX, CUDA, WebGPU and iPhone builds. The footprint and speed are real; the quality cost is real too. Here's the honest tradeoff — official numbers, our own benchmarks pending.

Issue #11 · May 21, 2026

Claude shipped 'Agent Skills'. r/LocalLLaMA already converged on the canonical 4 — copy these into your local agent today.

Plan-first, test-first, refactor-with-constraint, debug-loop. Four skill files that should ship with every local-LLM coding agent — sourced from the threads that actually got Qwen3.6 27B to daily-driver quality.

Issue #10 · May 21, 2026

Stop trying to use Cline locally. r/LocalLLaMA's real answer for daily-driving Qwen3.6 27B + MTP.

Cloud agents fall apart on local models. Three scaffold-first tools the community is actually shipping with — SmallCode, PI Coding Agent, and little-coder — plus a decision matrix by VRAM.