runlocal weekly
Every week: new model releases, benchmark comparisons, hardware tips, and Ollama guides. For developers who run AI locally.
Unsubscribe any time. No ads. Just signal.
New releases: Qwen3.6, Gemma 4, GLM-5
Benchmarks: 88% accuracy at 17 GB VRAM
Setup guides: Claude Code + Ollama, Pi 5 RAG
Latest issues
A FLUX-class 4B image model, squeezed to 1.21 GB, and yes, it runs in your browser
PrismML's Bonsai Image quantizes a FLUX.2-derived diffusion transformer down to sub-2-bit weights, with MLX, CUDA, WebGPU and iPhone builds. The footprint and speed are real; the quality cost is real too. Here's the honest tradeoff — official numbers, our own benchmarks pending.
Claude shipped 'Agent Skills'. r/LocalLLaMA already converged on the canonical 4 — copy these into your local agent today.
Plan-first, test-first, refactor-with-constraint, debug-loop. Four skill files that should ship with every local-LLM coding agent — sourced from the threads that actually got Qwen3.6 27B to daily-driver quality.
Stop trying to use Cline locally. r/LocalLLaMA's real answer for daily-driving Qwen3.6 27B + MTP.
Cloud agents fall apart on local models. Three scaffold-first tools the community is actually shipping with — SmallCode, PI Coding Agent, and little-coder — plus a decision matrix by VRAM.