Issue #7 · May 20, 2026

OpenClaw + Ollama: the complete setup after Anthropic's third-party clampdown

Search interest in `openclaw ollama` is up 507,000% in 30 days. Here's the full migration path off Claude Pro — install, model pick, common errors, and what you actually give up.

If you've been on Claude Pro and watched your OpenClaw sessions stall this month, you're in the middle of the largest forced migration the local-AI ecosystem has seen. Google Trends puts `openclaw ollama` up 507,000% in 30 days. The driver isn't hype; it's Anthropic's April 2026 enforcement against unsanctioned third-party Claude Pro/Max wrappers. OpenClaw and OpenCode users started seeing 429s and capability restrictions unless they upgraded to direct API billing.

OpenClaw was already Ollama-friendly by design. The clampdown just made local the default escape hatch. This guide gets you there.

What you need

Requirement	Minimum	Recommended
OS	macOS 13+ / Ubuntu 22.04+ / Windows 11 (WSL2)	macOS 14+ Apple Silicon, or Linux + NVIDIA
RAM	16 GB	32 GB
GPU VRAM	8 GB (small models)	17 GB+ (27B-class coder models)
Disk	20 GB free	80 GB free (multiple models)
Node.js	20.x	22.x LTS

Apple Silicon with ≥24 GB unified memory handles Qwen 3.6 27B at Q4 comfortably. Intel Macs: skip local, use Gemini CLI free tier instead.
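The Node.js row is the easiest requirement to check by script. A minimal sketch, assuming you feed it the text printed by `node --version`; the function names are illustrative and not part of OpenClaw:

```python
import re

MIN_NODE_MAJOR = 20          # "Minimum" column in the table above
RECOMMENDED_NODE_MAJOR = 22  # "Recommended" column in the table above


def node_major(version_output: str) -> int:
    """Extract the major version from output such as 'v22.11.0'."""
    match = re.match(r"\s*v?(\d+)\.", version_output)
    if match is None:
        raise ValueError(f"unrecognized Node version string: {version_output!r}")
    return int(match.group(1))


def node_status(version_output: str) -> str:
    """Classify a Node version against the requirements table."""
    major = node_major(version_output)
    if major < MIN_NODE_MAJOR:
        return "too old"
    if major < RECOMMENDED_NODE_MAJOR:
        return "ok"
    return "recommended"
```

To use it, capture the version with `subprocess.run(["node", "--version"], capture_output=True, text=True).stdout` and pass the result to `node_status`.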

Step 1 — Install Ollama

curl -fsSL https://ollama.com/install.sh | sh

Windows users: install Ollama natively from ollama.com/download, not inside WSL. Reaching it through WSL port forwarding works, but adds enough latency to ruin the agent loop on smaller models.

Verify and start:

ollama --version
ollama serve &   # local API on http://localhost:11434

Security note. Ollama's default bind is 127.0.0.1. Do not expose 0.0.0.0:11434 to the internet. The Bleeding Llama disclosure in May 2026 confirmed unauthenticated memory disclosure on older versions — upgrade to the latest patched release before exposing anything.
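Ollama reads its bind address from the `OLLAMA_HOST` environment variable, so one quick sanity check is to confirm that value still points at loopback. The sketch below is a hedged helper, not an official tool; `is_loopback_bind` is a name chosen here for illustration, and it only inspects the string, not the running server:

```python
import ipaddress


def is_loopback_bind(value: str) -> bool:
    """Return True if a host[:port] string refers only to the loopback interface."""
    host = value.strip()
    # Drop an optional scheme such as "http://".
    if "://" in host:
        host = host.split("://", 1)[1]
    # Drop any trailing path.
    host = host.split("/", 1)[0]
    if host.startswith("["):
        # Bracketed IPv6 with a port, e.g. "[::1]:11434".
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # IPv4 or hostname with a port, e.g. "127.0.0.1:11434".
        host = host.rsplit(":", 1)[0]
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Unknown hostnames are treated as not provably local.
        return False
```

A typical call would be `is_loopback_bind(os.environ.get("OLLAMA_HOST", "127.0.0.1:11434"))`, with the fallback mirroring the default bind described above.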

Step 2 — Pull a coding model

For first-time setup, pick one:

Model	Pull command	VRAM (Q4)	Best for
Qwen 3.5 9B	`ollama pull qwen3.5:9b`	6.5 GB	Entry tier, 16 GB MacBooks
Gemma 4 27B	`ollama pull gemma4:27b`	17 GB	Balanced quality, 4B active MoE = fast
Qwen 3.6 27B	`ollama pull qwen3.6:27b`	17 GB	Best coding pick — SWE-bench Verified 77.2, multimodal

Qwen 3.6 27B is the default recommendation. It posts HumanEval 88.5 and SWE-bench Verified 77.2, beating Qwen 3.5 72B on HumanEval (82.7) while fitting in 17 GB at Q4. It is also explicitly designed to be compatible with Claude Code / Qwen Code tooling, which means OpenClaw's tool-use loop works out of the box. We compare these models in detail in issue #8.
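The VRAM column can be approximated with back-of-envelope arithmetic. The sketch below is a rule of thumb, not a measurement: the 4.5 effective bits per weight for Q4 and the flat 1.5 GB runtime overhead are assumptions, and the KV cache grows with context length on top of this.

```python
def estimate_vram_gb(
    params_billion: float,
    bits_per_weight: float = 4.5,  # assumed effective size of a Q4 weight
    overhead_gb: float = 1.5,      # assumed runtime and buffer overhead
) -> float:
    """Rough memory footprint of a quantized model, in decimal GB."""
    # 1e9 parameters * bits per weight / 8 bits per byte = GB of weights.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


if __name__ == "__main__":
    for label, size in [("9B", 9), ("27B", 27)]:
        print(f"{label}: ~{estimate_vram_gb(size):.1f} GB")
```

Under these assumptions the 9B and 27B rows come out at roughly 6.6 GB and 16.7 GB, close to the table's figures. For MoE models the total parameter count is what matters here, since all experts generally have to be resident even if only a few are active per token.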

Smoke test:

ollama run qwen3.6:27b "write a python function that reverses a linked list iteratively"
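The same smoke test can be run over Ollama's local HTTP API, which is the path an agent takes rather than the interactive CLI. A minimal sketch using only the standard library; it assumes the default endpoint from Step 1 and the non-streaming form of `/api/generate`, and the 300-second timeout is an arbitrary choice:

```python
import json
import urllib.request


def build_generate_request(
    prompt: str,
    model: str = "qwen3.6:27b",
    base_url: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_smoke_test(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["response"]
```

Calling `run_smoke_test("write a python function that reverses a linked list iteratively")` should return the completion as a single string once the model has loaded.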

If you can't fit 17 GB of VRAM, see our GPU directory for what each card actually runs.

Step 3 — Install OpenClaw

Check the official OpenClaw repo for the current install command — the project has been renamed twice in the past year. As of 2026-05 the global install is:

pnpm add -g openclaw
# or: npm install -g openclaw

Verify the install:

openclaw --version
openclaw doctor

openclaw doctor checks Node version, Ollama reachability, and write permissions on ~/.openclaw/skills/. Fix anything red before continuing — Skills failing silently later is the most common "OpenClaw doesn't work" report.

Step 4 — Point OpenClaw at Ollama

~/.openclaw/config.json:

{
  "provider": "ollama",
  "endpoint": "http://localhost:11434",
  "model": "qwen3.6:27b",
  "context_window": 131072,
  "temperature": 0.2,
  "skills_dir": "~/.openclaw/skills"
}

Or via CLI:

openclaw config set provider ollama
openclaw config set model qwen3.6:27b
openclaw config set endpoint http://localhost:11434
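Before starting a session it can help to sanity-check the config file by hand. The sketch below checks only the six keys shown in the JSON above; it is an illustrative helper, not OpenClaw's own validation, and the type expectations are inferred from that example.

```python
import json
from pathlib import Path

# Key names and value types taken from the example config above.
EXPECTED_TYPES = {
    "provider": str,
    "endpoint": str,
    "model": str,
    "context_window": int,
    "temperature": (int, float),
    "skills_dir": str,
}


def config_problems(cfg: dict) -> list[str]:
    """Return human-readable problems; an empty list means the config looks sane."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in cfg:
            problems.append(f"missing key: {key}")
        elif isinstance(cfg[key], bool) or not isinstance(cfg[key], expected):
            # bool is excluded explicitly because it is a subclass of int.
            problems.append(f"wrong type for {key}: {type(cfg[key]).__name__}")
    if "provider" in cfg and cfg["provider"] != "ollama":
        problems.append(f"provider is {cfg['provider']!r}, expected 'ollama'")
    endpoint = cfg.get("endpoint")
    if isinstance(endpoint, str) and not endpoint.startswith(("http://", "https://")):
        problems.append("endpoint should start with http:// or https://")
    return problems


def load_config(path: str = "~/.openclaw/config.json") -> dict:
    """Read and parse the config file from the location used in this guide."""
    return json.loads(Path(path).expanduser().read_text(encoding="utf-8"))
```

Running `config_problems(load_config())` and printing the result gives a quick list of anything to fix before launching `openclaw`.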

Start a session:

openclaw

The banner should show provider: ollama and the model name. If it still says provider: anthropic, jump to "Common errors" below.

Step 5 — Install essential Skills

OpenClaw's value comes from its 800+ Skills registry. The minimum useful set for local-model setups:

openclaw skill install code-review
openclaw skill install git-helper
openclaw skill install file-explorer
openclaw skill install web-search       # uses local SearXNG by default
openclaw skill install benchmark        # measures local model latency

Browse the full registry with openclaw skill list --remote. Skill quality varies — code-review, git-helper, and file-explorer are the three that work reliably with any backend model.

Step 6 — First real task

cd ~/your-project
openclaw
> /file-explorer load src/
> refactor the authentication middleware to support OAuth2 PKCE flow

Expect 2–4× slower wall-clock than Claude Sonnet on the same task. In exchange you get no rate limits, no monthly cap, no token bill, and code that never leaves the machine.

Common errors

`Error: connection refused at 127.0.0.1:11434`

Ollama isn't running. Start it with ollama serve. Check with lsof -i :11434.
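The same check can be scripted by probing the server's `/api/version` endpoint. A minimal sketch, assuming the default address from Step 1; `ollama_version` is a helper name chosen here, and the 2-second timeout is arbitrary:

```python
import http.client
import json
import urllib.request


def ollama_version(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the server's version string, or None if nothing usable answers."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, http.client.HTTPException, ValueError):
        # Covers refused connections, timeouts, HTTP errors, and malformed URLs or JSON.
        return None
    return payload.get("version") if isinstance(payload, dict) else None


if __name__ == "__main__":
    version = ollama_version()
    print(f"Ollama {version} is up" if version else "Ollama is not reachable")
```

A `None` result corresponds to the connection-refused case above; a version string means the server is listening and the problem lies elsewhere.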

`Model not found: qwen3.6:27b`

The model hasn't been pulled yet. Run ollama pull qwen3.6:27b.
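To confirm what is actually installed, Ollama's `/api/tags` endpoint lists local models as JSON. The sketch below works on an already-parsed response; the sample payload is illustrative and trimmed to the `name` field, and the helper names are assumptions made for this example.

```python
def installed_models(tags_payload: dict) -> set[str]:
    """Collect model names from a parsed /api/tags response."""
    return {m["name"] for m in tags_payload.get("models", []) if "name" in m}


def has_model(tags_payload: dict, wanted: str) -> bool:
    """Check whether a model tag is present locally."""
    # Ollama resolves a bare name such as "qwen3.6" to the ":latest" tag.
    if ":" not in wanted:
        wanted += ":latest"
    return wanted in installed_models(tags_payload)


if __name__ == "__main__":
    # Illustrative payload; real responses carry more fields per model.
    sample = {"models": [{"name": "qwen3.5:9b"}, {"name": "gemma4:27b"}]}
    print(has_model(sample, "qwen3.6:27b"))
```

With the sample payload the check reports that `qwen3.6:27b` is missing, which is the situation that produces the error above.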

`Context length exceeded` mid-session

Even a 131k window fills fast on big repos. Either switch to a model whose effective window has been extended with YaRN, or use OpenClaw's /compact Skill to summarize the running session.
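A rough budget check before loading a directory can show whether it has any chance of fitting. The sketch below assumes about 4 characters per token, a common heuristic that varies by tokenizer and language, and reserves an arbitrary 16,384 tokens for the conversation itself; treat the result as an estimate only.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # heuristic assumption; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Approximate token count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def repo_token_estimate(root: str, suffixes=(".py", ".ts", ".js")) -> int:
    """Sum the estimated tokens of source files under a directory."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += estimate_tokens(path.read_text(encoding="utf-8", errors="ignore"))
    return total


def fits_in_window(tokens: int, context_window: int = 131072, reserve: int = 16384) -> bool:
    """True if the estimate leaves the reserved headroom free."""
    return tokens <= context_window - reserve
```

For example, `fits_in_window(repo_token_estimate("src/"))` returning False suggests loading a narrower subdirectory or compacting earlier in the session.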

OpenClaw still talking to Claude after the switch

Check ~/.openclaw/config.json for a stale anthropic block from a previous openclaw login. Run openclaw logout to clear cached credentials, then re-run the Step 4 config.
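Since the exact shape of a stale block isn't documented here, a shape-agnostic search is the safest way to find it. The sketch below walks any parsed JSON value and reports every key or string value that mentions "anthropic"; the sample config in the demo is hypothetical.

```python
def find_mentions(node, needle: str = "anthropic", path: str = "$") -> list[str]:
    """Return JSON-style paths whose key or string value contains the needle."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if needle in str(key).lower():
                hits.append(child)
            hits.extend(find_mentions(value, needle, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_mentions(value, needle, f"{path}[{index}]"))
    elif isinstance(node, str) and needle in node.lower():
        hits.append(path)
    # Remove duplicates (key and value both matching) while keeping order.
    return list(dict.fromkeys(hits))


if __name__ == "__main__":
    # Hypothetical config with a leftover block from an earlier login.
    stale = {"provider": "ollama", "anthropic": {"token": "redacted"}}
    print(find_mentions(stale))
```

An empty list means nothing in the file references Anthropic; any reported path is a candidate for removal before re-running the Step 4 config.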

Tool calls hang or return empty

Smaller models (≤9B) struggle with multi-step tool use. Either move up to Qwen 3.6 27B, or limit parallelism: openclaw config set max_parallel_tools 1.

What you actually give up

Be honest with yourself before committing:

You lose	You gain
Claude Sonnet-level reasoning on complex refactors	Zero per-token cost
200k cloud context	Sovereignty over your codebase
Best-in-class tool-use reliability	No 429s, no monthly cap
Multimodal image input (on most local models)	Works offline

The pragmatic move: OpenClaw + Ollama for daily edits and refactors, fall back to Claude Code (direct API billing) twice a week for architectural sessions and tricky concurrency bugs. The cost ratio is roughly 1:20 in your favor.

Where to go next

Full head-to-head of the coding agents you could run on top of Ollama — Claude Code, Cursor, OpenCode, OpenClaw, Gemini CLI, Cluely, z.ai — in issue #8.
The May 2026 runtime breakthroughs (MTP, DFlash, PAGED MoE) that just made local agentic coding viable: issue #6.
Pick the right card for the model you want: runlocal GPU directory.
