The interesting thing about DeepClaude isn’t that it’s cheaper. It’s that it proves Claude Code was never really Anthropic’s product – it was a UI shell with swappable guts, and somebody finally swapped them in public.
This landed in late April 2026, hit #1 on Hacker News, and the entire premise is a bash and PowerShell script. DeepClaude runs Claude Code’s agent loop with DeepSeek V4 Pro as the backend. Same terminal, same tool calls, same autonomous loops – different brain, a fraction of the bill. Here’s how to set it up, what quietly breaks, and which numbers in the hype actually hold up.
The reader scenario: you’re already burning through Claude Code credits
You’ve got Claude Code installed. You’re either on the $200/month Max 20x plan hitting weekly caps, or you’re on pay-as-you-go and watching $15/M output tokens (as of mid-2026 – verify current rates) evaporate during a single refactor. The agent loop is what makes Claude Code good – one instruction, thirty steps, and that cycle is what burns through tokens.
You don’t want a different tool. You want the same loop on a cheaper model. That’s exactly the gap DeepClaude fills.
What DeepClaude actually is (and isn’t)
It is not a fork of Claude Code, and it is not a new agent framework. It's a short bash (and PowerShell) script. The mechanism: Claude Code reads a handful of environment variables to decide where to send API calls. DeepClaude rewrites those for the duration of a session, points them at a cheaper backend, then restores the originals when you exit. The command-line tool itself never changes.
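The save/override/restore cycle fits in a dozen lines of POSIX shell. This is an illustrative reconstruction of the pattern, not the actual aattaran/deepclaude script:

```shell
#!/bin/sh
# Illustrative sketch of the wrapper pattern, not the real deepclaude script.
# Save whatever Anthropic routing the shell currently has...
saved_base="$ANTHROPIC_BASE_URL"
saved_token="$ANTHROPIC_AUTH_TOKEN"

restore() {
  # ...and put it back when the session ends, however it ends.
  export ANTHROPIC_BASE_URL="$saved_base"
  export ANTHROPIC_AUTH_TOKEN="$saved_token"
}
trap restore EXIT INT TERM

# Point Claude Code at DeepSeek's Anthropic-compatible endpoint for this run.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="$DEEPSEEK_API_KEY"
export ANTHROPIC_MODEL="deepseek-v4-pro"

claude "$@"   # same CLI, different brain
```

Because `claude` runs as a child process, the override only ever exists inside this script's process tree; other terminals keep routing to Anthropic.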
This is possible because on March 31, Anthropic accidentally shipped Claude Code's full source map to npm, exposing 512,000 lines of TypeScript. That leak triggered a wave of clones and tools that hook into Claude Code's now well-documented internals. DeepClaude is the natural consequence: not a hack, but a polite use of a backend that turned out to be swappable by design.
The default target is DeepSeek V4 Pro – a 1.6-trillion parameter Mixture-of-Experts model from DeepSeek, released on April 24, 2026 under the MIT license, with a context length of one million tokens.
Setup: under five minutes if you already have Claude Code
Two paths. Path A is the DeepClaude script. Path B is the official DeepSeek integration – just env vars, no wrapper. Both work. Path B is what the wrapper does under the hood.
Path A – the DeepClaude wrapper (recommended for switching backends)
- Install Claude Code if you haven't (`npm install -g @anthropic-ai/claude-code`).
- Sign up at platform.deepseek.com, add $5 credit, copy your API key.
- Clone aattaran/deepclaude and put the script on your PATH.
- Export `DEEPSEEK_API_KEY`.
- Run `deepclaude`. Done.
Path B – the official DeepSeek env-var route
Per DeepSeek’s own integration docs, you can skip the wrapper entirely:
export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
export ANTHROPIC_AUTH_TOKEN=<your DeepSeek API Key>
export ANTHROPIC_MODEL=deepseek-v4-pro
export ANTHROPIC_DEFAULT_OPUS_MODEL=deepseek-v4-pro
export ANTHROPIC_DEFAULT_SONNET_MODEL=deepseek-v4-pro
export ANTHROPIC_DEFAULT_HAIKU_MODEL=deepseek-v4-flash
export CLAUDE_CODE_SUBAGENT_MODEL=deepseek-v4-flash
export CLAUDE_CODE_EFFORT_LEVEL=max
claude
Notice the last few exports before `claude`. The official config sets CLAUDE_CODE_SUBAGENT_MODEL (and the Haiku slot) to deepseek-v4-flash – the smaller Flash sibling, not Pro. Most write-ups paste this block without flagging it. Your subagents are not running V4 Pro; if you spawn a subagent expecting the full reasoning model, you're getting Flash. Override it explicitly if that matters for your task.
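If your subagents need the full reasoning model, the override is one line – the variable name comes from the official block above; pinning it to Pro is the only change:

```shell
# Repoint subagents from the Flash default to V4 Pro
# (costs more per subagent call, buys back reasoning depth).
export CLAUDE_CODE_SUBAGENT_MODEL=deepseek-v4-pro
claude
```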
Why the “17x cheaper” number is technically true and practically wrong
The headline math: DeepSeek V4 Pro through OpenRouter currently costs $0.435 per million input tokens and $0.87 per million output – a promotional rate that expires May 31, 2026, after which it doubles. Divide Anthropic's ~$15/M output rate by that $0.87 and you get the 17x. Every article stops there.
Agent loops are not output-heavy. They’re read-heavy. The model re-reads the system prompt, the file context, and the prior tool outputs on every single step. Thirty steps means roughly thirty re-reads of the same context.
The real saving: DeepSeek's automatic context caching prices cached input at $0.004/M versus $0.435/M uncached. That's roughly a 110x reduction on the tokens that dominate an agent loop – not 17x. The headline number undersells the architecture, and it's the line item tutorials skip.
| Cost lane | Anthropic Claude Sonnet/Opus tier | DeepSeek V4 Pro |
|---|---|---|
| Output tokens | ~$15/M (as of mid-2026) | $0.87/M (promo until May 31, 2026) |
| Input tokens (uncached) | ~$3/M | $0.435/M |
| Input tokens (cached, repeated agent context) | discounted (prompt caching) | $0.004/M |
If you’re doing long autonomous loops on the same codebase, the cached-input rate is the line item that ends up dominating your bill.
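A back-of-envelope run shows how the per-token rates translate into a real loop. The workload numbers here are assumptions for illustration (30 steps, 80k context tokens re-read per step, 2k output tokens per step), not figures from the DeepClaude README:

```shell
awk 'BEGIN {
  steps = 30; ctx = 0.08; out = 0.002   # assumed tokens per step, in millions

  # Anthropic lane: ~$3/M input, ~$15/M output, no caching applied
  anthropic = steps * (ctx * 3 + out * 15)

  # DeepSeek lane, ignoring the cache entirely
  ds_uncached = steps * (ctx * 0.435 + out * 0.87)

  # DeepSeek lane with context caching: first read at $0.435/M,
  # every repeat read of the same context at $0.004/M
  ds_cached = ctx * 0.435 + (steps - 1) * ctx * 0.004 + steps * out * 0.87

  printf "anthropic:   $%.2f\n", anthropic
  printf "ds uncached: $%.2f (%.0fx cheaper)\n", ds_uncached, anthropic / ds_uncached
  printf "ds cached:   $%.2f (%.0fx cheaper)\n", ds_cached,   anthropic / ds_cached
}'
```

On this assumed workload the realized saving lands around 84x – between the 17x headline and the 110x cache ratio, and dominated by the cached-input line.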
Mid-session switching and the proxy trick
The proxy runs on localhost:3200 and intercepts all API calls. A control endpoint (/_proxy/mode) lets you switch the active backend without restarting the session.
Run V4 Pro for the bulk of a refactor. Hit a genuinely hard reasoning step. Flip to Anthropic Opus for that one prompt, then flip back – from inside the Claude Code session itself.
Pro tip: define a slash command in your `.claude/config` that fires `curl -sX POST http://127.0.0.1:3200/_proxy/mode -d "backend=anthropic"` when you type `/opus`. Cheap V4 Pro by default, one keystroke to escalate. The README hints at this but doesn't spell it out.
The “flip back to Opus for hard tasks” reflex may be more habit than necessity. V4 Pro scores 96.4% on LiveCodeBench (per the DeepClaude README) and matched GPT-5.2 within 3% on FoodTruck Bench – a 30-day agentic business simulation – at approximately 17x lower API cost, per Startup Fortune’s analysis. Test before you assume Opus is needed.
What breaks quietly
Things the README admits, things it doesn’t, and things you’ll only notice after a few sessions:
- No vision. Image input doesn’t work. DeepSeek’s Anthropic-compatible endpoint doesn’t support vision. If your workflow involves screenshots, this is a hard stop.
- Parallel tool calls disabled. Tasks that Claude would have parallelized (read 5 files at once) now run serially. Token cost drops, wall-clock time goes up. You’ll feel it on long refactors.
- No MCP pass-through. If you've built your workflow around MCP server integrations, half your stack is gone.
- Hallucination on unknowns. Per DeepInfra’s overview, V4 Pro has a 94% hallucination rate on the AA-Omniscience benchmark – when it doesn’t know an answer, it almost always responds anyway rather than abstaining. In an autonomous loop making 30+ decisions, that compounds. Watch for confident-sounding tool calls against APIs that don’t exist.
- Privacy. DeepSeek's API doesn't let you opt out of training data collection. Route through OpenRouter (`deepclaude --backend or`) with data collection denied, or self-host the open weights, before touching confidential code.
- Promotional pricing has an expiry. The OpenRouter rate doubles after May 31, 2026 – verify before relying on this; the date may shift.
On truly hard reasoning tasks, the project itself concedes Opus still wins. Worth knowing before you commit an entire codebase to autonomous V4 Pro refactoring.
FAQ
Is DeepClaude safe to run on a production codebase?
Not by default. DeepSeek trains on your inputs unless you route through OpenRouter with data collection denied – or self-host the weights. Never paste keys, secrets, or proprietary logic into the direct DeepSeek API.
What happens if I already have ANTHROPIC_API_KEY set globally?
The wrapper overrides your env vars only for the duration of the deepclaude session, then restores the originals on exit. A separate claude terminal launched mid-session still routes to Anthropic – the override is per-process, not system-wide. That said, verify this with deepclaude --status before you trust it on a sensitive repo. The per-process scoping is the expected behavior based on how the script works, but edge cases exist if your shell sources the same profile in subshells.
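The per-process claim is ordinary shell semantics, easy to verify without deepclaude at all – an override exported inside a subshell never leaks back to the parent:

```shell
export ANTHROPIC_BASE_URL="https://api.anthropic.com"

# Child process (subshell) overrides the variable for itself only.
( export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
  echo "inside:  $ANTHROPIC_BASE_URL" )

# The parent shell never saw the override.
echo "outside: $ANTHROPIC_BASE_URL"
```

This prints the DeepSeek URL inside the parentheses and the original Anthropic URL outside – the same scoping the wrapper relies on.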
Should I use V4 Pro or V4 Flash for subagents?
Flash. Subagents do narrow, well-defined work – search a codebase, extract a type signature, summarize a file. Flash handles that fine. The orchestrator is where the reasoning that decides what to do next actually lives – keep Pro there. One counterintuitive case: if your subagents are making judgment calls on ambiguous code (e.g., “is this function safe to delete?”), Flash’s weaker reasoning shows up fast. In that scenario, override CLAUDE_CODE_SUBAGENT_MODEL to Pro and accept the extra cost on those specific tasks.
Next step: grab a non-confidential side project, set up Path B (the env-var route, four lines), run one full agent loop on it, and check the cached vs uncached input ratio in your DeepSeek dashboard. That single number will tell you whether DeepClaude is worth wiring into your daily workflow.
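The ratio check itself is one line once you've read the two token counts off the dashboard. The variable names and example counts below are placeholders for illustration, not DeepSeek API fields:

```shell
cached_tokens=2400000     # placeholder: cached input tokens from your dashboard
uncached_tokens=90000     # placeholder: uncached input tokens

awk -v c="$cached_tokens" -v u="$uncached_tokens" \
  'BEGIN { printf "cache hit ratio: %.1f%%\n", 100 * c / (c + u) }'
```

A ratio well above 90% means the loop is mostly re-reads and the $0.004/M lane is doing the work; a low ratio means caching isn't kicking in and your savings shrink toward the plain 17x.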