You updated Claude Code in early March. Within days, it felt dumber. Multi-file refactors that used to work started fabricating API versions. Code reviews missed obvious bugs. You asked it the same question twice and got contradictory answers.
Then you checked Reddit.
Turns out you weren’t hallucinating. Between March 4 and April 16, 2026, Anthropic shipped three separate changes that degraded output quality for users of Claude Code, the Agent SDK, and Cowork – but not the API. The catch? They never intentionally degraded the model. The bugs lived in the harness – the product layer wrapping the model.
Here’s what broke, how to check if you’re still affected, and the settings that actually fix it.
The #1 Mistake: Trusting Defaults After Product Updates
Most developers assume model quality regressions mean the model itself changed. That’s not what happened here.
On March 4, Anthropic changed Claude Code’s default reasoning effort from high to medium to reduce UI latency. The result? Claude started thinking less. Users reported Claude felt less intelligent, but most retained the medium effort default even after Anthropic added UI notices and an inline effort selector.
Translation: if you installed Claude Code between March 4 and April 7, you got degraded intelligence by default – and changing models or updating the app didn’t fix it, because the setting lived in your local config.
Pro tip: After any Claude Code update, run claude --version and check ~/.claude/settings.json for effort overrides. If you see "effort": "medium" or any manual effort setting from March, delete it. Let the April 7+ defaults take over.
What Actually Happened: The Three Bugs
Anthropic’s postmortem breaks down into three distinct failures, each affecting different users at different times.
Bug 1: Effort Level Downgrade (March 4 – April 7)
When Opus 4.6 launched in February with high effort by default, users complained it would occasionally think for too long, causing the UI to appear frozen. Anthropic’s fix? Lower the default.
On April 7, they reversed this decision: all users now default to xhigh effort for Opus 4.7, and high effort for all other models.
Who was affected: Anyone using Claude Code between March 4 and April 7 who didn’t manually increase effort.
How to check:
claude --version # Should be v2.1.116 or later
cat ~/.claude/settings.json | grep effort
If you see any effort override from March, remove it.
Bug 2: The Cache Clearing Loop (March 26 – April 10)
This one’s gnarly.
Claude normally keeps reasoning history in conversation context so it can see why it made previous edits. On March 26, Anthropic shipped an efficiency improvement to clear old thinking from sessions idle for over an hour, using prompt caching to speed up resumed sessions.
The bug? Instead of clearing thinking once, it cleared it on every turn for the rest of the session. After a session crossed the idle threshold once, each request told the API to discard everything before the most recent reasoning block.
Worse: if you sent a follow-up while Claude was mid-tool-use, that started a new turn under the broken flag, so even current-turn reasoning was dropped. Claude would continue executing, but increasingly without memory of why it had chosen to do what it was doing.
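Anthropic hasn’t published the offending code, but the failure pattern – a one-shot cleanup flag that gets set on resume and never reset – is easy to sketch. This is a hypothetical reconstruction of the pattern, not Anthropic’s actual implementation:

```python
from dataclasses import dataclass, field

IDLE_THRESHOLD_SECS = 3600  # sessions idle 1+ hour trigger the cleanup

@dataclass
class Session:
    reasoning_blocks: list = field(default_factory=list)
    clear_old_thinking: bool = False  # the sticky flag at the heart of the bug

    def resume(self, idle_secs: float) -> None:
        if idle_secs > IDLE_THRESHOLD_SECS:
            self.clear_old_thinking = True  # set once... but never reset

    def send_turn(self, new_reasoning: str) -> None:
        if self.clear_old_thinking:
            # Intended: drop stale thinking once. Actual: runs on EVERY turn,
            # discarding everything before the most recent reasoning block.
            self.reasoning_blocks = self.reasoning_blocks[-1:]
        self.reasoning_blocks.append(new_reasoning)

s = Session()
s.send_turn("plan refactor")
s.send_turn("edit file A")
s.resume(idle_secs=7200)     # session crossed the idle threshold once
s.send_turn("edit file B")   # old thinking cleared (intended)
s.send_turn("run tests")     # cleared AGAIN (the bug)
print(len(s.reasoning_blocks))  # context never grows past two blocks
```

The fix for this class of bug is equally simple – reset the flag after the first cleanup – which is part of why it slipped past review: the buggy and correct versions differ by one line.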
This surfaced as forgetfulness, repetition, odd tool choices, and weirdly, faster-than-expected usage limit depletion.
Who was affected: Users with sessions that went idle for 1+ hour between March 26 and April 10. Sonnet 4.6 and Opus 4.6 only.
Fixed in: v2.1.101 on April 10.
Bug 3: Verbose System Prompt (April 16 – April 20)
On April 16, Anthropic added a system prompt instruction to reduce verbosity. In combination with other prompt changes, it hurt coding quality and was reverted on April 20. This impacted Sonnet 4.6, Opus 4.6, and Opus 4.7.
Four-day window. If you weren’t using Claude Code April 16-20, you missed this entirely.
How to Verify Your Current Setup
Don’t assume you’re running the fixed version. Check.
Step 1: Confirm version
claude --version
You want v2.1.116 or later. All three issues were resolved as of April 20 (v2.1.116).
Step 2: Check effort settings
cat ~/.claude/settings.json
Look for any effort key. If it’s set to medium or anything other than the model default, consider removing it unless you set it intentionally post-April 7.
Step 3: Test reasoning depth
Ask Claude Code to explain a moderately complex function in your codebase – something that requires reading 3-5 files to understand dependencies. If the explanation is shallow or skips context, your effort level may still be too low.
Try explicitly setting it higher:
/effort xhigh
Then repeat the same question. If the answer improves dramatically, your defaults aren’t kicking in.
The Fixes That Actually Work
Based on Anthropic’s guidance, community testing, and the postmortem.
Fix 1: Update to v2.1.116 or Later
This is non-negotiable. The cache bug and verbosity prompt are only fixed in the binary itself.
claude --update
Fix 2: Remove Stale Effort Overrides
If you have "effort": "medium" in ~/.claude/settings.json from March, delete it. Let the new defaults apply.
Current defaults as of April 7:
- Opus 4.7: xhigh
- All other models: high
Fix 3: Force Extended Thinking for Hard Tasks
For complex refactors, debugging, or multi-file changes, explicitly request higher effort:
/effort xhigh
Or set it at session start:
claude --effort xhigh
Fix 4: Clear Long-Running Sessions
If you’re still seeing forgetfulness in sessions that pre-date April 10, don’t resume them. Start fresh.
/clear
Old sessions may have accumulated corrupted thinking state from the cache bug.
The Part Nobody’s Talking About
Here’s what the official postmortem doesn’t emphasize: Because the cache bug continuously dropped thinking blocks from subsequent requests, those requests also resulted in cache misses, which Anthropic believes drove the separate reports of usage limits draining faster than expected.
You weren’t just getting worse answers. You were paying more for them.
That’s why Anthropic reset usage limits for all subscribers as of April 23 – partial compensation for token waste during the buggy period.
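To see why cache misses translate into faster limit depletion, consider a back-of-the-envelope sketch. The numbers are illustrative assumptions (a long conversation prefix, and cache reads billed at roughly a tenth of the base input rate), not published pricing:

```python
CACHED_DISCOUNT = 0.1  # assumption: cache reads cost ~10% of the base rate

def turn_cost(prefix_tokens: int, new_tokens: int, cache_hit: bool) -> float:
    """Cost of one turn in base-rate token units."""
    # On a cache hit, the existing prefix is re-read at the discounted rate;
    # on a miss (as when a reasoning block is dropped mid-conversation, so
    # the prefix no longer matches), the whole prefix is reprocessed in full.
    cached = prefix_tokens * CACHED_DISCOUNT if cache_hit else prefix_tokens
    return cached + new_tokens

healthy = turn_cost(prefix_tokens=50_000, new_tokens=2_000, cache_hit=True)
buggy = turn_cost(prefix_tokens=50_000, new_tokens=2_000, cache_hit=False)
print(f"{buggy / healthy:.1f}x")  # the same turn costs several times more
```

Under these assumptions an otherwise identical turn burns over 7x the usage – multiplied across every turn of an affected session, which lines up with the "limits draining faster" reports.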
Another edge case: if you’re running a custom system prompt via CLAUDE.md or settings.json, you never experienced the verbosity regression OR the fix. Custom prompts bypass Anthropic’s defaults entirely, which means you were insulated from bug #3 but also from any improvements Anthropic ships at the prompt level.
Performance Now vs. Then
How does Claude Code perform post-fix?
On SWE-bench Verified, Claude Opus 4.6 now scores 80.8%, with Sonnet 4.6 at 79.6%. On Rakuten-SWE-Bench, Opus 4.7 resolves 3x more production tasks than Opus 4.6, with double-digit gains in code quality and test quality.
For CodeRabbit’s review workloads, Opus 4.7 is the sharpest model they’ve tested – recall improved by over 10%, surfacing difficult-to-detect bugs in complex PRs, while precision remained stable.
Vision also jumped. For XBOW’s autonomous penetration testing, Opus 4.7 hit 98.5% on visual-acuity benchmarks versus 54.5% for Opus 4.6 – their single biggest pain point effectively disappeared.
What Anthropic’s Changing to Prevent This
Trust erosion from incidents like this is costly. Here’s what they’re doing differently:
- Internal dogfooding: a larger share of staff will use the exact public builds to experience the product as users do
- Enhanced evaluation suites: broader per-model evals and ablations for every system prompt change
- Tighter controls: new tooling to make prompt changes easier to audit, model-specific changes strictly gated
- Better communication: a new @ClaudeDevs account on X and more detailed engagement in GitHub threads
The cache bug is instructive. It passed multiple human and automated code reviews, unit tests, end-to-end tests, and dogfooding. Because it only triggered in a corner case (stale sessions) and was difficult to reproduce, confirming the root cause took over a week.
Interestingly, when Anthropic back-tested their Code Review tool against the offending pull requests using Opus 4.7 and full repo context, Opus 4.7 found the bug while Opus 4.6 didn’t. They’re now adding support for additional repos as context for reviews.
Should You Still Use Claude Code?
Yes, if you’re on v2.1.116 or later.
The degradation was real, but it was product-layer bugs, not model regression. The fixes are live. Claude Pro’s $20/month subscription now includes Claude Code, which can autonomously read repos, write code, run tests, and iterate – functionality that would cost extra via OpenAI’s API beyond the base ChatGPT Plus subscription.
For developers, 70% now prefer Claude for coding tasks, citing superior multi-file handling, more accurate refactoring, and fewer hallucinated API calls compared to ChatGPT.
Just don’t assume defaults are always optimal. Check your config. Update regularly. And when quality suddenly drops, verify your version and settings before blaming the model.
Your Next Step
Run this now:
claude --version
cat ~/.claude/settings.json | grep effort
If you’re below v2.1.116 or see "effort": "medium", update and clean your config. Then test on a real task – something that requires reading multiple files and understanding dependencies.
If it still feels off, force /effort xhigh and compare. You’ll know immediately whether your setup is actually fixed.
FAQ
Was the Claude API affected by these bugs?
No. The API was not impacted. These bugs only affected Claude Code, the Agent SDK, and Claude Cowork – the product-layer tools built on top of the API.
I’m on v2.1.116 but still seeing repetitive answers. What’s wrong?
Two possibilities. First, check if you’re resuming a session that started before April 10. Sessions that accumulated corrupted thinking state from the cache bug may still exhibit odd behavior – use /clear to start fresh. Second, verify your effort level isn’t manually set to medium in settings. If neither applies, you might be hitting a different issue – file a report with Anthropic and include your session transcript.
Should I downgrade to v2.1.34 to avoid future bugs?
No. While some users confirmed downgrading to v2.1.34 dodged the cache issue, you’d also miss 20+ bug fixes, security patches, and feature improvements shipped between v2.1.34 and v2.1.116. The bugs that caused March-April degradation are fixed. Staying current is safer than running old builds.