Claude Keeps Saying You Hit the Limit: What Changed & How to Fix It

Claude's usage limits got brutally tighter in 2026 - Pro users burn through sessions in hours, not days. Here's what changed, why it's happening, and 7 tested ways to actually stretch your quota.

Jack Tom2026-04-168 min readBeginner

You ask Claude a simple question about used cars. Limit hit. Halfway through debugging a function. Limit hit. Two prompts into your morning and Claude tells you to come back in five hours.

If you’re on Claude right now – especially during weekday mornings – you’re not imagining it. Limits tightened. Hard.

What changed, why Anthropic did it, and the workarounds that actually work (including one fix buried in GitHub that most articles won’t tell you about).

What Broke: The February-March-April Triple Hit

Three changes landed between February and April 2026. Each one squeezed users. Together, they made Claude unusable for many paying customers.

February 9, 2026: Anthropic shifted Opus 4.6 to “adaptive thinking” by default, letting the model decide how much reasoning to apply instead of using a fixed budget (as announced via official channels). Translation: less thinking on tasks Anthropic deemed “simple.”

March 3, 2026:Anthropic reduced the default effort level to “medium” (level 85), as Boris Cherny (Claude Code lead) confirmed on X. No prominent announcement. Users just noticed Claude getting… dumber. Shorter answers. More mistakes. The community started calling it “nerfing.”

March 26, 2026: Peak-hour session limits changed. During weekdays from 5am to 11pm Pacific time, your 5-hour session limit now depletes faster than actual clock time – Thariq Shihipar (Anthropic technical staff) announced this on X. You might burn through it in 90 minutes of real work.

Outages followed. Intermittent service disruptions hit in March and early April 2026, tied to surging demand after major model releases (per Anthropic’s status page and Downdetector reports). April 15: Claude went down hard – over 30,000 users reported errors at the peak.

Why This Happened

Anthropic’s official line: “managing growing demand.” The actual problem is simpler and harder to fix.

Late February 2026: OpenAI signed a Pentagon contract. ChatGPT uninstalls spiked 295% in one day (per TechCrunch and community reports). Claude hit #1 on the US App Store for the first time. Anthropic’s web traffic jumped over 30% month-over-month. Millions of new users hit the same GPU infrastructure overnight.

The cost trap: Claude Pro is $20/month (as of April 2026, per Anthropic’s pricing page). Max 5x is $100/month. These are flat subscriptions. Inference costs – the actual compute Anthropic pays per message – scale with every token you generate. There’s widespread speculation that Anthropic may be running short of computing resources after adoption soared, and unlike OpenAI, Anthropic hasn’t announced as many multibillion-dollar data center deals.

So they throttled. Not by cutting your monthly quota (that’d look bad), but by making it drain faster during peak hours and reducing how hard Claude thinks by default. Math works out for Anthropic. For users? Not great.

The Cache Bug No One’s Talking About

Buried in GitHub issues and Reddit threads, but matters if you use Claude Code.

A user reverse-engineered the Claude Code binary and found two independent bugs that cause the prompt cache to break silently, inflating costs by 10-20x (as discussed in GitHub issues and Reddit r/ClaudeCode). Cache breaks, Claude re-processes your entire codebase on every request instead of reading from cache. You pay full price every time.

Some users confirmed that downgrading to Claude Code version 2.1.34 made a very noticeable difference (community reports from GitHub and Reddit). Max subscriber burning through your weekly limit in a day? Try this before paying for overages.

How to downgrade Claude Code:

# macOS/Linux
brew uninstall claude-code
brew install [email protected]

# Or download the specific version from Anthropic's release archive

No guarantees, but enough users reported relief that it’s worth testing.

The cache lifetime thing: default is 5 minutes (per Anthropic’s official prompt caching documentation). Take a coffee break during a coding session and come back 6 minutes later? Cache is dead. You can extend it to 1 hour, but 1-hour cache write tokens cost 2x the base input price. Frequent cache misses or higher write costs – choose.

The /effort Command They Don’t Tell You About

Remember that March 3 change where Anthropic lowered default effort to “medium”? There’s a way to override it – but only in Claude Code, and only if you know the command exists.

Type /effort high in Claude Code terminal sessions to manually switch to higher reasoning depth (Boris Cherny confirmed this, and users discovered it through community sharing). Not in the UI. Not in the official help docs. Users discovered it through trial and error.

Burns more tokens? Yes. Makes Claude actually solve the problem instead of half-assing it? Yes. If you’re on Max and hitting limits anyway, you might as well get the smart version of Claude while you’re paying for it.

6 More Ways to Stretch Your Limits

1. Work off-peak. Avoid weekdays 5am-11am PT (as stated by Thariq Shihipar on X, March 26, 2026). During that window, your 5-hour session burns faster than real time. Schedule heavy tasks for evenings or weekends when limits reset to normal.

2. Edit your last prompt instead of sending corrections. Click the pencil icon on your original message, fix the prompt, and regenerate. The old exchange gets replaced instead of added, so Claude isn’t re-reading a growing pile of mistakes.

3. Start fresh chats more often. Claude re-reads your entire conversation history with every new message (community-documented behavior backed by Anthropic’s caching documentation). By message 30, Claude is processing 29 previous exchanges before it even starts on your new question. That’s where your tokens go. Switch topics? New chat.

4. Use Projects to cache recurring files. Uploading the same document in multiple chats? Claude re-counts those tokens every time. Upload files to a Project once and they get cached so they don’t cost you repeatedly.

5. Turn off features you’re not using. Web search, Research mode, extended thinking – they all add tokens to responses even when you don’t need them. Leave them off by default and flip them on only when required.

6. Drop to Haiku for simple tasks. The bigger the model, the more tokens it costs per message. Using Opus for “summarize this email” is overkill. Haiku is fast and cheap. Save Opus for the hard stuff.

When You Should NOT Try to Optimize

Max 20x subscriber and you’re still hitting limits after applying these tricks? You’re not the problem. The pricing model is.

Max 20x is $200/month (as of April 2026, per Anthropic’s pricing page). Some paying Pro users report using free alternatives like DeepSeek and Z.ai (GLM) more than Claude lately because they’ve stopped touching Opus entirely – it drains weekly quotas too fast to be practical.

At that point, you’re either paying API overage rates (which defeats the purpose of a subscription) or you’re switching tools. One user was paying $200/month for Max specifically to use Claude with third-party tools like OpenClaw. After Anthropic blocked OAuth tokens post-Sonnet 4.6, those workflows broke. The subscription became useless for the use case they were paying for.

This isn’t a you problem. It’s a capacity problem Anthropic hasn’t solved yet.

What Happens Next

Anthropic was valued at $380 billion as of February 2026 and is reportedly preparing for an IPO (per Fortune and Bloomberg). User dissatisfaction with Claude’s sudden performance decline and anger at perceived lack of transparency could derail the company’s growth just as it’s trying to attract investors.

Two options: raise subscription prices to match actual costs, or scale infrastructure fast enough to meet demand. Neither is easy. Raising prices risks losing the users they just gained from the ChatGPT exodus. Scaling data centers takes quarters, not weeks.

In the meantime, you’re stuck with tighter limits and a model that thinks less by default.

Try the cache fix. Use /effort high. Work off-peak. Burning $200/month and still hitting walls? That’s a signal. Not that you’re using it wrong – the pricing model doesn’t match the product anymore.

Frequently Asked Questions

Why does Claude say I hit my limit after only 2 messages?

Free users can hit their limit within two prompts, triggering a 5-hour cooldown (as reported by multiple users). Uploaded files or used extended thinking? Both consume significantly more tokens than text-only prompts. Peak hours (5am-11am PT on weekdays) make it worse.

Can I still use my Claude Max subscription with third-party coding tools?

Officially, no. Anthropic’s updated documentation (as of April 2026) states that OAuth tokens from Claude Free, Pro, and Max accounts cannot be used in third-party tools. After the Sonnet 4.6 release, Anthropic began enforcing restrictions, and many users found their tokens no longer authenticated. Some tokens still work, but using them violates the consumer terms of service. Third-party tool access critical to your workflow? You’ll need to switch to the official API with pay-as-you-go pricing. One user paying $200/month for Max had their OpenClaw workflow break entirely when the OAuth block hit – the subscription became useless for the exact use case they were paying for.

Is the Claude Code token drain a bug or intentional throttling?

Both. Anthropic acknowledged on March 31, 2026, that ‘people are hitting usage limits in Claude Code way faster than expected’ and called it ‘the top priority for the team’. Multiple factors: deliberate peak-hour throttling (confirmed by Thariq Shihipar), a March promotion ending (confirmed), and suspected cache-breaking bugs in recent Claude Code versions (user-reported, not officially confirmed). The downgrade-to-2.1.34 fix suggests at least some of the drain is a bug Anthropic hasn’t patched yet.