So you just opened Claude this morning and saw “Sonnet 4.6” where 4.5 used to be. Everyone’s tweeting benchmark numbers. Someone on Reddit swears it writes better code than Opus. You’re wondering: is this actually different, or is it just another 0.1 version bump?
Sonnet 4.6 is the first mid-tier model that genuinely competes with flagship-level intelligence. It costs $3/$15 per million input/output tokens – five times cheaper than Opus – yet developers preferred it to November’s Opus 4.5 in 59% of head-to-head tests (per Anthropic’s February 17, 2026 announcement).
That pricing flip? Changes the math for basically everyone using Claude.
The 1M context window nobody’s explaining
Beta-only. API-only. Cross 200K tokens and you’re paying premium rates – no warning, no toggle. Anthropic’s pricing docs say requests exceeding 200K input tokens get “automatically charged at premium long context rates.” The UI doesn’t tell you when you’ve crossed that line.
Default context window: still 200K. The 1M upgrade? You request it via API and pay more. If you’re using claude.ai in the browser, you’re capped at 200K.
This matters because Sonnet 4.6’s headline feature – massive context for entire codebases – is gated behind both API access and stealth pricing. You don’t “get” 1M tokens by default.
Think about it: every tutorial says “1M token support.” None mention the 200K cliff where pricing jumps. Your first big codebase ingest will teach you this the hard way.
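Since the 1M window is opt-in, here’s a minimal sketch of what requesting it might look like. The beta flag name (`context-1m-2026-02-17`) is a placeholder assumption, not a documented value – check Anthropic’s current docs for the real header before relying on this.

```python
# Sketch: opting into the 1M-token context window via the API.
# The beta flag value below is a PLACEHOLDER assumption; look up
# the real one in Anthropic's docs.

def build_long_context_request(model: str, messages: list) -> dict:
    """Assemble headers and body for a request that opts into the 1M window."""
    return {
        "headers": {
            # Hypothetical beta flag -- replace with the documented value.
            "anthropic-beta": "context-1m-2026-02-17",
        },
        "body": {
            "model": model,
            "max_tokens": 1024,
            "messages": messages,
        },
    }

req = build_long_context_request(
    "claude-sonnet-4-6",
    [{"role": "user", "content": "Summarize this repository."}],
)
```

Without that header, requests stay in the default 200K window – and remember, anything over 200K input tokens bills at the premium rate regardless.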
When Sonnet 4.6 actually replaces Opus
Computer use: 72.5% on OSWorld-Verified benchmark (as of February 2026). That’s within 0.2% of Opus 4.6 (72.7%). Computer use means the model controls a mouse and keyboard to navigate software that doesn’t have an API – legacy ERP systems, insurance portals, anything built before 2015. In October 2024? Sonnet 3.5 scored 14.9% on this test. Sixteen months later: nearly 5x better.
Coding: 79.6% on SWE-bench Verified (real-world bug fixes), nearly matching Opus 4.6’s 80.8%. Early testers preferred Sonnet 4.6 over Sonnet 4.5 roughly 70% of the time – citing “better context reading” and “logic consolidation rather than duplicating it.”
Office tasks: scored 1633 on GDPval-AA Elo benchmark (spreadsheet work, document Q&A, data extraction) – higher than Opus 4.6’s 1606. Box ran tests on heavy reasoning tasks. Accuracy jumped from 62% (Sonnet 4.5) to 77% (Sonnet 4.6). That’s a 15-point gain.
Opus still wins: multi-step agent orchestration, codebase refactoring, problems where “getting it exactly right” beats speed or cost. Anthropic’s docs note Opus 4.6 “remains the strongest option for tasks that demand the deepest reasoning.”
Practical test: building a production agent that runs autonomously for hours? Opus is safer. Iterating on code, debugging, analyzing documents? Sonnet 4.6 handles it. You’ll spend 80% less.
What changed under the hood
Adaptive thinking. Previous models had a binary switch: extended thinking on or off. Sonnet 4.6? The model decides when deeper reasoning would help (as of February 2026). At the default “high” effort level it almost always thinks; lower levels skip thinking for simple queries. You control this via the effort parameter in the API (low, medium, high, max). The web UI doesn’t expose this yet – API-only.
One catch: Anthropic’s announcement recommends changing effort via a /effort command in the UI. That command doesn’t exist. Community users on Hacker News confirmed it’s not implemented. Want granular control? Use the API.
Context compaction: long conversations used to hit the context limit and stop. Sonnet 4.6 introduces context compaction (beta, as of February 2026) – when you approach the limit, the API automatically summarizes older parts and replaces them. Effectively infinite conversations. No manual restarts. This is server-side – no setup if you’re using the API.
Prompt injection resistance improved dramatically. Computer use is powerful but risky – malicious actors hide instructions on websites to hijack the model (a “prompt injection attack”). Safety evals show Sonnet 4.6 performs similarly to Opus 4.6 in resisting these attacks. Building agents that browse the web autonomously? This hardening isn’t optional.
How to use it (and where you’ll get stuck)
Claude Free or Pro? Sonnet 4.6 became the default February 17, 2026. You don’t need to do anything – it’s already running.
API users: model identifier is claude-sonnet-4-6. Available via Anthropic’s API, Amazon Bedrock, Google Vertex AI, and GitHub Copilot.
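A minimal call with the official Python SDK (`pip install anthropic`) looks like this. The model identifier is the one above; everything else is standard Messages API usage.

```python
# Minimal sketch of calling Sonnet 4.6 through the Anthropic Messages API.
# Requires ANTHROPIC_API_KEY in the environment before send() is called.

MODEL = "claude-sonnet-4-6"

def build_request(prompt: str) -> dict:
    """Request body for the Messages API using the Sonnet 4.6 model ID."""
    return {
        "model": MODEL,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }

def send(prompt: str):
    """Send the request; reads ANTHROPIC_API_KEY from the environment."""
    import anthropic
    client = anthropic.Anthropic()
    return client.messages.create(**build_request(prompt))
```

On Bedrock, Vertex AI, or Copilot the model string differs per platform, so check each provider’s model catalog rather than reusing `claude-sonnet-4-6` verbatim.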
Voice mode was silently downgraded. Roughly ten days ago, Claude started routing ALL voice interactions to Haiku – not Sonnet, not Opus – even for paid Max users. No announcement, no setting to change it back. One MacRumors community member called this “an enormous error” given OpenAI’s voice upgrades. Rely on voice mode? You’re currently stuck with the weakest model.
Third-party tools lag behind official support. Sonnet 4.6 launched Feb 17, but tools like OpenClaw threw “Unknown model: anthropic/claude-sonnet-4-6” errors for the first day. GitHub Copilot started rolling it out to Pro+ users but required admins to manually enable it in settings. Integrating via a third-party platform? Expect a 24-72 hour delay.
Pro tip: Comparing Sonnet 4.6 to Opus for a specific task? Test it with effort: medium first. Anthropic recommends medium for “most use cases” to balance speed and quality. High/max effort slows responses and costs more tokens – save it for genuinely hard problems.
The pricing trap
“$3/$15 per million tokens” for Sonnet 4.6. True – if you stay under 200K input tokens.
Cross 200K? Automatically moved to premium long-context pricing (as of February 2026). Anthropic’s docs don’t publish the exact premium rate, but it’s higher than the advertised $3/$15. Check if your request triggered premium pricing by inspecting the usage object in the API response: if input_tokens exceeds 200K, you were billed at the higher rate.
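The check described above is a one-liner. The 200K threshold comes from Anthropic’s pricing docs, and `usage.input_tokens` is the standard Messages API response field:

```python
# Detect, after the fact, whether a request crossed into premium
# long-context pricing. Per the pricing docs, requests EXCEEDING
# 200K input tokens are billed at the premium rate.

LONG_CONTEXT_THRESHOLD = 200_000

def billed_at_premium(input_tokens: int) -> bool:
    """True when the input size crossed into premium long-context pricing."""
    return input_tokens > LONG_CONTEXT_THRESHOLD

# After a call: billed_at_premium(response.usage.input_tokens)
print(billed_at_premium(150_000))  # False: standard $3/$15 rate
print(billed_at_premium(450_000))  # True: premium rate applied
```

Worth logging on every request if you run agents at volume – it’s the only way to see the pricing cliff before the invoice does.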
Fine if you know it’s coming. Problem if you assume “1M token support” means “1M tokens at $3/$15.”
Opus 4.6 costs $15/$75 per million tokens at the base rate. Even with Sonnet’s premium pricing past 200K, you’re likely still paying less than Opus – but not 5x less.
Running high-volume agents or ingesting large codebases daily? Do the math on whether premium context pricing keeps you below Opus cost. Most users: it will. But “Sonnet is always cheaper” isn’t universally true once you’re in premium territory.
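To make that math concrete, here’s a back-of-envelope comparison. Since Anthropic doesn’t publish the exact premium rate, the 2x multiplier below is purely an assumption for illustration – substitute the real rate from your invoice.

```python
# Back-of-envelope: Sonnet 4.6 at an ASSUMED 2x premium long-context
# rate vs. Opus 4.6 at its base $15/$75 rate. The 2x multiplier is a
# guess for illustration only.

def cost_usd(input_tok: int, output_tok: int,
             in_rate: float, out_rate: float) -> float:
    """Dollar cost given per-million-token input/output rates."""
    return input_tok / 1e6 * in_rate + output_tok / 1e6 * out_rate

# A 500K-token codebase ingest with 8K tokens of output -- past the cliff.
sonnet_premium = cost_usd(500_000, 8_000, 3 * 2, 15 * 2)  # assumed 2x premium
opus_base = cost_usd(500_000, 8_000, 15, 75)

print(f"Sonnet (assumed 2x premium): ${sonnet_premium:.2f}")
print(f"Opus (base rate):            ${opus_base:.2f}")
```

Under that assumption Sonnet still comes out cheaper, but closer to 2.5x cheaper than the headline 5x – which is exactly the point.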
What it learned since 4.5
Knowledge cutoff: August 2025 for Sonnet 4.6 (per the system card). Opus 4.6 cuts off at May 2025. Haiku 4.5 stops at February 2025.
That three-month gap means Sonnet 4.6 knows about events, libraries, and APIs that Opus 4.6 doesn’t. Working with tools or frameworks released between May and August 2025? Sonnet has native knowledge. Opus doesn’t.
Small edge. But it flips the usual “flagship model knows more” assumption: for the next few months, the mid-tier model is the more up-to-date one.
FAQ
Can I use Sonnet 4.6’s 1M context window on the free plan?
No. Beta-only, API access required (as of February 2026). Free and Pro plans on claude.ai default to 200K. Hitting the API directly? You can request the 1M window, but it costs more past 200K tokens.
Is Sonnet 4.6 actually better than Opus 4.5 for coding, or is that just marketing?
Depends on the task. Iterative debugging, refactoring, single-file edits? Early testers preferred Sonnet 4.6 to Opus 4.5 59% of the time (as of February 2026) – citing fewer hallucinations and better instruction following. Complex multi-agent orchestration or codebase-wide refactors? Opus 4.6 (not 4.5) is still stronger. The real win: Sonnet 4.6 handles most day-to-day coding at 1/5 the cost. You reach for Opus when Sonnet fails, not by default. One caveat: some users report Sonnet 4.6 still struggles with deeply nested logic or ambiguous requirements – Opus catches edge cases Sonnet misses. Test both on your actual codebase before committing.
Why does my third-party tool (Cursor, Cody, etc.) not support Sonnet 4.6 yet even though Anthropic released it days ago?
Integration lag. Anthropic updates their API immediately, but third-party tools need to add the new model identifier (claude-sonnet-4-6) to their backend (as of February 2026). GitHub Copilot took ~24 hours and required admins to enable it manually. Expect 1-3 days for most platforms. Need it immediately? Use Anthropic’s API or claude.ai directly.