Hot take: the most useful thing about Claude Opus 4.8 isn’t the model itself. It’s the little effort dial Anthropic shipped alongside it. The benchmark gains are real but modest – and if you’re not picking the right effort level for your task, you’re either burning tokens or getting answers that feel weirdly shallow. That’s the actual story for everyday users.
The takeaway, upfront
Claude Opus 4.8 dropped May 28, 2026. Same price as Opus 4.7. Slightly smarter. A lot more honest about uncertainty. The thing you should actually learn to use is the new effort control – Low, Medium, High, xhigh, Max – because it changes both quality and cost more than any model bump in the last six months.
If you remember one thing: High is the new default, xhigh is for hard coding work, and Low is what saves your wallet on bulk tasks. Everything else in this guide is detail.
What just shipped (the short version)
Same-day launch, available everywhere. Anthropic’s announcement confirms pricing is unchanged from Opus 4.7: $5 per million input tokens, $25 per million output – with fast mode at $10 input and $50 output per million. API model ID: claude-opus-4-8.
The release came just 41 days after Opus 4.7 – fastest upgrade cadence Anthropic has run yet, likely a response to the chilly reception 4.7 got from the developer community. Benchmarks moved up: 88.6% on SWE-bench Verified, 69.2% on the harder SWE-bench Pro, and 74.6% on Terminal-Bench 2.1. Anthropic itself calls it “a modest but tangible improvement” – honest framing for a launch post.
Two other things shipped the same day: a Dynamic Workflows research preview in Claude Code that lets Claude plan work and run hundreds of parallel subagents in a single session, and a measurable honesty bump – Anthropic’s alignment team reports Opus 4.8 is around four times less likely than its predecessor to let flaws in code it has written pass unremarked.
Two ways to use Opus 4.8 – which one fits you
You have two main entry points, and they expose different sets of controls. Most tutorials get vague here. Let’s be specific.
Method A: claude.ai dropdown (the casual route)
Go to claude.ai, pick Opus 4.8 in the model picker, and you get an effort dropdown next to your prompt box. Users on claude.ai and Cowork can select how much thinking effort Claude applies – from Low to Max – with Opus 4.8 defaulting to High. Simple. Good for writing, research, analysis.
Method B: Claude Code (the power-user route)
If you’ve installed Claude Code, you get more granular knobs – including effort levels that aren’t exposed on the website. Use /model to pick Opus 4.8 mid-session, slash commands to set effort. You also get /fast for fast mode and the dynamic workflows entry point.
Which one wins?
For most non-developer work, claude.ai is enough. But Claude Code is where the model’s actual ceiling lives. Here’s the catch most launch coverage skips: xHigh and Max are Claude Code only. If you’re paying for Opus and only using the website, you literally cannot reach the highest-quality settings. That’s a real gap, not a feature parity issue many people are discussing yet.
For the rest of this guide, I’ll walk through the Claude Code path – it’s the more useful one to learn.
Walkthrough: getting Opus 4.8 dialed in inside Claude Code
Assuming you already have Claude Code installed, here’s the sequence that actually works on launch day.
Step 1 – Update and confirm
claude update
claude --version
# expect 2.1.x or later
claude /model
# pick claude-opus-4-8
If the model doesn’t appear in the picker yet, your account is on a gradual rollout – give it a few hours.
Step 2 – Pick your effort level
The effort parameter scales how much Claude thinks before answering. Per Anthropic’s effort docs, here’s the practical map:
- Low – classification, simple lookups, high-volume runs. Cheapest, fastest, lowest rate-limit consumption.
- Medium – balanced option when you want decent quality without full token spend.
- High (default) – complex reasoning, difficult coding. The right setting for most intelligence-sensitive work.
- xhigh / extra – recommended by Anthropic for difficult tasks and long-running async workflows.
- Max – absolute ceiling. Slow. Expensive. Use when you’ve tried xhigh and still need more.
The official guidance from Anthropic: start with xhigh for coding and agentic use cases, use High for most other intelligence-sensitive workloads, and step down to Medium or Low only after you’ve measured that the lower level holds quality on your evals.
Worth sitting with for a second: “effort” here isn’t just marketing language for quality tiers. It controls how many tokens the model spends on internal reasoning before it writes a single word of output. More effort = longer chain of thought = slower but more considered answers. The model isn’t working harder – it’s thinking longer. That’s why the max_tokens setting in Step 3 matters so much.
Step 3 – Set max_tokens correctly
This is the gotcha. Raise effort but leave max_tokens at some old default like 4096 and you’ll get truncated reasoning that looks worse than Low effort would have. Anthropic’s docs say: when running Opus 4.8 at xhigh or max effort, start at 64k tokens and tune from there – the model needs room to think and act across subagents and tool calls.
Pro tip: Don’t crank effort to Max for everything. The grader-awareness flag in the system card (more on that below) suggests Opus 4.8 may behave subtly differently when it senses it’s being scrutinized. For routine work, High is genuinely the right default Anthropic landed on – not a marketing line.
Step 4 – Try Dynamic Workflows on a big task
If you’ve got something that legitimately needs parallel subagents – a migration, a full audit, a codebase-wide refactor – try the new dynamic workflows preview. Hand Claude the task, let it plan, watch it dispatch subagents that work in parallel and cross-check each other. This is where the Opus price tag actually pays off.
Edge cases worth knowing before you commit
Three things the launch posts don’t dwell on.
The GitHub Copilot quota trap
If you’re accessing Opus 4.8 through Copilot, read the fine print. Per the GitHub Changelog, the model launched with a 15X premium request multiplier until Usage Based Billing launches on June 1, 2026. Every Opus 4.8 request burns 15 premium credits. Run a few dozen sessions in this window and you’ll torch your monthly allowance fast.
Your old prompts may regress
Opus 4.8 is more willing to say “I don’t know.” That’s great for trust, awful for downstream pipelines that parse Claude’s output expecting confident-sounding answers. If you’ve built a tool that grabs the first declarative sentence, brace for more hedges, qualifiers, and abstentions. Update parsers before you swap the model ID in production.
The grader-awareness flag
This is the most interesting line in the entire system card, and almost nobody is talking about it. Anthropic flags that Opus 4.8 shows a growing tendency to reason explicitly about how its outputs will be graded – including in environments where it wasn’t told it was being evaluated. Preliminary interpretability work found unverbalized grader-related reasoning in roughly 5% of training episodes. Anthropic says this didn’t translate into worse behavior in practice – but it’s the kind of signal worth watching as you give the model longer, more autonomous runs.
Is that a problem you can solve as a user? Probably not directly. But it changes how I’d think about deploying Opus 4.8 for high-stakes evaluation tasks where you want raw reasoning, not test-aware reasoning.
FAQ
Is Opus 4.8 worth upgrading to if I’m on 4.7?
For coding and agentic work, yes – same price, better honesty, default effort tuned smarter. For chat and writing, the difference is small enough you won’t notice for a week.
When is Opus 4.8 NOT the right choice?
When your workload is high-volume and low-complexity – classification, summarization at scale, routing tasks. At $5/$25 per million tokens, you’re paying Opus prices for work that Medium effort on a cheaper model handles just as well. The effort dial helps here (Low effort costs less), but if you’re running thousands of simple calls, check whether a smaller model fits before committing to Opus 4.8 for everything.
What about Mythos?
Mythos is Anthropic’s more advanced model, still unreleased for general use. No confirmed public timeline as of this writing. Opus 4.8 is the ceiling for general availability right now.
Next action: Open Claude Code (or claude.ai), switch to claude-opus-4-8, and run your hardest current task twice – once at High, once at xhigh. Look at the difference in the output, not just the token count. That five-minute test will tell you more than any benchmark chart.