Claude Opus 4.8 vs Opus 4.7: What Actually Changed
Most of the spec sheet is identical — same 1M context, same $5/$25 pricing, same January 2026 cutoff. Here is what genuinely changed in Opus 4.8, and what the benchmark hype gets wrong.
Anthropic released Claude Opus 4.8 on May 28, 2026. If you opened the spec sheet expecting a leap, you would be forgiven for being confused: the context window, the price, and the knowledge cutoff are all identical to Opus 4.7. The upgrade lives somewhere less flashy and, if you actually run agents in production, somewhere more useful — reliability.
This is the honest breakdown of what changed between Opus 4.7 and 4.8, what did not, and which of the benchmark numbers floating around are real versus third-party guesses. We build on these models at Ascero AI, so this is the comparison we ran for ourselves before flipping the model ID.
What did NOT change (and why the headlines are confusing)
A lot of the early coverage implied Opus 4.8 added a million-token context window. It did not. Opus 4.7 already had one. Per Anthropic's own model documentation, these are identical across the two releases:
- Context window: 1M tokens on the Claude API, Amazon Bedrock, and Vertex AI (200k on Microsoft Foundry) — same on both 4.7 and 4.8.
- Standard pricing: $5 per million input tokens, $25 per million output tokens — identical.
- Knowledge cutoff: January 2026 for both. The cutoff is not a 4.8 differentiator.
- Max output: 128k tokens (up to 300k via the batch beta header) — same on both.
So if someone tells you to upgrade for the bigger context window or the fresher training data, they are selling you something that was already in 4.7.
What actually changed
The real story is agentic reliability — the unglamorous stuff that determines whether a long agent run finishes clean or quietly corrupts itself three steps in. Per Anthropic, Opus 4.8 brings:
- Around 4× less likely than Opus 4.7 to let flaws in code it wrote pass unremarked. This is Anthropic's own framing, and for anyone shipping agent-written code it is the headline.
- Better long-context handling with fewer compactions and better recovery when a compaction does happen.
- Better tool triggering — fewer skipped required tool calls, which was a genuine 4.7 pain point in agent loops.
- The effort parameter now defaults to high on all surfaces.
- Mid-conversation system messages are supported (a system turn after a user turn, no beta header) while preserving prompt-cache hits.
- A lower prompt-cache minimum (1,024 tokens), which makes caching pay off on shorter prompts.
Fast mode — the one genuinely new lever
Opus 4.8 adds a fast mode (research preview, Claude API only): set speed to "fast" and you get up to 2.5× higher output tokens per second from the same model, at premium pricing — $10 per million input, $50 per million output, double the standard rate. Anthropic notes it is roughly three times cheaper than fast output was on previous models. The use case is latency-sensitive agent loops and anything user-facing where the wait is the product, like a voice agent.
The benchmark claims — and the one number you should not trust
Anthropic's marketing page makes several agentic claims for 4.8, most of them qualitative: 84% on Online-Mind2Web (called a meaningful jump over both Opus 4.7 and GPT-5.5); the top score on its Legal Agent Benchmark, including being the first model to break 10% on the all-pass standard; gains over prior Opus models on CursorBench at every effort level; and being the only model to complete every case end-to-end on its Super-Agent benchmark. Read those as vendor benchmarks — useful direction, not gospel.
What it means if you build on Claude
For anyone running agents in production — us included — the 4× reduction in code-flaws-let-through and the better tool triggering matter more than a single benchmark point. Those are the failures that don't show up in a demo and do show up at 2am in a long unattended run. Fast mode is a real unlock for latency-sensitive flows. And because the price, context window, and cutoff are unchanged, migrating is about as low-risk as a model upgrade gets: swap claude-opus-4-7 for claude-opus-4-8 and watch your eval suite.
The honest bottom line: Opus 4.8 is a reliability release, not a capability leap. For a company betting on long-horizon autonomous agents, that is exactly the right kind of boring.
- Anthropic — Introducing Claude Opus 4.8 (official)
- Claude Platform Docs — What's new in Claude Opus 4.8
- Claude Platform Docs — Models overview (specs, pricing, cutoff)
- Anthropic — Claude Opus product page (benchmark claims)
- TechCrunch — Anthropic releases Opus 4.8 with new dynamic workflow tool
- 9to5Mac — Anthropic upgrades Claude with new Opus 4.8 model
- llm-stats — Opus 4.8 launch & benchmarks (third-party; SWE-bench figures unverified)
Ascero AI. “Claude Opus 4.8 vs Opus 4.7: What Actually Changed.” May 29, 2026. https://asceroai.com/news/claude-opus-4-8-vs-4-7-what-changed-2026
Free to reference with attribution and a link back to this page.