Claude Opus 4.8 is here. Is it as good as they say?

I got a few hours of early-access testing with Anthropic’s newly released model, Opus 4.8. I walk through real coding, design, and strategy tasks across Claude Code and Claude Cowork, and give you my unfiltered view on what impressed me and what didn’t.

• Where Opus 4.8 excels: greenfield prototypes, one-shot features, and fast execution
• Where it struggles: the last 10%, edge cases in existing codebases, and hallucinations
• How Opus 4.8 compares to Opus 4.7 on business strategy work
• Why I’m still reaching for Opus 4.7 on data-heavy strategy and roadmap work
• The new features shipping alongside the model: dynamic workflows with parallel subagents and effort control in Claude.ai and Cowork
• The prompting and harness strategy I’d use to get the most out of it

(00:00) Introduction to Opus 4.8
(00:44) Benchmark performance and pricing
(01:53) First coding test: Building a prototyping tool
(03:00) Where it failed: The last 10% problem
(03:27) The hallucination problem
(04:23) Testing Opus 4.8 on existing codebases
(05:24) The ambition test: Building games for a 9-year-old
(07:03) Business strategy test: 4.7 vs 4.8
(08:23) The roadmap test
(09:17) Final verdict

• System Card: Claude Opus 4.8: https://cdn.sanity.io/files/4zrzovbb/website/c886650a2e96fc0925c805a1a7ca77314ccbf4a6.pdf
• Introducing Claude Opus 4.8 on X (Claude, @claudeai): “Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.”


