Claude Opus 4.8 shipped today. The benchmarks are a distraction. Here is what actually changes about how your agents run tomorrow.

Anthropic announced Claude Opus 4.8 at 16:00 UTC on June 3, 2026. The launch post leads with the usual benchmark deltas: SWE-bench Verified up 4.1 points, GPQA Diamond up 2.9, TAU-bench tool-use up 6.4. There is a chart. There is a marketing line about "the most capable agentic model we have ever shipped." If you stop reading there, you will miss the three things that will change how your production agents behave starting tomorrow.

I have spent the morning re-running our internal agent harness against Opus 4.8 and reading the model card line by line. Two of the three changes are improvements. The third is a silent regression that will bite anyone who pinned the model ID. Here is the full picture.

What 4.8 actually changes

The model card and release notes document three changes that the launch blog post does not foreground: