Anthropic shipped Claude Sonnet 5 on June 30. Their framing: "the most agentic Sonnet model yet." If you've been following the Sonnet line, that claim has a specific context worth unpacking.
Sonnet 3.5 was the first version that made developers sit up and pay attention to tool use and coding. 3.6 and 3.7 kept pushing in that direction. 4.6 made a noticeable jump in agentic performance. But for the past several months, the most impressive agentic gains have been concentrated in the Opus tier. Sonnet felt capable but a step behind Opus. Sonnet 5 is Anthropic's attempt to bring Opus-level autonomous execution down to Sonnet pricing.
The numbers tell part of the story. Sonnet 5 approaches Opus 4.8 on BrowseComp (an autonomous web search evaluation) and OSWorld-Verified (a computer use evaluation). Opus 4.8 remains Anthropic's ceiling for general capability.
API pricing launches at $2 input / $10 output per million tokens, moving to $3 input / $15 output after August 31.
On safety, Sonnet 5 outperforms 4.6 at refusing malicious requests and resisting prompt injection attacks. Hallucination and sycophancy rates are lower. Anthropic also ran cybersecurity-specific tests (developing Firefox browser exploits) and found that Sonnet 5 never completed a full working exploit, a clear gap versus Opus 4.8 and Mythos 5. Anthropic attributes this to the model not having been trained on cybersecurity tasks, rather than to a deliberate limitation.
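To make the pricing change concrete, here is a rough sketch of what the two rates quoted above mean for a bill. The per-million-token prices come from this article; the workload (50M input tokens and 10M output tokens per month) and the `request_cost` helper are made up for illustration, and the calculation ignores any caching or batch discounts that may apply.

```python
# Prices in USD per million tokens, as quoted in this article.
LAUNCH = {"input": 2.00, "output": 10.00}     # launch pricing
STANDARD = {"input": 3.00, "output": 15.00}   # pricing after August 31


def request_cost(input_tokens: int, output_tokens: int, prices: dict) -> float:
    """Return the USD cost of a workload at the given per-million-token prices."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


# Hypothetical monthly workload: 50M input tokens, 10M output tokens.
launch_cost = request_cost(50_000_000, 10_000_000, LAUNCH)
standard_cost = request_cost(50_000_000, 10_000_000, STANDARD)

print(f"launch:   ${launch_cost:.2f}")
print(f"standard: ${standard_cost:.2f}")
print(f"increase: {standard_cost / launch_cost:.2f}x")
```

Because both the input and output rates rise by the same factor, the bill goes up 1.5x after August 31 no matter how a workload splits between input and output tokens.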











