Claude Sonnet 5: What "Most Agentic" Actually Means in Practice

Anthropic shipped Claude Sonnet 5 on June 30. Their framing: "the most agentic Sonnet model yet." If you've been following the Sonnet line, that claim has a specific context worth unpacking.

Sonnet 3.5 was the first version that made developers sit up and pay attention to tool use and coding. Sonnet 3.6 and 3.7 kept pushing in that direction, and Sonnet 4.6 made a noticeable jump in agentic performance. But for the past several months, the most impressive agentic gains have been concentrated in the Opus tier. Sonnet was capable, but it trailed Opus on autonomous work. Sonnet 5 is Anthropic's attempt to bring Opus-level autonomous execution down to Sonnet pricing.

The numbers tell part of the story. Sonnet 5 approaches Opus 4.8 on BrowseComp (an autonomous web search evaluation) and OSWorld-Verified (a computer use evaluation). Opus 4.8 remains Anthropic's ceiling for general capability.

API pricing launches at $2 input / $10 output per million tokens, moving to $3/$15 after August 31.

On safety, Sonnet 5 outperforms Sonnet 4.6 at refusing malicious requests and resisting prompt injection attacks. Hallucination and sycophancy rates are lower. Anthropic also ran cybersecurity-specific tests (developing Firefox browser exploits) and found that Sonnet 5 never completed a full working exploit, a clear gap versus Opus 4.8 and Mythos 5. Anthropic attributes that gap to Sonnet 5 not having been trained on cybersecurity tasks, rather than to a deliberate limitation.
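
To make the pricing concrete, here is a minimal Python sketch that estimates what a single agent run would cost at the launch rate ($2/$10) and at the later rate ($3/$15). The two price pairs come from the figures above. Everything else is an assumption for illustration: the `Pricing` class, the `run_cost` helper, and the token counts for the example run are hypothetical and not part of any Anthropic SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Dollar price per one million tokens (hypothetical helper type)."""
    input_per_mtok: float
    output_per_mtok: float


# Rates quoted in the article: launch pricing, then pricing after August 31.
LAUNCH = Pricing(input_per_mtok=2.0, output_per_mtok=10.0)
STANDARD = Pricing(input_per_mtok=3.0, output_per_mtok=15.0)


def run_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one run for the given token counts."""
    total = (input_tokens * pricing.input_per_mtok
             + output_tokens * pricing.output_per_mtok)
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed workload: an agent run that reads 400k tokens and writes 50k.
    input_tokens, output_tokens = 400_000, 50_000
    launch = run_cost(LAUNCH, input_tokens, output_tokens)
    standard = run_cost(STANDARD, input_tokens, output_tokens)
    print(f"launch:   ${launch:.2f}")    # launch:   $1.30
    print(f"standard: ${standard:.2f}")  # standard: $1.95
```

Because both the input and the output rate rise by the same 50 percent, any workload costs 1.5 times as much after the launch window closes, regardless of its input/output mix.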