Anthropic has released Claude Sonnet 5. In benchmarks, it closes in on the larger Opus 4.8 and even beats it in some areas. The model is available now at an introductory price.

Anthropic calls it the most agentic Sonnet yet: it can build plans, grab tools like browsers and terminals, and work on its own at a level that just months ago only bigger, pricier models could pull off, according to the company. Sonnet 5 is meant to close that gap.

Benchmarks show a clear jump over Sonnet 4.6

Anthropic's published benchmarks show Sonnet 5 beating its predecessor, Sonnet 4.6, in every tested category while gaining ground on the pricier Opus 4.8. On agentic coding, Sonnet 5 hits 63.2 percent on SWE-bench Pro, up from 58.1 percent for Sonnet 4.6; Opus 4.8 sits at 69.2 percent. On Terminal-Bench 2.1, Sonnet 5 scores 80.4 percent versus 67.0 percent for Sonnet 4.6. For multidisciplinary reasoning (Humanity's Last Exam), the model reaches 57.4 percent with tools, nearly matching Opus 4.8 at 57.9 percent. On computer use (OSWorld-Verified), Sonnet 5 posts 81.2 percent compared to 78.5 percent for its predecessor.

Benchmark comparison of Sonnet 5, Sonnet 4.6, and Opus 4.8. On knowledge work (GDPval-AA v2), Sonnet 5 even edges past Opus 4.8 with 1,618 points versus 1,615. | Image: Anthropic