The Friday before a long weekend, I asked an agent to migrate a legacy webhook handler while I closed my laptop. It came back with a diff that compiled; it had run the tests and left a note about a fixture it did not want to change without me. That is the shape of the work these agents are pitched for now, and it is the shape GitLab is aiming at with the arrival of Anthropic's Claude Sonnet 5 on the Duo Agent Platform.
What actually landed
GitLab has added Claude Sonnet 5 to Duo Agent Platform across all tiers and every deployment model the platform supports, routed through GitLab's AI Gateway. GitLab positions the model for the kind of work agents already carry inside a CI/CD loop: multi-step tasks, code that holds up under review, and workflows the vendor is willing to call affordable at scale.
The number GitLab wants you to notice is a benchmark result. Sonnet 5 is the first model in GitLab's own evaluation suite to complete all of its benchmark tasks. Its predecessor, Sonnet 4.6, completed 93.8% of them. Read that carefully: it is GitLab's benchmark, not yours, and a perfect score on someone else's tasks tells you how the model did on those tasks, not how it will do on your codebase.
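If you want a number of your own to set beside GitLab's, the arithmetic is not hard. What follows is a minimal sketch, not GitLab's evaluation suite: `TaskResult`, `completion_rate`, `failed_tasks`, and the task names are all hypothetical, and the results are invented for illustration. It only shows how a completion rate is computed and why the list of failures matters more than the headline percentage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    """Outcome of one agent task in a hypothetical in-house evaluation."""
    task_id: str
    completed: bool


def completion_rate(results: list[TaskResult]) -> float:
    """Return the share of completed tasks as a percentage."""
    if not results:
        raise ValueError("no results to score")
    done = sum(1 for r in results if r.completed)
    return 100.0 * done / len(results)


def failed_tasks(results: list[TaskResult]) -> list[str]:
    """Return the ids of tasks the agent did not complete."""
    return [r.task_id for r in results if not r.completed]


# Invented results for eight tasks drawn from your own backlog.
results = [
    TaskResult("migrate-webhook-handler", True),
    TaskResult("bump-ci-image", True),
    TaskResult("fix-flaky-fixture", False),
    TaskResult("add-retry-logic", True),
    TaskResult("rename-config-keys", True),
    TaskResult("split-large-module", True),
    TaskResult("update-api-client", True),
    TaskResult("write-regression-test", True),
]

print(f"{completion_rate(results):.1f}%")  # 87.5%
print(failed_tasks(results))               # ['fix-flaky-fixture']
```

The percentage is the part that ends up in a press release. The second line is the part you would act on, because one failed task out of eight means something very different depending on which task it was.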
What the AI Gateway hop actually buys you
