We recently announced that Sentry acquired XcodeBuildMCP, the Model Context Protocol server I built to help AI agents navigate iOS development. One of the first questions we were asked was an uncomfortable one: is an MCP actually necessary? We’re engineers building developer tools for engineers, so we did what felt natural and set out to answer it empirically.
To find out, we built and ran an eval covering three LLMs, three approaches, and five coding exercises, for a total of 1,350 trials. We expected XcodeBuildMCP to dominate, but it didn’t.
All three approaches we tested hit 99%+ success. Modern models recover from errors well enough that finishing the task is basically guaranteed. What surprised us was where the real differences showed up: time, cost, and how each approach spends its context budget.
The Context Paradox
MCP tools inject schemas, descriptions, and boilerplate into context before the agent does anything. That’s useful for tool access, but context isn’t free.
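To get a feel for that overhead, here is a minimal sketch that estimates how many tokens a set of tool definitions costs before the agent takes its first step. The tool names and descriptions are hypothetical, not XcodeBuildMCP's real tools; they only follow the general shape MCP servers advertise (a name, a description, and a JSON Schema for inputs). The four-characters-per-token ratio is a rough assumption, and real tokenizers will give different numbers.

```python
import json

# Hypothetical tool definitions, shaped like the entries an MCP server
# advertises (name, description, inputSchema). Illustrative only.
TOOLS = [
    {
        "name": "example_build",
        "description": "Build an app scheme for a simulator.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "scheme": {"type": "string", "description": "Scheme to build."},
                "simulator": {"type": "string", "description": "Simulator name."},
            },
            "required": ["scheme"],
        },
    },
    {
        "name": "example_test",
        "description": "Run the unit tests for a scheme.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "scheme": {"type": "string", "description": "Scheme to test."},
            },
            "required": ["scheme"],
        },
    },
]

# Assumption: roughly 4 characters per token. Real tokenizers vary.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def schema_overhead(tools: list) -> int:
    """Sum the estimated tokens for each tool definition, serialized compactly."""
    return sum(
        estimate_tokens(json.dumps(tool, separators=(",", ":")))
        for tool in tools
    )


if __name__ == "__main__":
    total = schema_overhead(TOOLS)
    print(f"{len(TOOLS)} tools cost roughly {total} tokens before the first step")
```

The estimate grows linearly with the number of tools, so a server exposing dozens of them pays that cost on every session whether or not the agent ever calls them.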