At the AI Engineer Summit 2025 in New York, the mantra repeated on stage after stage was five words: capability does not mean reliability. Speakers from finance, infrastructure, and consumer products converged on the same point: shipping an agent that demos well is now a solved problem, and shipping one that survives a Tuesday in production is not.

The data backs the room. LangChain's State of Agent Engineering report found that 89 percent of organizations running agents in production have had to add observability that their framework did not give them. Sixty-two percent had to build detailed tracing for individual agent steps. Honeycomb's O11yCon 2026 was themed, in full, as the observability conference for the agent era. These are three angles on the same pattern: teams that took an agent to production had to build half an orchestrator on top of their framework.

The pattern has a name now. Last month, Kaxil Naik and Pavan Kumar Gopidesu shipped the Common AI Provider for Apache Airflow 3 with a sentence that articulates what hundreds of teams had been intuiting: "Not a wrapper around another framework, but a provider package that plugs into the orchestrator you already run." Both work at Astronomer, the commercial backer of Airflow, which is worth naming up front. The sentence is a diagnosis whether it came from Astronomer or anyone else.