The next AI budget fight will not start because employees refuse to use AI. It will start because they finally do.

This is why the date matters. This is a June 2026 problem, not a December 2025 problem.

In May 2026, Uber became one of the first big companies to make the new problem concrete: 95% of its engineers now use AI tools every month, most of them in agent-style workflows, and an internal coding agent writes roughly 1,800 code changes a week. Uber was not playing with chatbots. It was doing exactly what every board has been demanding: get serious about AI, put the tools into real workflows, find the leverage.

Then the cost story broke. Uber’s CTO, Praveen Neppalli Naga, reportedly told people the company had blown through its entire 2026 AI budget months early. The easy read was that the tools cost too much and employees need reining in.

I think that read is incomplete. The sharper signal came from Uber’s president and COO, Andrew Macdonald, who said the company can see the usage, the commits, and the token spend, and still cannot cleanly connect any of it to better features for customers.

That is the real story, and it is bigger than Uber. The bill is the first hard evidence that AI has crossed from a tool you buy into labor you have to manage, and almost no company has built a system to manage labor it cannot see. Read correctly, token burn is not waste but information about a kind of work the company has not learned to run yet.

Where you sit decides what the bill threatens. If you own the budget, it becomes the line item that justifies a layoff you did not want to make. If you run engineering, it becomes the cap that kills the experiments that were working. If you do the work, it turns “used too much AI” into a performance problem instead of a signal that you found a job worth automating. Same invoice, three warnings, one missing system.

The companies that win this will not be the ones that spent the least or the most. Spending freely and capping hard are both easy, and both are wrong. The harder answer is the one in between, and the rest of this briefing is how you get there.

This briefing covers:

The real shape of the AI cost curve. Why the work you actually want from frontier models keeps getting more expensive even as the price per call falls, and what that does to next year’s budget.

A routing rule for every AI dollar. One principle, minimum effective intelligence, for deciding when a job needs a frontier model, an open model, or no model at all.

Why your 2025 budget model is the thing breaking. Seats and licenses cannot price work that plans, retries, and runs for hours, and a better dashboard will not save it.

The operating model that replaces the token cap. What an agent-first company actually changes: work objects, gates, permissions, and the training that turns usage into compounding advantage.

How to read your own token bill. A way to tell production from tuition from waste from the signal that you just found a workflow worth turning into infrastructure.

The argument runs in seven parts, and it ends somewhere you can use: an operating model and a routing rule you can take into your next budget conversation. The setup is free. The system is below.
Executive Briefing: 95% Adoption, and You Still Can't Prove One Token Helped
Token burn is more than a budget problem. It's what happens when frontier intelligence gets useful, open models get good, and companies try to manage agentic work with 2025 controls.














