BAGEN: LLM Agents Waste 44% of Tokens on Tasks They'll Fail

You're paying for every token your agent burns. And according to new research from Northwestern, Stanford, Cornell, and All Hands AI, a large share of that spend is wasted on trajectories the agent was never going to complete.

The paper is BAGEN: Are LLM Agents Budget-Aware? (arXiv:2606.00198, submitted May 29, 2026). Its core question is simple: can frontier LLM agents predict when they're about to run out of runway? The answer, across five frontier models and four environments, is a firm no.

This article covers what BAGEN found and how budget-aware interval estimation works, and it includes an Effloow Lab PoC that reproduces the key dynamics using only the Python stdlib: no API keys, no GPU.

Why This Matters for Production Agent Systems

Token budgets are a real constraint in every deployed agent system. You set a max_tokens limit, you watch the cost dashboard, and you assume the agent will either finish the task or hit the hard wall. What BAGEN documents is a third case that developers rarely account for: the agent continues consuming tokens on a task it cannot complete, all the way to the limit.
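
To make that third case concrete, here is a minimal toy simulation. It is not the paper's methodology and not the Effloow Lab PoC, and it does not attempt to reproduce the 44% figure. Every name and number in it (`run_episode`, `wasted_share`, `p_solvable`, `p_finish_per_step`, the 4,000-token cap) is an assumption made for illustration. The sketch only shows how tokens spent on failed trajectories add up when an agent has no way to stop early.

```python
import random
from dataclasses import dataclass


@dataclass
class Trajectory:
    tokens_used: int
    succeeded: bool


def run_episode(rng, max_tokens, tokens_per_step, p_solvable, p_finish_per_step):
    """Simulate one toy agent episode under a hard token cap.

    Hypothetical model: a task is either solvable or not. A solvable task
    finishes at each step with probability p_finish_per_step. An unsolvable
    task never finishes, so the agent burns tokens until it hits the cap.
    """
    solvable = rng.random() < p_solvable
    used = 0
    while used + tokens_per_step <= max_tokens:
        used += tokens_per_step
        if solvable and rng.random() < p_finish_per_step:
            return Trajectory(used, True)
    # Either the task was unsolvable or the agent ran out of budget first.
    return Trajectory(used, False)


def wasted_share(trajectories):
    """Fraction of all tokens that were spent on failed trajectories."""
    total = sum(t.tokens_used for t in trajectories)
    wasted = sum(t.tokens_used for t in trajectories if not t.succeeded)
    return wasted / total if total else 0.0


# Illustrative parameters only; they are not taken from the paper.
rng = random.Random(0)
runs = [
    run_episode(rng, max_tokens=4000, tokens_per_step=200,
                p_solvable=0.6, p_finish_per_step=0.25)
    for _ in range(1000)
]
print(f"wasted share: {wasted_share(runs):.0%}")
```

In this toy model, failed runs always cost the full budget while successful runs usually stop early, so failures account for a larger share of tokens than of tasks. That asymmetry is why an agent that cannot tell when it is going to fail is expensive.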
