We run a studio where AI agents work mostly unattended — they write code, ship sites, produce content, and keep going without a human in the loop. Running agents like that, around the clock, teaches you one thing fast: the bill is the product constraint. Not the model's intelligence. The bill.

Here's the most expensive lesson we paid for, and the architecture we rebuilt to stop paying it.

The 136M-token fire

One of our agents burned ~136 million tokens in a stretch where it produced almost nothing. We assumed runaway tool calls. It wasn't.

The cause was mundane and brutal: the agent was waking itself on a timer (a cron / scheduled self-invoke) into one ever-growing session. Two things compounded: