I kill -9'd a running AI agent mid-task. It resumed without re-spending a cent.

What crash-resumable budget enforcement looks like when the enforcement state lives in a ledger, not in memory. A walkthrough of the kill-9 -> resume demo.

mercoledì 10 giugno 2026 New tab

765 words~3 min read

The premise is simple: a long-running AI agent crashes mid-task. What should happen next?

The common answer is: your orchestrator checkpoints the agent's context, so it can resume from the last step. LangGraph does this with MemorySaver/SQLite. Temporal replays the event log. These work. For the agent's state.

But there's a second thing nobody's checkpointing: the budget envelope around the agent.

What did this run spend before it crashed? How many loops did it complete? How much wall-clock time has elapsed? If those limits live only in memory and the process dies, your safety layer vanishes with it, and a "resumed" agent is really a new agent with a fresh, reset budget.

The failure mode