I didn't. For weeks, I ran Claude Code sessions that cost 30K to 100K tokens without checking. Some were deep architectural work that justified every token. Others were "rename this file" requests that loaded my entire 1,200-line personality config before doing 15 seconds of work.
The problem isn't just that AI agents cost money. It's that we don't know when to act on cost data — and we don't even have the data to make that call.
So I built a metabolic layer. Not a dashboard. Not an enterprise observability platform. Just 15 lines of Python embedded in the Stop hook that already runs at the end of every session.
The harder part wasn't the code. It was figuring out when the data was actually telling me something.
When does data become a decision?






