My detector passed every synthetic test with zero false positives. Then I pointed it at one real trace and found a crack.
This is the honest version of where I am. I'm building Clew, a tool that finds the redundant loops, re-queries, and handoffs that silently burn tokens when multiple AI agents work together. No crash, no error, just two agents quietly redoing each other's work while the token bill climbs.
I build in public, and I publish the negatives. So here's the whole arc, including the part that isn't working yet.
First, I killed my own hypothesis
The original idea wasn't waste detection at all. It was failure prediction: watch the behavior between agents and forecast multi-agent failures before they happen. The differentiator was a single metric built on two signals: structural cycles in the inter-agent message graph, and the decay of novelty in embeddings.
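To make those two signals concrete, here is a minimal Python sketch. It is not Clew's code. The function names (`has_cycle`, `novelty_series`), the `(sender, receiver)` edge format, and the novelty definition (one minus the highest cosine similarity to any earlier embedding) are all assumptions for illustration. How the two signals were combined into one metric isn't specified above, so the sketch keeps them separate.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def has_cycle(edges: Sequence[Tuple[str, str]]) -> bool:
    """Return True if the directed (sender, receiver) message graph has a cycle.

    Uses a three-color depth-first search: reaching a GRAY node means we
    followed an edge back into the current DFS path, i.e. a cycle.
    """
    graph: Dict[str, Set[str]] = {}
    for sender, receiver in edges:
        graph.setdefault(sender, set()).add(receiver)
        graph.setdefault(receiver, set())

    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def novelty_series(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Novelty per step: 1 - max cosine similarity to any earlier embedding.

    The first embedding has nothing before it, so it is assigned 1.0
    (an assumption of this sketch). Values trending toward 0 mean the
    agents are repeating earlier content.
    """
    series: List[float] = []
    for i, current in enumerate(embeddings):
        if i == 0:
            series.append(1.0)
        else:
            series.append(1.0 - max(cosine(current, prev) for prev in embeddings[:i]))
    return series
```

For example, a planner and a researcher that message each other form a cycle, and a third message whose embedding matches the first one scores zero novelty. Real embeddings would come from whatever embedding model is in use; the toy 2-D vectors in the test below are placeholders.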






