My AI-agent waste detector scored zero false positives. Then I ran it on a real trace.

My detector passed every synthetic test with zero false positives. Then I pointed it at one real trace and found a crack.

This is the honest version of where I am. I'm building Clew — a tool that finds the redundant loops, re-queries, and handoffs that silently burn tokens when multiple AI agents work together. No crash, no error, just two agents quietly re-doing each other's work while the token bill climbs.

I build in public, and I publish the negatives. So here's the whole arc, including the part that isn't working yet.

First, I killed my own hypothesis

The original idea wasn't waste detection at all. It was failure prediction: watch the behavior between agents and forecast multi-agent failures before they happen. The differentiator was a single metric built on two signals — structural cycles in the inter-agent message graph, and the decay of novelty in embeddings.

My AI-agent waste detector scored zero false positives. Then I ran it on a real trace.

Other newsrooms on this story

Related reading

AI agents don't crash. They fail silently. Here's how to catch it in Claude…

Your AI Agent Is a Black Box—Until OpenTelemetry & SigNoz Step In

I built TraceGate because my AI agent demo passed, but the traces told a…

"Loop Engineering: How I Stopped My AI Agent From Reward-Hacking Its Own…

Detect AI Agent Hallucinations: Zero-Shot Methods

I built a tool to catch AI coding agents misbehaving — and put zero AI in it