In a single-agent system, failure is simple: the agent errors, you retry.

In multi-agent systems, failure is a graph problem.
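One way to picture the graph framing is to treat each agent as a node and each dependency as an edge, then ask which agents sit downstream of a failure. The sketch below is illustrative only: the agent names, the `DEPENDS_ON` map, and the `affected_by` helper are hypothetical and do not come from any particular framework.

```python
from collections import deque

# Hypothetical dependency map: each agent lists the agents it depends on.
DEPENDS_ON = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["A"],
}


def affected_by(failed, depends_on):
    """Return the set of agents that transitively depend on `failed`."""
    # Invert the map so we can walk from an agent to its direct dependents.
    dependents = {}
    for agent, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(agent)

    # Breadth-first walk over dependents, starting from the failed agent.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return affected


print(sorted(affected_by("B", DEPENDS_ON)))  # ['C']
print(sorted(affected_by("A", DEPENDS_ON)))  # ['B', 'C', 'D']
```

In this toy graph, a failure in B reaches only C, while a failure in A reaches every other agent. A retry policy that looks at one agent in isolation cannot see that difference.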

The Cascade Failure Problem

Agent A: ✅ Success

Agent B: ❌ Timeout (depends on A)