I Fuzzed 12 LLMs With 19 Payloads — Here's What Broke

Everyone's shipping AI agents. Nobody's testing them.

I ran EXORR's prompt fuzzer — 19 payloads across 5 attack categories — against 12 popular LLM endpoints. The results were worse than I expected.

The Setup

exorr-prompt-fuzzer ships 5 attack categories out of the box: