I tested 5 LLMs for prompt-injection leaks. Same code, 0% to 90%.

I built a scanner that fires prompt-injection probes at a self-hosted AI agent and checks whether it leaks (a) real secret-shaped strings (API keys) or (b) the content of its own system prompt. Then I ran the same agent across 5 model backends. The leak rate ranged from 0% to 90% depending only on the model.

Here's what I found and how it works.

Why this matters now

Prompt injection is #1 on the OWASP 2025 LLM Top 10. It's not theoretical anymore:

EchoLeak (CVE-2025-32711, CVSS 9.3) — a zero-click flaw in Microsoft 365 Copilot. One crafted email could exfiltrate internal files and API keys with no user interaction. Notably, the payload bypassed Microsoft's prompt-injection classifier by reading like ordinary business text.

Here's what I found and how it works.

Why this matters now

Prompt injection is #1 on the OWASP 2025 LLM Top 10. It's not theoretical anymore:

I tested 5 LLMs for prompt-injection leaks. Same code, 0% to 90%.

I tested 5 LLMs for prompt-injection leaks. Same code, 0% to 90%.

Other newsrooms on this story

Related reading

Prompt Injection in 2026: Still OWASP's Number One LLM Vulnerability

Prompt injection disclosures: 4 labs compared

I Built a Prompt Injection Detector with 98% Recall on Unseen Attacks. Here's…

AI Prompt Injection Defense: Building Effective Strategies in 5 Steps

Indirect Prompt Injection remains a fundamental security challenge for AI |…

Three prompt injection stories from this week that your guardrail probably…

Other newsrooms on this story

Related reading

Prompt Injection in 2026: Still OWASP's Number One LLM Vulnerability

Prompt injection disclosures: 4 labs compared

I Built a Prompt Injection Detector with 98% Recall on Unseen Attacks. Here's…

AI Prompt Injection Defense: Building Effective Strategies in 5 Steps

Indirect Prompt Injection remains a fundamental security challenge for AI |…

Three prompt injection stories from this week that your guardrail probably…