What my leak scanner catches — and the exact line where it stops

I build a small open-source tool (rojaprove) that checks whether an AI app leaks its hidden instructions. This week I spent time finding where it fails, on purpose, so I can tell you the boundary honestly instead of letting a green checkmark imply more than it should.

Here's the short version, and then the detail.

How it works (plain language)

You plant a "canary" — a secret string that should never show up in normal output. Think of it like a marked bill: you write down the serial number, and if that exact number ever turns up somewhere it shouldn't, you know it leaked. The tool sends attack-style prompts to your app, then checks the responses for that exact string. If the canary appears, that's a leak. If not, it passes.

The strength: it's a plain text match, so the verdict is certain and repeatable. No AI guessing whether something "looks risky." The string is there, or it isn't.

Here's the short version, and then the detail.

How it works (plain language)

The strength: it's a plain text match, so the verdict is certain and repeatable. No AI guessing whether something "looks risky." The string is there, or it isn't.

What my leak scanner catches — and the exact line where it stops

What my leak scanner catches — and the exact line where it stops

Related reading

leakproof: stop your AI coding tool from leaking secrets to the cloud (local,…

I tested 5 LLMs for prompt-injection leaks. Same code, 0% to 90%.

I Built a Secret Scanner That Checks Your Git History, Not Just Your Code

Your AI-tool usage is invisible. Here are 4 tiny local tools to see it.

My AI-agent waste detector scored zero false positives. Then I ran it on a real…

Title: I built an AI agent firewall – blocks Aadhaar/PAN/jailbreaks before your…

Related reading

leakproof: stop your AI coding tool from leaking secrets to the cloud (local,…

I tested 5 LLMs for prompt-injection leaks. Same code, 0% to 90%.

I Built a Secret Scanner That Checks Your Git History, Not Just Your Code

Your AI-tool usage is invisible. Here are 4 tiny local tools to see it.

My AI-agent waste detector scored zero false positives. Then I ran it on a real…

Title: I built an AI agent firewall – blocks Aadhaar/PAN/jailbreaks before your…