I Audited My Own Open-Source Project With 26 AI Agents (and Found a Real Vulnerability)

ShareBox is my self-hosted streaming server: a PHP thing I built because I just wanted to send someone a link to a movie without installing Plex and its ten gigabytes of dependencies. It runs on my seedbox, serves my users, and one morning I notice it's starting to pick up a few stars on GitHub.

And then, that little voice: "does this thing actually hold up?" Because between "works on my machine" and "code that strangers are going to install on their own box," there's a chasm. A chasm full of flaws I can't see anymore, because I've had my nose in it for weeks.

Normally, you re-read your code. Except re-reading 22,000 lines alone, honestly, you do it badly: you skim over what you think you already know. So I tried something else — unleashing a pack of 26 AI agents on it, each with a precise mission, and seeing what surfaced. Spoiler: they found a flaw that had been sitting right under my eyes from the start.

26 agents to comb through my own code

The idea wasn't "AI, tell me if my code is good" — that always produces the same encouraging, useless mush. The idea was to orchestrate: split the audit into roles, run the agents in parallel, then have a final, deliberately harsh agent tear apart the conclusions.

I Audited My Own Open-Source Project With 26 AI Agents (and Found a Real Vulnerability)

Other newsrooms on this story

Related reading

Millions of AI agents imperiled by critical vulnerability in open source package

Building a Side Project with AI Pair Programming: Lessons Learned with Sharebox

My Self-Hosted AI Assistant Kept Overwriting Its Own API Keys: A Zero-Trust…

I filed a critical bug against my own tool. Then I read the code — and my own…

AI found 300 WordPress plugin zero-days in 72 hours. I build plugins. Here's…

Hardening my own Nmap web UI: the security holes I shipped, and what actually…