ShareBox is my self-hosted streaming server: a PHP thing I built because I just wanted to send someone a link to a movie without installing Plex and its ten gigabytes of dependencies. It runs on my seedbox and serves my users, and one morning I noticed it was starting to pick up a few stars on GitHub.

And then, that little voice: "does this thing actually hold up?" Because between "works on my machine" and "code that strangers are going to install on their own box," there's a chasm. A chasm full of flaws I can't see anymore, because I've had my nose in it for weeks.

Normally, you re-read your code. Except that when you re-read 22,000 lines alone, honestly, you do it badly: you skim over what you think you already know. So I tried something else: unleashing a pack of 26 AI agents on it, each with a precise mission, and seeing what surfaced. Spoiler: they found a flaw that had been sitting right under my eyes from the start.

26 agents to comb through my own code

The idea wasn't "AI, tell me if my code is good" — that always produces the same encouraging, useless mush. The idea was to orchestrate: split the audit into roles, run the agents in parallel, then have a final, deliberately harsh agent tear apart the conclusions.
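
To make that shape concrete, here is a minimal sketch of the fan-out / fan-in pattern in Python. It is not my actual setup: the three role names are placeholders, and `run_agent` and `run_critic` are hypothetical stubs standing in for real model calls, so the block runs as-is without any API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical roles for illustration only (assumption, not the real 26 missions).
ROLES = ["security", "performance", "error handling"]


def run_agent(role: str, code: str) -> str:
    """Stub for a real model call: returns a placeholder finding for one role."""
    return f"[{role}] reviewed {len(code.splitlines())} lines"


def run_critic(findings: list[str]) -> str:
    """Stub for the final, deliberately harsh agent, which sees every finding at once."""
    return "CRITIC: challenge each claim\n" + "\n".join(findings)


def audit(code: str, roles: list[str] = ROLES) -> str:
    # Fan out: one agent per role, all running concurrently.
    with ThreadPoolExecutor(max_workers=len(roles)) as pool:
        findings = list(pool.map(lambda role: run_agent(role, code), roles))
    # Fan in: a single critic pass over the collected findings.
    return run_critic(findings)


if __name__ == "__main__":
    print(audit("<?php\necho 1;\n"))
```

The point of the design is the last step: the specialists never see each other's output, and only the critic gets the whole picture, which is what keeps it from simply echoing them.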