Scarab Field Test: Repairing an AI-Generated App Without Guessing Its Intended Baseline
I’ve been building Scarab Diagnostic Suite around a problem I keep seeing in AI-assisted development: the app may look close, the code may be mostly there, and some checks may even pass, but the repo still isn’t in a trustworthy state.
So I tested Scarab against a public GitHub repo whose owner was explicitly asking for help with an AI-generated web app. The app had been built through a generated/vibe-coded workflow, and the owner wanted help cleaning it up, fixing broken behavior, and making it more stable.
The interesting part wasn’t just “can the code be fixed?”
The interesting part was: what does “fixed” mean for this repo?









