Six months ago I shipped an AI quiz that matches aspiring founders to a business idea they can actually build. The matcher only works if the underlying idea library is large, fresh, and not full of slop. So I had to build the pipeline that fills it.
This post walks through the real architecture: prompt design, the validation gate, the time it silently produced zero ideas for 48 hours, and what we'd cut if we started over. If you're building anything that uses LLMs to generate structured content at scale, you'll probably hit the same walls.
The product this powers is AI Student Factory — but the pipeline is generic.
The constraint that shaped everything
The matcher quiz returns a single idea per user. If that idea is bad, the entire product is bad. So the bar wasn't "generate lots of ideas" — it was "every idea in the library has to be someone's reasonable next 6 months."








