I do not trust an AI-generated MVP just because it looks good at first glance.

I trust it only after I can score it.

That is the point where I stop writing bigger prompts and start running a small review loop against the output. Lately I have been doing that with NxCode, because it gets me from a rough product idea to a reviewable app structure quickly enough that the scoring pass is worth the effort.

The scoring loop

I run five checks before I let a prototype become engineering work.