I've been building databridge.so by myself for a while. It's an email list cleaner that explains every decision. Most cleaners give you back "74/100, risky" and that's all you get. You can't audit a number like that. So I built one where every row in the cleaned CSV carries the actual reason it was flagged.

A few things I expected to be quick that weren't.

CSV parsing ate the most time

This is the part I'd wave away in any project brief. "Oh, we'll just use a CSV library." Then you get a real export from a real CRM and it has:

A UTF-8 BOM at the start that breaks header parsing if you don't strip it