A few months ago I spent the better part of a day chasing a bug that turned out not to be a bug at all. A downstream dashboard showed revenue had jumped 30% overnight. No deploys, no schema changes, nothing in the logs. After far too long I found it: an upstream system had started sending a total column that no longer equaled subtotal + tax. The pipeline didn't crash. The data just lied, quietly, and everything downstream believed it.
That's the thing about data bugs. They rarely throw exceptions. A status field grows a new typo'd value. A join key starts producing orphans. A nullable column that was "never actually null in practice" suddenly is. None of it crashes anything — it just rots the numbers people make decisions on.
So I built DataPact: a small framework for writing down what your data is supposed to look like, and then enforcing it. It's a data quality and data-contract validation tool, and the whole thing runs on the Python standard library. No pandas, no PyYAML, no network calls.
Live demo report: https://hajirufai.github.io/datapact/report.html
Landing page: https://hajirufai.github.io/datapact/







