Modern Data Stack Migration — Day 1: Scaling to 8+ Companies with DRY Architecture and Chasing a $2M Discrepancy

Hello everyone! Following up on my previous post, Day 1 of my Modern Data Stack migration was an absolute rollercoaster of refactoring and deep data auditing.

I’m moving our legacy system (spreadsheets and Qlik) into a robust pipeline using Python, ClickHouse, and dbt. Here is what went down over the last 24 hours.

1. From Messy Scripts to a Single, Parameterized Extraction Engine 🛠️

In the legacy setup, each company had its own folder, its own .env file, and its own duplicated Python extraction script. It was a maintenance nightmare.

Yesterday, I completely refactored this structure:

Modern Data Stack Migration — Day 1: Scaling to 8+ Companies with DRY Architecture and Chasing a $2M Discrepancy

Related reading

Day 82 of Learning MERN Stack

The Next Decade of Data Engineering: From Modern Data Stack to Data Engineering…

A Weekend of Testing

Day 83 of Learning MERN Stack

The Modern Data Stack Is Broken — Here’s How to Fix It With AI, Governance, and…

Migrating Data Ingestion Systems at Meta Scale