A Decision Framework for ETL Migration to Databricks

How to choose between Lakehouse, Spark Declarative Pipelines, or PySpark, and when to combine them

by Rafael Aielo

Your team has hundreds of stored procedures, a couple of schedulers, permissions scattered across roles and schemas, and a cloud data warehouse renewal deadline coming up. Nobody agrees on what to move first. Some want to rewrite everything in PySpark. Others want to move SQL as-is and call it done. Lost in the conversation: the metadata, lineage, and permissions that move with the code, plus the opportunity to consolidate them on the way.

Neither extreme works. The teams that succeed at data warehouse migration look at each workload individually and pick the right tool for the job. This post proposes a decision framework for making that choice: when to use Lakehouse (Databricks SQL), Spark Declarative Pipelines, or PySpark, and how to phase the work so you ship results instead of stalling on a plan.

On Databricks, you can migrate ETL pipelines in three primary ways, often used together.
