Read-Write ETL on NAS Data with EMR Serverless Spark — No Cluster, No Copy

Tuesday, May 26, 2026

TL;DR

In Part 1, Athena provided serverless read-only SQL. In Part 2, Databricks hit session policy boundaries. In Part 3, Snowflake worked with additional configuration. In Part 4, DuckDB Lambda delivered the cheapest path. Part 5 shows the full-power Spark ETL path with write-back.

EMR Serverless Spark can read, transform, and write back Parquet files on FSx for ONTAP via S3 Access Points. Spark execution took 16 seconds for a full ETL pipeline (read → aggregate → window → write); the job total, including cold start, was 37 seconds. Cost: ~$0.05 per job.

No cluster to manage. No data to copy. No idle cost.
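The read → aggregate → window → write flow described above could look roughly like the following PySpark sketch. It is an illustration under stated assumptions, not the pipeline that was actually measured: the access point alias, the prefixes, and the column names (category, amount) are hypothetical placeholders, and the helper simply relies on the fact that an S3 Access Point alias can be used where a bucket name is expected.

```python
def access_point_uri(alias: str, prefix: str) -> str:
    """Build an s3:// URI that uses an S3 Access Point alias in place of a bucket name."""
    return f"s3://{alias}/{prefix.strip('/')}/"


def run_etl(spark, source_uri: str, target_uri: str) -> None:
    """Read Parquet, aggregate, rank with a window function, and write Parquet back."""
    # Imported lazily so the URI helper above works without PySpark installed.
    from pyspark.sql import Window, functions as F

    # Read: Parquet files exposed through the access point.
    events = spark.read.parquet(source_uri)

    # Aggregate: totals per category (hypothetical column names).
    totals = events.groupBy("category").agg(
        F.sum("amount").alias("total_amount"),
        F.count("*").alias("row_count"),
    )

    # Window: rank categories by total; a single global window is
    # acceptable here because the aggregated result is small.
    by_total = Window.orderBy(F.desc("total_amount"))
    ranked = totals.withColumn("rank", F.rank().over(by_total))

    # Write: results go back to the same volume via the access point.
    ranked.write.mode("overwrite").parquet(target_uri)
```

In a job script, this would typically be driven by creating a session with SparkSession.builder.getOrCreate() and calling run_etl with two URIs built by access_point_uri, for example one for an input prefix and one for an output prefix on the same access point.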

Quick Decision Guide:
