Azure Databricks ETL - AS-IS vs TO-BE Architecture
About This Architecture
This Azure Databricks ETL architecture comparison shows a migration from an always-on shared cluster to ephemeral job clusters orchestrated by Azure Data Factory and Databricks Workflows. The AS-IS design runs sequential notebook activities on a persistent All-Purpose Cluster connected to ADLS Gen2 and Snowflake, so compute is billed continuously, including while no pipeline is running. The TO-BE architecture replaces this with Databricks Job Triggers and Workflow Jobs that create a Job Cluster for each run, which reduces idle time and infrastructure spend. Audit logging is preserved through dedicated Audit Start and Audit End notebooks. Fork and customize this diagram on Diagrams.so to align it with your organization's migration roadmap and cluster sizing strategy.
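As a rough illustration of the TO-BE design, the sketch below builds a job definition in the shape accepted by the Databricks Jobs API 2.1 (`job_clusters`, `tasks`, `depends_on`). It is not taken from the diagram: the notebook paths, job name, cluster key, runtime version, node type, and worker count are all placeholder assumptions, and it assumes the ETL notebooks still run in sequence between the Audit Start and Audit End notebooks. It only constructs and prints the payload and does not call any workspace.

```python
import json


def build_job_spec(etl_notebooks):
    """Build a Jobs API 2.1 style payload: audit start -> ETL notebooks -> audit end.

    All names and sizes here are hypothetical placeholders, not values from the diagram.
    """
    cluster_key = "etl_job_cluster"  # hypothetical key shared by every task
    paths = ["/ETL/audit_start", *etl_notebooks, "/ETL/audit_end"]

    tasks = []
    previous_key = None
    for path in paths:
        task_key = path.strip("/").replace("/", "_")
        task = {
            "task_key": task_key,
            "job_cluster_key": cluster_key,  # run on the per-run job cluster
            "notebook_task": {"notebook_path": path},
        }
        if previous_key is not None:
            # Chain tasks so each notebook waits for the one before it.
            task["depends_on"] = [{"task_key": previous_key}]
        tasks.append(task)
        previous_key = task_key

    return {
        "name": "etl_to_be_example",
        "job_clusters": [
            {
                "job_cluster_key": cluster_key,
                # The cluster is created for the run instead of staying on.
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # placeholder runtime
                    "node_type_id": "Standard_DS3_v2",    # placeholder VM size
                    "num_workers": 2,                     # placeholder sizing
                },
            }
        ],
        "tasks": tasks,
    }


job_spec = build_job_spec(["/ETL/load_adls", "/ETL/write_snowflake"])
print(json.dumps(job_spec, indent=2))
```

Sharing one `job_cluster_key` across tasks means the audit and ETL notebooks reuse a single cluster within a run. Whether that or one cluster per task fits better depends on the workload and is not specified by the diagram.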
People also ask
How do I reduce Azure Databricks compute costs by switching from shared clusters to ephemeral job clusters?
This diagram compares the AS-IS architecture (an always-on All-Purpose Cluster) with the TO-BE architecture (ephemeral Job Clusters). The TO-BE design uses Databricks Workflows and Job Triggers to create a cluster for each ETL run, which removes the idle cost of an always-on cluster while keeping audit logging and the data flow to ADLS Gen2 and Snowflake.
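To get a feel for the size of the idle-time reduction, a back-of-the-envelope calculation can compare cluster hours under the two designs. The figures below (4 runs per day, 45 minutes per run, 5 minutes of cluster startup, a 30-day month) are made-up inputs for illustration only, and the sketch compares hours rather than money, since hourly rates can differ between cluster types.

```python
def monthly_job_cluster_hours(runs_per_day, minutes_per_run, startup_minutes=0, days=30):
    """Cluster hours per month when a cluster exists only for the duration of each run."""
    return runs_per_day * (minutes_per_run + startup_minutes) * days / 60


DAYS = 30
always_on_hours = 24 * DAYS  # AS-IS: the cluster is up around the clock

# Hypothetical workload, not taken from the diagram.
job_hours = monthly_job_cluster_hours(
    runs_per_day=4, minutes_per_run=45, startup_minutes=5, days=DAYS
)

hours_saved_fraction = 1 - job_hours / always_on_hours

print(f"always-on hours per month:   {always_on_hours}")
print(f"job-cluster hours per month: {job_hours}")
print(f"cluster hours avoided:       {hours_saved_fraction:.1%}")
```

With these assumed inputs the job clusters run for 100 hours a month against 720 for the always-on cluster. Substituting real run counts and durations gives a first estimate before pricing is applied.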
- Domain: Data Engineering
- Audience: Data engineers optimizing Azure Databricks ETL pipelines for cost and performance
Generated by Diagrams.so, an AI architecture diagram generator with native Draw.io output.