Azure Databricks ETL - AS-IS vs TO-BE Architecture


About This Architecture

Azure Databricks ETL architecture comparison showing migration from always-on shared clusters to ephemeral job clusters orchestrated by Azure Data Factory and Databricks Workflows. The AS-IS design uses sequential notebook activities on a persistent All-Purpose Cluster connected to ADLS Gen2 and Snowflake, incurring continuous compute costs. The TO-BE architecture replaces this with Databricks Job Triggers and Workflow Jobs that auto-create Job Clusters per run, reducing idle time and infrastructure spend. This shift demonstrates best practices for cost-efficient data pipelines while maintaining audit logging through dedicated Audit Start and Audit End notebooks. Fork and customize this diagram on Diagrams.so to align with your organization's migration roadmap and cluster sizing strategy.
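The TO-BE design described above can be sketched as a Databricks Jobs API 2.1 job specification that provisions an ephemeral Job Cluster per run. The job name, notebook paths, node type, and worker count below are illustrative assumptions, not values taken from the diagram.

```python
# Sketch of a Databricks Workflows job spec (Jobs API 2.1 shape).
# The "new_cluster" under job_clusters is created when the run starts
# and terminated when it ends, so no compute sits idle between runs.
# All names, paths, and sizes are hypothetical.
job_spec = {
    "name": "etl-to-be-ephemeral",  # hypothetical job name
    "job_clusters": [
        {
            "job_cluster_key": "etl_cluster",
            "new_cluster": {
                "spark_version": "15.4.x-scala2.12",   # assumed runtime
                "node_type_id": "Standard_DS3_v2",      # assumed node type
                "num_workers": 2,                       # assumed size
            },
        }
    ],
    "tasks": [
        {
            "task_key": "audit_start",
            "job_cluster_key": "etl_cluster",
            "notebook_task": {"notebook_path": "/ETL/AuditStart"},
        },
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "audit_start"}],
            "job_cluster_key": "etl_cluster",
            "notebook_task": {"notebook_path": "/ETL/Transform"},
        },
        {
            "task_key": "audit_end",
            "depends_on": [{"task_key": "transform"}],
            "job_cluster_key": "etl_cluster",
            "notebook_task": {"notebook_path": "/ETL/AuditEnd"},
        },
    ],
}

# Tasks share one ephemeral cluster and run in dependency order,
# preserving the Audit Start / ETL / Audit End pattern.
task_order = [t["task_key"] for t in job_spec["tasks"]]
print(task_order)
```

The same spec can be submitted via the Jobs REST API or a `databricks jobs create` CLI call; the audit notebooks bracket the run so logging survives the move off the persistent cluster.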

People also ask

How do I reduce Azure Databricks compute costs by switching from shared clusters to ephemeral job clusters?

This diagram compares the AS-IS architecture (always-on All-Purpose Cluster) with the TO-BE architecture (ephemeral Job Clusters). The TO-BE design uses Databricks Workflows and Job Triggers to auto-create a cluster per ETL run, eliminating idle compute costs while preserving audit logging and the data flow to ADLS Gen2 and Snowflake.
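The idle-cost saving can be made concrete with a back-of-the-envelope comparison. All rates, DBU consumption figures, and run durations below are illustrative assumptions, not published Azure Databricks prices.

```python
# Rough monthly cost comparison: always-on All-Purpose Cluster (AS-IS)
# vs ephemeral Job Clusters (TO-BE). Every number here is an assumed,
# illustrative value -- substitute your own rates and workload profile.
DBU_RATE_ALL_PURPOSE = 0.55   # assumed $/DBU for all-purpose compute
DBU_RATE_JOBS = 0.30          # assumed $/DBU for jobs compute
DBUS_PER_HOUR = 4             # assumed DBU burn rate of the cluster

hours_per_month = 730         # AS-IS cluster runs continuously
runs_per_month = 30           # assumed one ETL run per day
hours_per_run = 2             # assumed run duration

as_is = hours_per_month * DBUS_PER_HOUR * DBU_RATE_ALL_PURPOSE
to_be = runs_per_month * hours_per_run * DBUS_PER_HOUR * DBU_RATE_JOBS

print(f"AS-IS:  ${as_is:,.2f}/month")
print(f"TO-BE:  ${to_be:,.2f}/month")
print(f"Saving: {100 * (1 - to_be / as_is):.0f}%")
```

Under these assumptions the ephemeral design pays only for the hours the pipeline actually runs, and the cheaper jobs-compute DBU rate compounds the saving.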


Azure · intermediate · Azure Databricks · ETL · Cost Optimization · Data Engineering · Databricks Workflows · Azure Data Factory
Domain: Data Engineering
Audience: Data engineers optimizing Azure Databricks ETL pipelines for cost and performance

Created by

March 10, 2026

Updated

March 10, 2026 at 12:25 PM

Type

architecture
