Dataflow Automator - Data Engineering Platform


About This Architecture

Dataflow Automator is a data engineering platform that automates multi-layer ETL pipeline creation on Azure Databricks using data contracts and file masks. The system ingests raw CSV, Parquet, and JSON files from DBFS, validates them against schemas and quality rules, then builds a Bronze-Silver-Gold medallion architecture through DLT pipelines with automated code generation. A React frontend supports pipeline creation and data discovery, while a FastAPI backend handles file scanning, contract parsing, validation, and Databricks integration via the REST API and CLI. Azure DevOps CI/CD deploys Databricks Asset Bundles and Terraform infrastructure, which keeps data workflows reproducible and version-controlled, with monitoring and quarantine handling for rejected records.
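To make the contract-and-file-mask idea concrete, here is a minimal sketch in plain Python of how files might be matched against a mask and records split into accepted and quarantined sets. The `DataContract` class, its fields, and the `matching_files` and `validate` helpers are hypothetical names chosen for illustration; the platform's actual contract format is not described here.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Any, Callable


@dataclass
class DataContract:
    """Hypothetical contract: a file mask, expected column types, and quality rules."""
    file_mask: str
    schema: dict[str, type]
    rules: dict[str, Callable[[dict[str, Any]], bool]] = field(default_factory=dict)


def matching_files(paths: list[str], contract: DataContract) -> list[str]:
    """Return the paths whose file name matches the contract's mask (case-sensitive)."""
    return [p for p in paths if fnmatchcase(p.rsplit("/", 1)[-1], contract.file_mask)]


def validate(records: list[dict[str, Any]], contract: DataContract):
    """Split records into (accepted, quarantined) using schema checks, then quality rules."""
    accepted, quarantined = [], []
    for rec in records:
        errors = []
        for col, typ in contract.schema.items():
            if col not in rec:
                errors.append(f"missing column: {col}")
            elif not isinstance(rec[col], typ):
                errors.append(f"bad type for {col}")
        # Quality rules only run on records that already satisfy the schema.
        if not errors:
            errors = [f"rule failed: {name}"
                      for name, rule in contract.rules.items() if not rule(rec)]
        if errors:
            quarantined.append({"record": rec, "errors": errors})
        else:
            accepted.append(rec)
    return accepted, quarantined
```

In this sketch a rejected record is kept together with the reasons it failed, so a quarantine store can be inspected later instead of silently losing data.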

People also ask

How do you automate multi-layer ETL pipeline creation on Azure Databricks with data contracts and DLT code generation?

Dataflow Automator automates pipeline creation by scanning DBFS files, validating them against data contracts and quality rules, then generating DLT Python code and Databricks Asset Bundles. The platform orchestrates Bronze (raw validated), Silver (cleaned), and Gold (business-ready) medallion layers, with Azure DevOps CI/CD deploying infrastructure as code and rejected records monitored in quarantine.
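As a rough illustration of what DLT code generation could look like, the sketch below renders the source of a Bronze-layer table as a string. The `generate_dlt_source` function and its parameters are hypothetical, not the platform's generator. The emitted code assumes the `dlt` Python module, Auto Loader (`cloudFiles`), and the `@dlt.expect_or_drop` decorator, which drops failing rows rather than routing them to quarantine.

```python
def generate_dlt_source(table: str, source_path: str, file_format: str,
                        expectations: dict[str, str]) -> str:
    """Render DLT Python source for a Bronze table (hypothetical generator)."""
    lines = [
        "import dlt",
        "",
        "",
        f"@dlt.table(name={table + '_bronze'!r}, comment={'Raw validated ' + table!r})",
    ]
    # One expectation decorator per quality rule: name -> SQL constraint.
    for name, constraint in expectations.items():
        lines.append(f"@dlt.expect_or_drop({name!r}, {constraint!r})")
    lines += [
        f"def {table}_bronze():",
        "    # `spark` is provided by the Databricks pipeline runtime.",
        "    return (",
        '        spark.readStream.format("cloudFiles")',
        f'        .option("cloudFiles.format", {file_format!r})',
        f"        .load({source_path!r})",
        "    )",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_dlt_source(
        table="orders",
        source_path="dbfs:/raw/orders/",  # illustrative path
        file_format="csv",
        expectations={"valid_order_id": "order_id IS NOT NULL"},
    ))
```

Values are interpolated with `repr` so that quotes inside a constraint cannot break the generated source. The output is only meant to run inside a Databricks DLT pipeline; Silver and Gold tables would be generated along the same lines.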

Azure · advanced · Azure Databricks · Data Engineering · ETL Pipeline · DLT · Medallion Architecture · CI/CD Automation
Domain: Data Engineering
Audience: Data engineers building automated ETL pipelines on Azure Databricks

Created: April 2, 2026
Updated: April 2, 2026 at 10:27 AM
Type: data pipeline