About This Architecture
Dataflow Automator is a modern data engineering platform that automates multi-layer ETL pipeline creation on Azure Databricks using data contracts and file masks. The system ingests raw CSV, Parquet, and JSON files from DBFS, validates them against schemas and quality rules, then orchestrates a Bronze-Silver-Gold medallion architecture through DLT pipelines with automated code generation. A React frontend enables pipeline creation and data discovery, while a FastAPI backend manages file scanning, contract parsing, validation, and Databricks integration via the Databricks REST API and CLI. Azure DevOps CI/CD deploys Databricks Asset Bundles and Terraform infrastructure, ensuring reproducible, version-controlled data workflows with full monitoring and quarantine handling for rejected records.
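The core ingestion flow above (match files by mask, validate rows against a contract, quarantine rejects) can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the contract structure, field names, and file mask here are all hypothetical assumptions.

```python
import csv
import fnmatch
import io

# Hypothetical data contract; the real contract schema used by the
# platform may differ. "rules" maps a column to a quality predicate.
CONTRACT = {
    "file_mask": "sales_*.csv",
    "columns": ["order_id", "amount"],
    "rules": {"amount": lambda v: float(v) >= 0},  # amount must be non-negative
}

def matches_mask(filename: str, mask: str) -> bool:
    """Decide whether a file is in scope for this contract."""
    return fnmatch.fnmatch(filename, mask)

def validate(raw_csv: str, contract: dict):
    """Split rows into accepted and quarantined sets per the contract.

    Raises on a schema mismatch; individual rule failures are routed
    to quarantine rather than failing the whole batch.
    """
    reader = csv.DictReader(io.StringIO(raw_csv))
    if reader.fieldnames != contract["columns"]:
        raise ValueError(f"schema mismatch: {reader.fieldnames}")
    accepted, quarantined = [], []
    for row in reader:
        ok = all(rule(row[col]) for col, rule in contract["rules"].items())
        (accepted if ok else quarantined).append(row)
    return accepted, quarantined

raw = "order_id,amount\n1,19.99\n2,-5.00\n"
assert matches_mask("sales_2024.csv", CONTRACT["file_mask"])
accepted, quarantined = validate(raw, CONTRACT)
print(len(accepted), len(quarantined))  # → 1 1
```

In the real system this separation happens inside DLT pipelines (where expectations can route failing records to a quarantine target); the sketch only shows the contract-driven accept/quarantine decision in isolation.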