Batch and Streaming Data Platform Flow
About This Architecture
A hybrid batch and streaming data platform on Azure Databricks ingests JSON data through Service Bus and Event Hub, landing raw payloads in Blob Storage before transformation. Batch jobs run on a schedule in Databricks, while streaming workflows use Auto Loader and Event Grid to detect new files and process them continuously into Bronze and Silver Delta tables. Both flows apply auto-flatten logic that normalizes nested JSON to the first level, so downstream consumers read one unified shape. Fork this diagram to customize ingestion schedules or schema validation rules, or to add Gold layer aggregations. The architecture demonstrates an event-driven medallion pattern on Azure that serves real-time and batch workloads on a single platform.
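The diagram does not spell out how the auto-flatten step works, so the following is only a minimal sketch of one plausible reading: every nested object field is lifted to a top-level key, with the parent and child names joined by an underscore. The function name flatten_to_first_level, the separator, and the sample record are all assumptions made for illustration. The sketch uses plain Python dictionaries rather than Spark DataFrames so that it runs anywhere.

```python
from typing import Any, Dict


def flatten_to_first_level(record: Dict[str, Any], sep: str = "_") -> Dict[str, Any]:
    """Lift nested object fields to top-level keys (hypothetical helper).

    Assumption: arrays and scalars are kept as-is; only nested objects
    are expanded. An empty nested object contributes no keys.
    """
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Recurse, then prefix each child key with the parent key.
            for sub_key, sub_value in flatten_to_first_level(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


# Hypothetical raw payload, standing in for a Bronze record.
raw = {
    "id": 7,
    "device": {"name": "pump-1", "location": {"site": "north"}},
    "tags": ["a", "b"],
}
print(flatten_to_first_level(raw))
# {'id': 7, 'device_name': 'pump-1', 'device_location_site': 'north', 'tags': ['a', 'b']}
```

One design caveat: joining names with a separator can produce collisions (for example, a top-level key device_name next to a nested device.name), so a real pipeline would likely need a rule for resolving them.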
People also ask
How do I build a hybrid batch and streaming data pipeline on Azure Databricks?
This diagram shows a medallion architecture in which batch jobs ingest JSON from Azure Service Bus on a schedule, while streaming workflows continuously receive events from Event Hub. Both flows land raw data in Blob Storage. Event Grid and Databricks Auto Loader then detect new files and load them into Bronze Delta tables, which are auto-flattened into Silver tables for downstream consumption.
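The file-detection step described above could look roughly like the sketch below. It assumes Auto Loader runs in file notification mode, which on Azure relies on Event Grid and a storage queue. The helper names autoloader_options and start_bronze_stream, and all path and table arguments, are hypothetical and not taken from the diagram. The Azure credential options that notification mode normally needs are omitted. The second function only runs on a Databricks runtime, where spark is the active SparkSession.

```python
from typing import Dict


def autoloader_options(schema_location: str, use_notifications: bool = True) -> Dict[str, str]:
    """Build Auto Loader reader options for JSON files (hypothetical helper)."""
    return {
        "cloudFiles.format": "json",
        # File notification mode instead of directory listing.
        "cloudFiles.useNotifications": str(use_notifications).lower(),
        # Where Auto Loader stores the inferred schema.
        "cloudFiles.schemaLocation": schema_location,
    }


def start_bronze_stream(spark, source_path: str, checkpoint_path: str, bronze_table: str):
    """Start a stream that appends newly landed files to a Bronze Delta table.

    Assumption: the checkpoint path is reused as the schema location.
    """
    raw_files = (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options(schema_location=checkpoint_path))
        .load(source_path)
    )
    return (
        raw_files.writeStream
        .option("checkpointLocation", checkpoint_path)
        .toTable(bronze_table)
    )
```

Keeping the options in a plain dictionary makes the configuration easy to inspect and unit-test without a cluster. The same reader could serve the scheduled batch flow by adding an available-now trigger to the writer, although the diagram does not say whether the batch path is built that way.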
- Domain: Data Engineering
- Audience: Data engineers building hybrid batch-streaming pipelines on Azure Databricks
Generated by Diagrams.so, an AI architecture diagram generator with native Draw.io output.