AWS Data Lake Medallion Architecture
About This Architecture
The medallion architecture on AWS implements a three-tier data lake pattern using S3 buckets for the Bronze (raw), Silver (cleaned), and Gold (curated) layers. Streaming data from Kinesis Data Streams and database records replicated by DMS land in the Bronze S3 bucket, where Glue Crawlers catalog schemas and Glue ETL Jobs progressively refine the data through Silver to Gold. Lake Formation governs access across all layers while IAM and KMS enforce security, and Athena, QuickSight, Redshift, and SageMaker consume the curated Gold data for analytics and ML. This architecture addresses the challenge of managing data quality and governance at scale, and lets data teams trace lineage from raw ingestion to production-ready datasets. Fork this diagram on Diagrams.so to customize bucket naming conventions, add AWS Glue DataBrew for no-code transformations, or integrate EventBridge for orchestration triggers.
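To make the Bronze to Silver to Gold refinement concrete, here is a minimal in-memory sketch in plain Python. It is only a stand-in for what Glue ETL Jobs would do against S3; the record fields (`order_id`, `customer`, `amount`) and the cleaning rules are assumptions for illustration and do not come from the diagram.

```python
from collections import defaultdict

# Hypothetical raw events as they might land in the Bronze layer.
BRONZE = [
    {"order_id": "1", "customer": " Alice ", "amount": "20.50"},
    {"order_id": "1", "customer": " Alice ", "amount": "20.50"},     # duplicate
    {"order_id": "2", "customer": "bob", "amount": "not-a-number"},  # malformed
    {"order_id": "3", "customer": "Bob", "amount": "5.00"},
    {"order_id": "4", "customer": "alice", "amount": "4.50"},
]


def to_silver(bronze_rows):
    """Bronze -> Silver: drop malformed rows, normalize fields, dedupe by order_id."""
    seen = set()
    silver = []
    for row in bronze_rows:
        try:
            amount = float(row["amount"])
        except (KeyError, ValueError):
            continue  # skip rows whose amount is missing or not numeric
        order_id = row.get("order_id")
        if order_id is None or order_id in seen:
            continue  # skip rows without a key and repeated keys
        seen.add(order_id)
        silver.append({
            "order_id": order_id,
            "customer": row.get("customer", "").strip().lower(),
            "amount": amount,
        })
    return silver


def to_gold(silver_rows):
    """Silver -> Gold: aggregate cleaned rows into per-customer totals."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["customer"]] += row["amount"]
    return {customer: round(total, 2) for customer, total in totals.items()}


silver = to_silver(BRONZE)
gold = to_gold(silver)
print(gold)
```

Each layer is derived only from the one before it, so a Gold figure can be traced back through Silver to the raw Bronze records that produced it.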
People also ask
How do I design a medallion architecture data lake on AWS with proper governance and security?
Implement a three-tier medallion architecture using S3 buckets for Bronze (raw), Silver (cleaned), and Gold (curated) layers, with AWS Glue ETL Jobs transforming data between tiers, Lake Formation enforcing fine-grained access control, and IAM/KMS securing each layer. This diagram shows the complete flow from Kinesis/DMS ingestion through Athena/QuickSight/Redshift/SageMaker consumption.
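As one illustration of securing each layer with KMS, the sketch below builds an S3 bucket policy per layer that denies uploads not requesting SSE-KMS encryption. The bucket names are hypothetical placeholders, and this covers only one piece of the security picture; Lake Formation grants and IAM roles would be configured separately.

```python
import json

# Hypothetical bucket names; substitute your own naming convention.
LAYER_BUCKETS = {
    "bronze": "example-lake-bronze",
    "silver": "example-lake-silver",
    "gold": "example-lake-gold",
}


def require_kms_policy(bucket):
    """Return a bucket policy that denies PutObject requests without SSE-KMS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            }
        ],
    }


# One policy document per medallion layer.
policies = {layer: require_kms_policy(name) for layer, name in LAYER_BUCKETS.items()}
print(json.dumps(policies["bronze"], indent=2))
```

Each document could then be attached with boto3's `put_bucket_policy(Bucket=..., Policy=json.dumps(policy))`. Note that a policy of this shape also rejects uploads that omit the encryption header entirely, so writers such as Glue jobs would need to request SSE-KMS explicitly.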
- Domain: Data Engineering
- Audience: Data engineers building scalable data lake architectures on AWS
Generated by Diagrams.so, an AI architecture diagram generator with native Draw.io output.