AbbVie Data Lake AWS Architecture

AWS, Network, advanced
AbbVie Data Lake AWS Architecture — AWS network diagram

About This Architecture

AbbVie's multi-AZ data lake on AWS ingests structured data from SAP, Veeva, Fieldglass, and Compass, along with external clinical and health authority sources. Ingestion runs through Lambda, EventBridge, SQS, and Amazon MSK (Kafka), and the data lands in S3 raw buckets. From there it is transformed and queried with EMR Spark ETL, serverless AWS Glue jobs, and Redshift Spectrum; the Glue Data Catalog provides Hive-compatible metadata, and Lake Formation enforces tag-based row- and column-level security. The design spans three Availability Zones in us-east-1, with private subnets for EKS, ECS Fargate, MWAA orchestration, and SageMaker. It is secured by interface VPC endpoints, KMS multi-Region encryption, and CloudTrail audit logging. Hybrid connectivity to the legacy on-premises Cloudera environment over Transit Gateway and Site-to-Site VPN supports a gradual cloud migration while maintaining governance. Fork this diagram to customize data source connectors, add cross-region replication policies, or adapt the security posture for your regulated data environment.
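As a rough illustration of the ingestion path described above, the sketch below shows how a Lambda function consuming an SQS batch might write each message to a partitioned key in a raw bucket. The bucket name, the key layout, the message fields (`source`, `entity`, `payload`), and the injected `put_object` callable are all assumptions made for illustration; the diagram does not specify any of them.

```python
import json
from datetime import datetime, timezone
from typing import Callable, List, Optional

# Hypothetical bucket name; the description only says "S3 raw buckets".
RAW_BUCKET = "example-datalake-raw"

# Source systems named in the architecture description.
KNOWN_SOURCES = {"sap", "veeva", "fieldglass", "compass"}


def raw_key(source: str, entity: str, message_id: str, ts: datetime) -> str:
    """Build a Hive-style partitioned key (an assumed layout, not from the diagram)."""
    source = source.lower()
    if source not in KNOWN_SOURCES:
        raise ValueError(f"unknown source system: {source}")
    return f"raw/source={source}/entity={entity}/dt={ts:%Y-%m-%d}/{message_id}.json"


def handle_sqs_batch(
    event: dict,
    put_object: Callable[..., object],
    now: Optional[datetime] = None,
) -> List[str]:
    """Write each SQS record's payload to the raw bucket and return the keys written.

    `put_object` is injected so the sketch runs without AWS credentials; in a
    real Lambda it could be `boto3.client("s3").put_object`.
    """
    ts = now or datetime.now(timezone.utc)
    keys = []
    for record in event.get("Records", []):
        body = json.loads(record["body"])  # assumed JSON message body
        key = raw_key(body["source"], body["entity"], record["messageId"], ts)
        put_object(
            Bucket=RAW_BUCKET,
            Key=key,
            Body=json.dumps(body["payload"]).encode("utf-8"),
        )
        keys.append(key)
    return keys
```

Partitioning the key by source and date is one common convention that lets Glue crawlers and Spark jobs prune partitions; the actual layout in this architecture may differ.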

People also ask

How do you design a secure, multi-AZ AWS data lake that ingests from SAP and external sources while maintaining governance and hybrid on-prem connectivity?

AbbVie's architecture uses Lambda, EventBridge, and MSK Kafka to ingest data into S3 buckets across three AZs, then processes it with EMR Spark and Glue using Hive-compatible metadata. Lake Formation enforces tag-based row/column security, KMS provides multi-region encryption, and Transit Gateway bridges legacy Cloudera on-prem systems, enabling governed, scalable analytics with audit trails via CloudTrail.
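To make the tag-based security idea more concrete, here is a toy model of how a tag expression could decide which columns a principal sees. It is not the Lake Formation API, and the real service performs this enforcement itself. The tag keys (`domain`, `sensitivity`), the principals, and the column names are hypothetical, and the sketch covers only column-level tags.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Column:
    name: str
    tags: Dict[str, str]  # tag key -> value, e.g. {"sensitivity": "phi"}


@dataclass
class Grant:
    principal: str
    # Tag expression: every key must match one of its allowed values.
    expression: Dict[str, Set[str]]


def matches(tags: Dict[str, str], expression: Dict[str, Set[str]]) -> bool:
    """True when the resource carries an allowed value for every key in the expression."""
    return all(tags.get(key) in allowed for key, allowed in expression.items())


def visible_columns(principal: str, columns: List[Column], grants: List[Grant]) -> List[str]:
    """Return the names of columns that at least one of the principal's grants matches."""
    return [
        col.name
        for col in columns
        if any(g.principal == principal and matches(col.tags, g.expression) for g in grants)
    ]


# Hypothetical catalog columns and grants, for illustration only.
columns = [
    Column("patient_id", {"domain": "clinical", "sensitivity": "phi"}),
    Column("site", {"domain": "clinical", "sensitivity": "internal"}),
    Column("cost", {"domain": "finance", "sensitivity": "internal"}),
]
grants = [
    Grant("analyst", {"domain": {"clinical"}, "sensitivity": {"internal"}}),
    Grant("steward", {"domain": {"clinical", "finance"}, "sensitivity": {"internal", "phi"}}),
]
```

In this model, access is granted by attaching tags to resources and writing expressions over those tags, so newly tagged columns are covered by existing grants without per-column policy changes.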

AWS, data-lake, EMR, Lake-Formation, multi-AZ, hybrid-cloud
Domain: Cloud AWS
Audience: AWS solutions architects designing enterprise data lake platforms

Generated by Diagrams.so — AI architecture diagram generator with native Draw.io output. Fork this diagram, remix it, or download as .drawio, PNG, or SVG.

Created: June 3, 2026
Updated: June 3, 2026 at 3:22 AM
Type: network
