What Is Cloud Architecture?

Cloud architecture defines how compute, storage, networking, and security services combine to deliver applications at scale. This guide covers the components, patterns, and provider differences that shape real infrastructure.

Defining cloud architecture beyond the marketing

Cloud architecture is the design of a system's components, services, and their interactions when deployed on a cloud provider's infrastructure. In practice, it's the set of decisions about which managed services to use, how they connect, where data lives, how traffic routes, and what happens when something fails. It's not a single diagram. It's the collection of design choices that determine your system's reliability, cost, performance, and security posture.

Every cloud architecture starts with a fundamental choice of service model: IaaS, PaaS, or serverless. IaaS gives you virtual machines and raw storage, like EC2 instances and EBS volumes on AWS. You manage the OS, runtime, and application. PaaS hides the servers behind a deployment platform. AWS Elastic Beanstalk, Azure App Service, and Google App Engine deploy your code without you provisioning servers. Serverless goes further, billing per invocation and execution time rather than per provisioned hour.

The choice cascades through every subsequent decision. Pick IaaS and you're responsible for patching, scaling, and monitoring at the OS level. Pick serverless and you trade that operational burden for constraints on execution duration, memory limits, and cold start latency.

A well-designed cloud architecture looks different depending on the workload. A real-time bidding platform serving 500,000 requests per second has different requirements than a batch analytics pipeline processing 10TB nightly. The architecture must match the workload's latency, throughput, durability, and cost constraints.
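
The per-hour versus per-invocation trade-off described above can be made concrete with a little arithmetic. The following is a minimal sketch, not a pricing calculator: the `Pricing` class, the function names, and every price in `EXAMPLE` are illustrative assumptions, and real rates vary by provider, region, and instance size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical unit prices; substitute your provider's real rates."""
    vm_per_hour: float              # always-on instance, billed per hour
    fn_per_million_requests: float  # serverless request charge
    fn_per_gb_second: float         # serverless compute charge


def vm_monthly_cost(p: Pricing, instances: int, hours: float = 730.0) -> float:
    """Cost of running `instances` VMs for a whole month, busy or idle."""
    return p.vm_per_hour * instances * hours


def serverless_monthly_cost(p: Pricing, requests: float,
                            avg_seconds: float, memory_gb: float) -> float:
    """Cost that grows only with the number of invocations and their duration."""
    request_cost = requests / 1_000_000 * p.fn_per_million_requests
    compute_cost = requests * avg_seconds * memory_gb * p.fn_per_gb_second
    return request_cost + compute_cost


def break_even_requests(p: Pricing, instances: int,
                        avg_seconds: float, memory_gb: float) -> float:
    """Monthly request count at which both models cost the same."""
    per_request = (p.fn_per_million_requests / 1_000_000
                   + avg_seconds * memory_gb * p.fn_per_gb_second)
    return vm_monthly_cost(p, instances) / per_request


# Placeholder numbers for illustration only.
EXAMPLE = Pricing(vm_per_hour=0.04,
                  fn_per_million_requests=0.20,
                  fn_per_gb_second=0.0000166667)

if __name__ == "__main__":
    n = break_even_requests(EXAMPLE, instances=1, avg_seconds=0.1, memory_gb=0.5)
    print(f"Break-even at roughly {n:,.0f} requests per month")
```

Below the break-even volume the per-invocation model is cheaper under these assumptions; above it, the always-on instance wins, provided it can actually absorb the load.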

Core components: compute, storage, networking, security, observability

Compute is where your code runs. The options range from bare-metal instances to containers to functions. AWS offers EC2 (VMs), ECS and EKS (containers), Lambda (functions), and Fargate (serverless containers). Each has different scaling characteristics. Lambda scales to zero and starts execution environments on demand for event-driven workloads. EKS gives you Kubernetes orchestration with fine-grained pod scheduling for long-running services.

Storage splits into three categories: object, block, and file. Object storage (S3, Azure Blob, GCS) stores unstructured data with virtually unlimited capacity at low cost. Block storage (EBS, Azure Managed Disks, Persistent Disks) attaches to compute instances for database volumes and boot disks. File storage (EFS, Azure Files, Filestore) provides shared file systems. Each has different IOPS, throughput, and durability characteristics. S3 is designed for 99.999999999% (eleven nines) durability. EBS gp3 volumes provide a 3,000 IOPS baseline regardless of volume size, and additional IOPS can be provisioned independently of capacity (unlike gp2, which relies on burst credits).

Networking connects everything. A VPC (Virtual Private Cloud) is your isolated network in the cloud. You define CIDR blocks, create subnets (public and private), configure route tables, and attach internet gateways or NAT gateways. Security groups act as stateful firewalls at the instance level. NACLs provide stateless filtering at the subnet level.

Security in cloud architecture is identity-centric. IAM policies define who can do what to which resources. The principle of least privilege means granting only the permissions a service actually needs.

Observability ties it together. CloudWatch (AWS), Azure Monitor, and Cloud Monitoring (GCP) collect metrics and logs, with companion services for traces. Without observability, your architecture is a black box.
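
Carving a VPC CIDR block into public and private subnets, as described in the networking paragraph above, is easy to sketch with Python's standard `ipaddress` module. This is an offline planning aid under stated assumptions, not a provisioning script: the function name `plan_subnets`, the `10.0.0.0/16` range, the /24 subnet size, and the zone names are all hypothetical choices for illustration.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, zones: list, new_prefix: int = 24) -> dict:
    """Assign one public and one private subnet per zone from a VPC CIDR.

    Returns a mapping such as {"public-a": IPv4Network("10.0.0.0/24"), ...}.
    Raises ValueError if the VPC range is too small for the requested layout.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # lazily yields non-overlapping blocks
    plan = {}
    for tier in ("public", "private"):
        for zone in zones:
            try:
                plan[f"{tier}-{zone}"] = next(blocks)
            except StopIteration:
                raise ValueError(
                    f"{vpc_cidr} cannot hold {2 * len(zones)} /{new_prefix} subnets"
                ) from None
    return plan


if __name__ == "__main__":
    # Hypothetical VPC range and zone suffixes.
    for name, network in plan_subnets("10.0.0.0/16", ["a", "b"]).items():
        print(f"{name:10} {network}  ({network.num_addresses} addresses)")
```

Because the blocks come from a single generator, the subnets cannot overlap, which is the property a real VPC enforces when you create them.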

Common patterns: three-tier, serverless, data lake, event-driven

The three-tier pattern is the workhorse of web applications. A load balancer distributes traffic across web servers (presentation tier), which call application servers (logic tier), which query databases (data tier). On AWS, this looks like an ALB forwarding to an ECS cluster running your API, backed by RDS PostgreSQL with a read replica. Simple, well-understood, and battle-tested. Its weakness is scaling: each tier must be sized to keep up with the others, or a bottleneck forms at the tier boundary, most often at the database.

Serverless architecture eliminates server management entirely. API Gateway receives HTTP requests, invokes Lambda functions, and returns responses. DynamoDB provides single-digit millisecond reads at any scale. S3 hosts the frontend as a static site behind CloudFront. Webhook processing, such as handling Stripe webhooks, is a common fit for this pattern: events arrive via API Gateway, Lambda processes each event, and DynamoDB stores the result. The trade-offs are cold starts (roughly 100ms to 2s on first invocation), the 15-minute execution limit on Lambda, and vendor lock-in to the provider's function runtime.

Data lake architecture centralizes raw data in object storage for analytics. Ingestion layers (Amazon Data Firehose, formerly Kinesis Data Firehose, or Azure Event Hubs) stream data into S3 or ADLS Gen2. Catalog services (AWS Glue Data Catalog, Microsoft Purview, formerly Azure Purview) track schema and lineage. Query engines (Athena, Synapse, BigQuery) run SQL over the raw files. Airbnb's data platform follows this pattern with Apache Hive tables partitioned by date in S3, cataloged in their internal Dataportal tool.

Event-driven architecture decouples producers from consumers using message brokers. Amazon SNS fans out events to multiple SQS queues, each consumed by a different service. Azure Event Grid routes events from Blob Storage to Azure Functions. This pattern excels when services need to react to changes without tight coupling. Shopify processes millions of webhook events per minute using event-driven architecture across their platform.
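
The fan-out behavior of the event-driven pattern above (one topic, several queues, independent consumers) can be shown with an in-memory sketch. Nothing here calls SNS or SQS: the `Topic` class, the `drain` helper, and the queue and event names are hypothetical stand-ins chosen only to illustrate the decoupling.

```python
from collections import deque


class Topic:
    """In-memory stand-in for a pub/sub topic that fans out to queues."""

    def __init__(self, name: str):
        self.name = name
        self._queues = []

    def subscribe(self, queue: deque) -> None:
        self._queues.append(queue)

    def publish(self, event: dict) -> None:
        # Every subscribed queue receives its own copy of the event,
        # so one slow or failing consumer never blocks the others.
        for queue in self._queues:
            queue.append(dict(event))


def drain(queue: deque, handler) -> int:
    """Process every queued event with `handler`; return how many were handled."""
    handled = 0
    while queue:
        handler(queue.popleft())
        handled += 1
    return handled


if __name__ == "__main__":
    orders = Topic("order-events")            # hypothetical topic name
    billing_queue, shipping_queue = deque(), deque()
    orders.subscribe(billing_queue)
    orders.subscribe(shipping_queue)

    # The producer knows only about the topic, not about who consumes.
    orders.publish({"type": "order.created", "order_id": 1001})

    drain(billing_queue, lambda e: print("billing saw", e["order_id"]))
    drain(shipping_queue, lambda e: print("shipping saw", e["order_id"]))
```

Adding a third consumer means adding one more `subscribe` call; the producer code does not change, which is the point of the pattern.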

Provider differences: AWS vs Azure vs GCP naming and organization

The three major cloud providers offer equivalent services with different names, pricing models, and operational characteristics. Knowing the mapping saves hours of confusion when reading cross-cloud documentation.

For compute: AWS EC2 maps to Azure Virtual Machines and GCP Compute Engine. AWS Lambda maps to Azure Functions and GCP Cloud Functions. AWS ECS maps roughly to Azure Container Instances, while EKS, AKS, and GKE all run managed Kubernetes but differ in default configurations. GKE enables auto-upgrade by default. EKS does not.

For databases: AWS RDS maps to Azure SQL Database and GCP Cloud SQL for relational workloads. DynamoDB maps to Azure Cosmos DB (with Table API) and GCP Bigtable for key-value stores. Aurora Serverless's closest Azure counterpart is the serverless compute tier of Azure SQL Database, while Cosmos DB serverless mode covers some of the same use cases for non-relational data.

For messaging: AWS SQS maps to Azure Queue Storage (basic) or Azure Service Bus (advanced). AWS SNS maps to Azure Event Grid for event routing. GCP Pub/Sub combines both queue and topic semantics in a single service, which simplifies the architecture but limits fine-grained control.

Networking diverges significantly. AWS VPCs, Azure VNets, and GCP VPC networks use different subnet models. Azure subnets span availability zones by default. AWS subnets are zone-specific. GCP VPC networks are global, with subnets being regional. This difference affects how you design multi-AZ deployments: on AWS, for example, you need at least one subnet per zone you deploy into.

IAM models differ too. AWS uses JSON policy documents attached to roles and users. Azure uses Azure RBAC with built-in role definitions. GCP uses a resource hierarchy (org > folder > project) with IAM bindings at each level. GCP's model is the most intuitive for organizations with deep project hierarchies. AWS gives the most granular control at the cost of complex policy management.
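
The name mapping above lends itself to a small lookup table. The sketch below encodes only the rough equivalences stated in this section; the category labels, the `SERVICE_MAP` structure, and the `translate` helper are illustrative assumptions, and the mappings are approximations rather than exact feature parity.

```python
# Rough service equivalents as described in this section, keyed by category.
SERVICE_MAP = {
    "virtual machines":    {"aws": "EC2",      "azure": "Virtual Machines",            "gcp": "Compute Engine"},
    "functions":           {"aws": "Lambda",   "azure": "Functions",                   "gcp": "Cloud Functions"},
    "managed kubernetes":  {"aws": "EKS",      "azure": "AKS",                         "gcp": "GKE"},
    "relational database": {"aws": "RDS",      "azure": "SQL Database",                "gcp": "Cloud SQL"},
    "key-value store":     {"aws": "DynamoDB", "azure": "Cosmos DB (Table API)",       "gcp": "Bigtable"},
    "queue":               {"aws": "SQS",      "azure": "Queue Storage / Service Bus", "gcp": "Pub/Sub"},
    "topic":               {"aws": "SNS",      "azure": "Event Grid",                  "gcp": "Pub/Sub"},
}


def translate(service: str, source: str, target: str) -> list:
    """Return the target provider's rough equivalents of `service`.

    A list is returned because one service can cover several categories:
    GCP Pub/Sub, for example, fills both the queue and the topic role.
    """
    matches = []
    for names in SERVICE_MAP.values():
        if names[source].lower() == service.lower() and names[target] not in matches:
            matches.append(names[target])
    return matches


if __name__ == "__main__":
    print(translate("Lambda", "aws", "gcp"))
    print(translate("Pub/Sub", "gcp", "aws"))
```

Returning a list rather than a single name keeps the one-to-many cases visible instead of silently picking one.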

The Well-Architected Framework: six pillars that structure reviews

AWS published the Well-Architected Framework in 2015, and it has since become a widely used reference for evaluating cloud architecture quality. Azure and GCP publish their own frameworks, organized around similar pillars. The six AWS pillars provide a checklist for architecture reviews.

Operational Excellence focuses on automation and observability. Can you deploy without manual steps? Do you have runbooks for common incidents? Do your CloudWatch dashboards show the four golden signals: latency, traffic, errors, and saturation?

Security covers identity, detection, infrastructure protection, data protection, and incident response. Encrypt data at rest with KMS. Encrypt data in transit with TLS 1.2 or later, preferably TLS 1.3. Rotate credentials automatically. Log every API call with CloudTrail. Use VPC flow logs to detect anomalous traffic patterns.

Reliability means the system works correctly and recovers from failures. Use multi-AZ deployments for stateful services, health checks with automatic replacement of unhealthy instances, and circuit breakers to prevent cascading failures. Document an RTO (Recovery Time Objective) and RPO (Recovery Point Objective) for every critical service.

Performance Efficiency is about using the right resource types and sizes. Don't run a c5.4xlarge when a t3.medium handles the load. Use caching aggressively: CloudFront for static assets, ElastiCache for session data, DAX for DynamoDB hot keys. Benchmark before and after changes.

Cost Optimization prevents cloud bills from spiraling. Use Savings Plans or Reserved Instances for predictable workloads and Spot Instances for fault-tolerant batch jobs. Right-size instances monthly. Tag every resource for cost allocation. Set budget alerts at 50%, 80%, and 100% thresholds.

Sustainability, the newest pillar, addresses the environmental impact of cloud workloads. Right-sizing, efficient code, and choosing regions powered by renewable energy all contribute. Use Graviton (ARM) instances on AWS for better performance per watt.
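
The Reliability pillar above mentions circuit breakers without showing one. The following is a minimal single-threaded sketch of the idea, not a production library: the `CircuitBreaker` and `CircuitOpenError` names, the default thresholds, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down period has passed."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock          # injectable so tests can control time
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"       # cool-down elapsed: allow one trial call
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("dependency is failing; call skipped")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on too many failures, or re-open after a failed trial call.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping calls to a downstream service in `breaker.call(...)` means that once the service has failed repeatedly, callers fail fast instead of piling up timeouts, which is how the pattern limits cascading failures. A multi-threaded service would also need locking around the counters.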

Diagramming cloud architecture effectively

A cloud architecture diagram needs to show the right level of detail for its audience. An executive reviewing a migration proposal needs the system context view: on-premises data center on the left, arrow labeled 'AWS DMS' pointing to AWS cloud on the right, with RDS, ECS, and S3 inside. An engineer implementing that migration needs the container view: specific VPC CIDR blocks, subnet layouts, security group rules, and IAM role chains.

Use official provider icons. AWS publishes an icon set that it updates regularly as new services launch. Azure and GCP do the same. These icons are instantly recognizable to engineers who work with those platforms daily. A generic database cylinder is ambiguous. An RDS icon with a PostgreSQL label is precise.

Group services inside visual boundaries that represent real isolation. Draw a VPC boundary. Inside it, draw public and private subnet groups. Place ALBs in the public subnet, application containers in the private subnet, and RDS in a separate data subnet. This layout mirrors the actual network topology and makes security review straightforward.

Show data flow direction with labeled arrows. 'HTTPS :443' from CloudFront to ALB. 'TCP :5432' from ECS tasks to RDS. 'HTTPS :443' from Lambda to SQS. Every arrow should answer: what protocol, what direction, and optionally what data.

Diagrams.so generates cloud architecture diagrams from text descriptions with provider-specific icon sets for AWS, Azure, and GCP. Describe your infrastructure in plain language, select the cloud provider, and get a .drawio file with correct icons, labeled connections, and grid-aligned layout. The output opens in Draw.io, VS Code, Confluence, or any mxGraph-compatible editor.
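
The arrow-labeling rule above (protocol, direction, and optionally data) can be captured as plain data and rendered to text. The sketch below emits Graphviz DOT as one neutral example format; it is not the .drawio output described above, and the `Connection` class, the `to_dot` function, and the 'SQL queries' annotation are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One directed, labeled arrow in an architecture diagram."""
    source: str
    target: str
    protocol: str
    port: int
    data: str = ""   # optional note on what flows across the arrow

    @property
    def label(self) -> str:
        base = f"{self.protocol} :{self.port}"
        return f"{base} ({self.data})" if self.data else base


def to_dot(connections) -> str:
    """Render the connections as a left-to-right Graphviz digraph."""
    lines = ["digraph architecture {", "  rankdir=LR;"]
    for c in connections:
        lines.append(f'  "{c.source}" -> "{c.target}" [label="{c.label}"];')
    lines.append("}")
    return "\n".join(lines)


# The three example flows from the text; the data note is an added illustration.
FLOWS = [
    Connection("CloudFront", "ALB", "HTTPS", 443),
    Connection("ECS tasks", "RDS", "TCP", 5432, "SQL queries"),
    Connection("Lambda", "SQS", "HTTPS", 443),
]

if __name__ == "__main__":
    print(to_dot(FLOWS))
```

Keeping the flows in a structure like this makes it hard to draw an arrow without stating its protocol and port, since the fields are required.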
