About This Architecture
This diagram shows an enterprise retrieval-augmented generation (RAG) solution on GCP that uses VPC custom routing across four isolated subnets: ingestion, query, evaluation, and observability. Raw documents flow through Cloud Dataflow and the Vertex AI Embedding Model into Vector Search, while user queries pass through Cloud CDN, Cloud Armor WAF, and Cloud Load Balancing to the Cloud Run Query Pipeline for LLM inference. The evaluation subnet runs model evaluation, triggered by Cloud Functions, against ground-truth data; results are stored in BigQuery and metrics are streamed to Cloud Monitoring. Fork this diagram to customize subnet CIDR ranges, add Cloud VPN for hybrid connectivity, or integrate additional Vertex AI services.
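When customizing subnet CIDR ranges, it can help to plan the four non-overlapping blocks before editing the diagram. The following is a minimal sketch using Python's standard `ipaddress` module. The parent range `10.0.0.0/16`, the `/24` subnet size, and the helper name `plan_subnets` are assumptions for illustration only; the diagram itself does not specify any CIDR values.

```python
import ipaddress

# Hypothetical parent range; the architecture does not state actual CIDRs.
VPC_RANGE = ipaddress.ip_network("10.0.0.0/16")

# The four isolated subnets named in the architecture description.
SUBNET_NAMES = ["ingestion", "query", "evaluation", "observability"]


def plan_subnets(parent, names, new_prefix=24):
    """Assign each name the next non-overlapping block of size /new_prefix."""
    blocks = parent.subnets(new_prefix=new_prefix)  # generator of child networks
    return {name: next(blocks) for name in names}


plan = plan_subnets(VPC_RANGE, SUBNET_NAMES)
for name, net in plan.items():
    print(f"{name:<14}{net}")
```

Under these assumptions, the plan assigns 10.0.0.0/24 through 10.0.3.0/24 in the order listed. Changing `new_prefix` or the parent range adjusts the sizing without risking overlap, since every block is carved from the same parent network.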