Questions
Design a secure VPC architecture with private GKE clusters and controlled internet access.
The Scenario
You’re designing the network architecture for a healthcare application that must:
- Keep all compute resources private (no public IPs)
- Allow GKE pods to pull images from gcr.io and access GCP APIs
- Enable controlled outbound internet access for specific services
- Support multiple environments (dev/staging/prod) with isolation
- Meet HIPAA compliance requirements
The Challenge
Design a VPC architecture using GCP networking primitives: Private Google Access, Cloud NAT, Shared VPC, and firewall rules. Explain the tradeoffs.
A junior engineer might assign public IPs to all resources for simplicity, use default firewall rules, create separate VPCs without connectivity, or skip Private Google Access. This creates security vulnerabilities, compliance violations, and operational complexity.
A senior engineer designs a Shared VPC with private subnets, enables Private Google Access for GCP API calls, configures Cloud NAT for controlled outbound access, uses hierarchical firewall policies, and implements VPC Service Controls for data exfiltration prevention.
Architecture Overview
┌───────────────────────────────────────────────────────────┐
│                       Host Project                        │
│                       (Shared VPC)                        │
│ ┌───────────────────────────────────────────────────────┐ │
│ │                      VPC Network                      │ │
│ │  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐  │ │
│ │  │ us-central1 │   │  us-east1   │   │ europe-west1│  │ │
│ │  │ 10.0.0.0/20 │   │ 10.1.0.0/20 │   │ 10.2.0.0/20 │  │ │
│ │  └──────┬──────┘   └──────┬──────┘   └──────┬──────┘  │ │
│ │         │                 │                 │         │ │
│ │         └─────────────────┼─────────────────┘         │ │
│ │                           │                           │ │
│ │    ┌──────────────────────┴──────────────────────┐    │ │
│ │    │                Cloud Router                 │    │ │
│ │    │          + Cloud NAT (all regions)          │    │ │
│ │    └─────────────────────────────────────────────┘    │ │
│ └───────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│  Dev Project  │     │Staging Project│     │ Prod Project  │
│(Service Proj) │     │(Service Proj) │     │(Service Proj) │
│   GKE, GCE    │     │   GKE, GCE    │     │   GKE, GCE    │
└───────────────┘     └───────────────┘     └───────────────┘

Step 1: Create Shared VPC Host Project
# Enable Shared VPC in host project
gcloud compute shared-vpc enable host-project

# Associate service projects
gcloud compute shared-vpc associated-projects add dev-project \
  --host-project=host-project

gcloud compute shared-vpc associated-projects add prod-project \
  --host-project=host-project

Step 2: Create VPC and Subnets
# terraform/network/main.tf
resource "google_compute_network" "main" {
  name                    = "shared-vpc"
  project                 = var.host_project
  auto_create_subnetworks = false
  routing_mode            = "GLOBAL"
}

# Regional subnets with secondary ranges for GKE
resource "google_compute_subnetwork" "regional" {
  for_each = var.regions

  name                     = "subnet-${each.key}"
  project                  = var.host_project
  region                   = each.key
  network                  = google_compute_network.main.id
  ip_cidr_range            = each.value.primary_range
  private_ip_google_access = true # Critical for private clusters!

  # Secondary ranges for GKE pods and services
  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = each.value.pods_range
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = each.value.services_range
  }

  # VPC Flow Logs for audit and compliance visibility
  log_config {
    aggregation_interval = "INTERVAL_5_SEC"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}
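For GKE in a service project to place nodes on these shared subnets, the service project's Google-managed service accounts need IAM grants in the host project. The sketch below is an assumption layered on top of this design: the project numbers in the locals block and the hard-coded us-central1 subnet are placeholders, and in practice the project-number@cloudservices.gserviceaccount.com account usually needs roles/compute.networkUser as well.

# Placeholder project numbers for the service projects (illustrative only)
locals {
  service_project_numbers = {
    dev  = "111111111111"
    prod = "222222222222"
  }
}

# Let each service project's GKE service agent use the shared subnet
resource "google_compute_subnetwork_iam_member" "gke_network_user" {
  for_each = local.service_project_numbers

  project    = var.host_project
  region     = "us-central1"
  subnetwork = google_compute_subnetwork.regional["us-central1"].name
  role       = "roles/compute.networkUser"
  member     = "serviceAccount:service-${each.value}@container-engine-robot.iam.gserviceaccount.com"
}

# GKE also requires the Host Service Agent User role on the host project
resource "google_project_iam_member" "gke_host_agent" {
  for_each = local.service_project_numbers

  project = var.host_project
  role    = "roles/container.hostServiceAgentUser"
  member  = "serviceAccount:service-${each.value}@container-engine-robot.iam.gserviceaccount.com"
}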
variable "regions" {
default = {
"us-central1" = {
primary_range = "10.0.0.0/20"
pods_range = "10.100.0.0/14"
services_range = "10.104.0.0/20"
}
"us-east1" = {
primary_range = "10.1.0.0/20"
pods_range = "10.108.0.0/14"
services_range = "10.112.0.0/20"
}
}
}Step 3: Configure Cloud NAT for Outbound Access
# Cloud Router per region
resource "google_compute_router" "regional" {
  for_each = var.regions

  name    = "router-${each.key}"
  project = var.host_project
  region  = each.key
  network = google_compute_network.main.id
}

# Cloud NAT for outbound internet access
resource "google_compute_router_nat" "regional" {
  for_each = var.regions

  name                               = "nat-${each.key}"
  project                            = var.host_project
  router                             = google_compute_router.regional[each.key].name
  region                             = each.key
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
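  # The scenario asks for controlled egress for specific services. A tighter
  # variant (an assumption, not part of the original design) replaces the
  # ALL_SUBNETWORKS setting above with an explicit subnet list:
  #
  #   source_subnetwork_ip_ranges_to_nat = "LIST_OF_SUBNETWORKS"
  #   subnetwork {
  #     name                    = google_compute_subnetwork.regional[each.key].id
  #     source_ip_ranges_to_nat = ["PRIMARY_IP_RANGE"]
  #   }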

  log_config {
    enable = true
    filter = "ERRORS_ONLY"
  }

  # Timeouts for connection tracking
  tcp_established_idle_timeout_sec = 1200
  tcp_transitory_idle_timeout_sec  = 30
  udp_idle_timeout_sec             = 30
}

Step 4: Create Private GKE Cluster
resource "google_container_cluster" "private" {
name = "private-cluster"
project = var.service_project
location = "us-central1"
# Use Shared VPC
network = "projects/${var.host_project}/global/networks/shared-vpc"
subnetwork = "projects/${var.host_project}/regions/us-central1/subnetworks/subnet-us-central1"
# Private cluster configuration
private_cluster_config {
enable_private_nodes = true
enable_private_endpoint = false # Allow kubectl from authorized networks
master_ipv4_cidr_block = "172.16.0.0/28"
}
# Use secondary ranges for pods/services
ip_allocation_policy {
cluster_secondary_range_name = "pods"
services_secondary_range_name = "services"
}
# Authorized networks for master access
master_authorized_networks_config {
cidr_blocks {
cidr_block = "10.0.0.0/8"
display_name = "Internal VPC"
}
cidr_blocks {
cidr_block = var.admin_cidr
display_name = "Admin Access"
}
}
# Workload Identity for pod service accounts
workload_identity_config {
workload_pool = "${var.service_project}.svc.id.goog"
}
# VPC-native cluster
networking_mode = "VPC_NATIVE"
}
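Because Workload Identity is enabled above, pods reach GCP APIs over Private Google Access as a Google service account rather than with exported keys. A minimal sketch of the binding follows; the account name and the app-namespace/app-ksa Kubernetes service account are illustrative, not part of the original design.

# Google service account the pods will act as (name is illustrative)
resource "google_service_account" "app" {
  project      = var.service_project
  account_id   = "healthcare-app"
  display_name = "Healthcare app workload identity SA"
}

# Allow the Kubernetes service account to impersonate the Google SA
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.app.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.service_project}.svc.id.goog[app-namespace/app-ksa]"
}

The Kubernetes service account then needs the iam.gke.io/gcp-service-account annotation set to the Google service account's email.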

Step 5: Implement Hierarchical Firewall Policies
# Organization-level firewall policy
resource "google_compute_firewall_policy" "org_policy" {
short_name = "org-security-policy"
parent = "organizations/${var.org_id}"
}
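One easy-to-miss piece: a hierarchical firewall policy has no effect until it is associated with the organization or a folder. A minimal association, reusing var.org_id:

# Attach the policy to the organization so its rules actually apply
resource "google_compute_firewall_policy_association" "org" {
  name              = "org-security-policy-association"
  firewall_policy   = google_compute_firewall_policy.org_policy.id
  attachment_target = "organizations/${var.org_id}"
}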

# Deny all ingress by default
resource "google_compute_firewall_policy_rule" "deny_ingress_default" {
  firewall_policy = google_compute_firewall_policy.org_policy.id
  priority        = 65534
  action          = "deny"
  direction       = "INGRESS"

  match {
    # Match traffic from any source (ingress rules need explicit source ranges)
    src_ip_ranges = ["0.0.0.0/0"]
    layer4_configs {
      ip_protocol = "all"
    }
  }
}

# Allow internal communication
resource "google_compute_firewall_policy_rule" "allow_internal" {
  firewall_policy = google_compute_firewall_policy.org_policy.id
  priority        = 1000
  action          = "allow"
  direction       = "INGRESS"

  match {
    src_ip_ranges = ["10.0.0.0/8"]
    layer4_configs {
      ip_protocol = "all"
    }
  }
}

# Allow GCP health checks
resource "google_compute_firewall_policy_rule" "allow_health_checks" {
  firewall_policy = google_compute_firewall_policy.org_policy.id
  priority        = 1001
  action          = "allow"
  direction       = "INGRESS"

  match {
    src_ip_ranges = [
      "35.191.0.0/16", # Health check ranges
      "130.211.0.0/22"
    ]
    layer4_configs {
      ip_protocol = "tcp"
    }
  }
}

# VPC-level firewall rules for specific needs
resource "google_compute_firewall" "allow_iap" {
  name    = "allow-iap-ssh"
  project = var.host_project
  network = google_compute_network.main.name

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }

  source_ranges = ["35.235.240.0/20"] # IAP range
  target_tags   = ["allow-iap"]
}

Step 6: VPC Service Controls (Data Exfiltration Prevention)
# Service perimeter for sensitive data
resource "google_access_context_manager_service_perimeter" "healthcare" {
  parent = "accessPolicies/${var.access_policy_id}"
  name   = "accessPolicies/${var.access_policy_id}/servicePerimeters/healthcare"
  title  = "Healthcare Data Perimeter"

  status {
    resources = [
      "projects/${var.prod_project_number}"
    ]

    restricted_services = [
      "storage.googleapis.com",
      "bigquery.googleapis.com",
      "healthcare.googleapis.com"
    ]

    # Allow access from VPC
    vpc_accessible_services {
      enable_restriction = true
      allowed_services   = ["RESTRICTED-SERVICES"]
    }

    # Ingress allowed only from the corporate network access level
    # (google_access_context_manager_access_level.corp_network is defined elsewhere)
    ingress_policies {
      ingress_from {
        sources {
          access_level = google_access_context_manager_access_level.corp_network.name
        }
      }
      ingress_to {
        resources = ["*"]
        operations {
          service_name = "storage.googleapis.com"
          method_selectors {
            method = "*"
          }
        }
      }
    }
  }
}
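With the perimeter in place, a common companion (an assumption here, not something the perimeter itself creates) is a private Cloud DNS zone that resolves Google APIs to the restricted VIP, restricted.googleapis.com (199.36.153.4/30), so Private Google Access traffic from the private subnets only reaches services that VPC Service Controls can enforce. Zone and record names below are illustrative.

# Private zone that steers *.googleapis.com to the restricted VIP
resource "google_dns_managed_zone" "restricted_apis" {
  name       = "restricted-googleapis"
  project    = var.host_project
  dns_name   = "googleapis.com."
  visibility = "private"

  private_visibility_config {
    networks {
      network_url = google_compute_network.main.id
    }
  }
}

# A records for the restricted VIP range
resource "google_dns_record_set" "restricted_a" {
  project      = var.host_project
  managed_zone = google_dns_managed_zone.restricted_apis.name
  name         = "restricted.googleapis.com."
  type         = "A"
  ttl          = 300
  rrdatas      = ["199.36.153.4", "199.36.153.5", "199.36.153.6", "199.36.153.7"]
}

# Point every other googleapis.com hostname at restricted.googleapis.com
resource "google_dns_record_set" "restricted_cname" {
  project      = var.host_project
  managed_zone = google_dns_managed_zone.restricted_apis.name
  name         = "*.googleapis.com."
  type         = "CNAME"
  ttl          = 300
  rrdatas      = ["restricted.googleapis.com."]
}

If the VPC ever drops its default internet route, a static route for 199.36.153.4/30 to the default internet gateway is typically needed as well so Private Google Access traffic still has a path.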

Network Design Summary
| Component | Purpose | Configuration |
|---|---|---|
| Shared VPC | Centralized network management | Host + service projects |
| Private Google Access | Access GCP APIs without public IPs | Enabled on subnets |
| Cloud NAT | Controlled outbound internet | Per-region with logging |
| Private GKE | No public IPs on nodes | Private nodes + authorized networks |
| Firewall Policies | Hierarchical security rules | Org → Folder → Project |
| VPC Service Controls | Data exfiltration prevention | Service perimeters |
Practice Question
Why is Private Google Access required for private GKE clusters to function properly?