Questions
Design a Docker Compose configuration for a microservices application with proper networking, health checks, and dependencies.
The Scenario
You’re containerizing a microservices application with:
- API Gateway (Node.js) - receives all traffic, routes to services
- User Service (Python) - handles authentication
- Order Service (Go) - processes orders
- PostgreSQL - user data
- Redis - session cache
- RabbitMQ - async messaging
Current issues with the basic setup:
- Services start before databases are ready
- No health monitoring
- All services on same network (security concern)
- No resource limits (one service can starve others)
The Challenge
Design a production-ready Docker Compose configuration that handles service dependencies, network isolation, health checks, and resource management.
A junior engineer might use depends_on without health checks, put every service on the default network, skip resource limits, hardcode credentials in the compose file, and ignore startup order. The result is race conditions at startup, a wider attack surface, resource exhaustion, and leaked secrets.
A senior engineer applies defense in depth: depends_on with the service_healthy condition, separate networks for each tier, health checks for every service, resource limits, externalized secrets, and a logging configuration with rotation. Each layer addresses a specific production concern.
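As one illustration of externalized secrets, Compose can mount a secret from a file instead of passing it through an environment variable. The fragment below is a minimal sketch under stated assumptions: the secret name db_password and the path ./secrets/db_password.txt are hypothetical, and it relies on the official postgres image reading its password from the file named in POSTGRES_PASSWORD_FILE.

```yaml
services:
  postgres:
    image: postgres:15-alpine
    environment:
      # The official postgres image reads the password from this file
      - POSTGRES_PASSWORD_FILE=/run/secrets/db_password
    secrets:
      # Mounted inside the container at /run/secrets/db_password
      - db_password

secrets:
  db_password:
    # Hypothetical path; keep this file out of version control
    file: ./secrets/db_password.txt
```

Application services would have to read the mounted file themselves. The complete configuration in Step 3 takes the simpler route of interpolating ${DB_PASSWORD} from the environment, which keeps the value out of the compose file but still exposes it as an environment variable inside the container.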
Step 1: Basic Structure with Health Checks
version: '3.8'

services:
  # API Gateway - Entry point
  api-gateway:
    build: ./api-gateway
    ports:
      - "8080:8080"
    environment:
      - USER_SERVICE_URL=http://user-service:3000
      - ORDER_SERVICE_URL=http://order-service:4000
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    depends_on:
      user-service:
        condition: service_healthy
      order-service:
        condition: service_healthy

Step 2: Network Isolation
networks:
  # Public network - only API gateway exposed
  frontend:
    driver: bridge
  # Internal network for services
  backend:
    driver: bridge
    internal: true  # No external access
  # Database network - maximum isolation
  database:
    driver: bridge
    internal: true

services:
  api-gateway:
    networks:
      - frontend
      - backend
  user-service:
    networks:
      - backend
      - database
  order-service:
    networks:
      - backend
      - database
  postgres:
    networks:
      - database  # Only accessible from backend services
  redis:
    networks:
      - backend  # Cache accessible from backend only

Step 3: Complete Production Configuration
version: '3.8'  # Obsolete in Compose v2 (ignored with a warning); kept for older tooling

services:
  # ============================================
  # API Gateway
  # ============================================
  api-gateway:
    build:
      context: ./api-gateway
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    environment:
      - NODE_ENV=production
      - USER_SERVICE_URL=http://user-service:3000
      - ORDER_SERVICE_URL=http://order-service:4000
      - REDIS_URL=redis://redis:6379
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    depends_on:
      user-service:
        condition: service_healthy
      order-service:
        condition: service_healthy
      redis:
        condition: service_healthy
    # Compose v2 applies these limits; legacy docker-compose v1 needs --compatibility
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M
        reservations:
          cpus: '0.25'
          memory: 128M
    networks:
      - frontend
      - backend
    restart: unless-stopped
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

  # ============================================
  # User Service
  # ============================================
  user-service:
    build: ./user-service
    environment:
      - DATABASE_URL=postgres://user:${DB_PASSWORD}@postgres:5432/users
      - REDIS_URL=redis://redis:6379
      # Credentials must match RABBITMQ_DEFAULT_USER / RABBITMQ_DEFAULT_PASS below
      - RABBITMQ_URL=amqp://admin:${RABBITMQ_PASSWORD}@rabbitmq:5672
    healthcheck:
      # Assumes curl is installed in the user-service image
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
    networks:
      - backend
      - database
    restart: unless-stopped

  # ============================================
  # Order Service
  # ============================================
  order-service:
    build: ./order-service
    environment:
      - DATABASE_URL=postgres://order:${DB_PASSWORD}@postgres:5432/orders
      # Credentials must match RABBITMQ_DEFAULT_USER / RABBITMQ_DEFAULT_PASS below
      - RABBITMQ_URL=amqp://admin:${RABBITMQ_PASSWORD}@rabbitmq:5672
    healthcheck:
      test: ["CMD", "/app/healthcheck"]
      interval: 30s
      timeout: 10s
      retries: 3
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 256M
    networks:
      - backend
      - database
    restart: unless-stopped

  # ============================================
  # PostgreSQL
  # ============================================
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      # Not read by the official image; consumed by the custom init-db.sh below,
      # which is expected to create these databases and the roles used in DATABASE_URL
      - POSTGRES_MULTIPLE_DATABASES=users,orders
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./init-db.sh:/docker-entrypoint-initdb.d/init-db.sh
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
    networks:
      - database
    restart: unless-stopped

  # ============================================
  # Redis
  # ============================================
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M
    networks:
      - backend
    restart: unless-stopped

  # ============================================
  # RabbitMQ
  # ============================================
  rabbitmq:
    image: rabbitmq:3-management-alpine
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_PASSWORD}
    volumes:
      - rabbitmq-data:/var/lib/rabbitmq
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 30s
      timeout: 10s
      retries: 5
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
    networks:
      - backend
    restart: unless-stopped

# ============================================
# Networks
# ============================================
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true
  database:
    driver: bridge
    internal: true

# ============================================
# Volumes
# ============================================
volumes:
  postgres-data:
  redis-data:
  rabbitmq-data:

Dependency Management
# Wrong - just waits for container to start
depends_on:
  - postgres

# Right - waits for health check to pass
depends_on:
  postgres:
    condition: service_healthy
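Compose also accepts service_completed_successfully as a condition, which suits one-shot jobs that must finish before a service starts. The fragment below is a sketch only: the db-migrate service and its ./run-migrations.sh command are hypothetical and not part of the scenario above.

```yaml
services:
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Hypothetical one-shot job: applies schema migrations, then exits
  db-migrate:
    build: ./user-service
    command: ["./run-migrations.sh"]
    depends_on:
      postgres:
        condition: service_healthy

  user-service:
    build: ./user-service
    depends_on:
      db-migrate:
        # Start only after the migration container exits with status 0
        condition: service_completed_successfully
```

With this arrangement the startup chain is postgres healthy, then db-migrate finished, then user-service started. If the migration exits with a non-zero status, Compose does not start user-service.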
Health Check Patterns
| Service Type | Health Check Command |
|---|---|
| HTTP API | wget -q --spider http://localhost:PORT/health |
| PostgreSQL | pg_isready -U postgres |
| Redis | redis-cli ping |
| RabbitMQ | rabbitmq-diagnostics -q ping |
| MongoDB | mongosh --eval "db.adminCommand('ping')" |
| Custom binary | /app/healthcheck |
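Each command in the table goes into the test key of a healthcheck block. As a sketch for the MongoDB row, which the complete configuration does not include, the fragment below assumes the mongo:7 image tag and illustrative timing values:

```yaml
services:
  mongo:
    image: mongo:7
    healthcheck:
      # CMD form runs the command directly, without a shell
      test: ["CMD", "mongosh", "--quiet", "--eval", "db.adminCommand('ping')"]
      interval: 10s
      timeout: 5s
      retries: 5
      # Failures during this window do not count toward retries
      start_period: 20s
```

Use CMD-SHELL instead of CMD when the check needs shell features such as pipes or variable expansion, as the pg_isready check in the PostgreSQL service does.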
Practice Question
Why should you use 'condition: service_healthy' instead of just 'depends_on: service_name'?