Design a deployment workflow with environment approvals, staging, and production rollbacks.
The Scenario
Your deployment process needs enterprise controls:
Requirements:
- Deploy to staging automatically on main branch push
- Require approval from 2 team leads for production
- Run smoke tests before promoting to production
- Enable one-click rollback to any previous version
- Track who deployed what and when
- Prevent deployments during maintenance windows
The Challenge
Design a comprehensive deployment workflow with proper environment gates, approvals, rollback capabilities, and audit trails.
Wrong Approach
A junior engineer might deploy directly to production without gates, use manual kubectl commands for rollback, or skip staging entirely for hot fixes. These approaches risk production outages, make rollbacks error-prone, and bypass safety checks.
Right Approach
A senior engineer implements GitHub Environments with protection rules, uses deployment_status events to feed deployment tracking, builds automated rollback mechanisms, and creates comprehensive audit logging.
Step 1: Configure GitHub Environments
# Configure in Repository Settings > Environments
# staging:
#   - No required reviewers
#   - Deployment branches: main
# production:
#   - Required reviewers: 2 from @org/release-managers
#     (note: GitHub proceeds once any one listed reviewer approves, so a strict
#     two-approval policy needs an additional gate in the workflow or org policy)
#   - Wait timer: 5 minutes
#   - Deployment branches: main only
#   - Environment secrets: PROD_* credentials
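These protection rules live in the repository settings UI, but they can also be scripted so every repository gets the same gates. A minimal sketch using the "create or update an environment" REST endpoint via the gh CLI; OWNER, REPO, and the reviewer team ID are placeholders, and the wait timer is in minutes:
# Create/update the production environment with a required reviewer team and a 5-minute wait timer
gh api -X PUT "repos/$OWNER/$REPO/environments/production" --input - <<'EOF'
{
  "wait_timer": 5,
  "reviewers": [
    { "type": "Team", "id": 123456 }
  ],
  "deployment_branch_policy": {
    "protected_branches": false,
    "custom_branch_policies": true
  }
}
EOF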
Step 2: Multi-Stage Deployment Workflow
name: Deploy
on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        type: choice
        options:
          - staging
          - production
      version:
        description: 'Version to deploy (for rollback)'
        required: false
        type: string
permissions:
  contents: read
  id-token: write
  deployments: write
  packages: write  # needed to push the image to GHCR
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image: ${{ steps.build.outputs.image }}
      version: ${{ steps.version.outputs.version }}
    steps:
      - uses: actions/checkout@v4
      - name: Generate version
        id: version
        run: |
          VERSION="${{ inputs.version || github.sha }}"
          echo "version=$VERSION" >> $GITHUB_OUTPUT
      - name: Log in to GHCR
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
      - name: Build and push
        id: build
        run: |
          IMAGE="ghcr.io/${{ github.repository }}:${{ steps.version.outputs.version }}"
          docker build -t $IMAGE .
          docker push $IMAGE
          echo "image=$IMAGE" >> $GITHUB_OUTPUT
  deploy-staging:
    needs: build
    if: github.event_name == 'push' || inputs.environment == 'staging'
    runs-on: ubuntu-latest
    environment:
      name: staging
      url: https://staging.app.example.com
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Deploy to staging
        id: deploy
        run: |
          aws eks update-kubeconfig --name staging-cluster
          kubectl set image deployment/app app=${{ needs.build.outputs.image }} -n app
          kubectl rollout status deployment/app -n app --timeout=300s
      - name: Run smoke tests
        run: |
          # Wait for the deployment to be ready
          sleep 30
          curl -sf https://staging.app.example.com/health || exit 1
          npm ci
          npm run test:smoke -- --env=staging
      - name: Record deployment
        if: always()
        run: |
          echo '{
            "environment": "staging",
            "version": "${{ needs.build.outputs.version }}",
            "image": "${{ needs.build.outputs.image }}",
            "status": "${{ job.status }}",
            "timestamp": "'$(date -u +%Y-%m-%dT%H:%M:%SZ)'",
            "actor": "${{ github.actor }}",
            "run_id": "${{ github.run_id }}"
          }' | tee deployment-record.json
          # Store in deployment history
          aws s3 cp deployment-record.json \
            s3://deployments-bucket/staging/${{ github.run_id }}.json
  deploy-production:
    needs: [build, deploy-staging]
    # For a manual production dispatch the staging job is skipped, so check the
    # needed job results explicitly instead of relying on success()
    if: >-
      !cancelled() &&
      needs.build.result == 'success' &&
      (needs.deploy-staging.result == 'success' || needs.deploy-staging.result == 'skipped') &&
      (github.event_name == 'push' || inputs.environment == 'production')
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://app.example.com
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN_PROD }}
          aws-region: us-east-1
      - name: Pre-deployment checks
        run: |
          # Check maintenance window (runner clock is UTC)
          HOUR=$(date -u +%H)
          DAY=$(date -u +%u)
          if [[ $HOUR -ge 22 || $HOUR -lt 6 ]] && [[ $DAY -le 5 ]]; then
            echo "::error::Deployments blocked during maintenance window (10PM-6AM weekdays)"
            exit 1
          fi
          # Check for active incidents
          curl -sf https://status.example.com/api/incidents/active | jq -e '.count == 0' || {
            echo "::error::Cannot deploy during active incident"
            exit 1
          }
      - name: Deploy to production
        run: |
          aws eks update-kubeconfig --name production-cluster
          # Record pre-deployment state for rollback
          kubectl get deployment/app -n app -o json > pre-deploy-state.json
          aws s3 cp pre-deploy-state.json \
            s3://deployments-bucket/production/pre-deploy-${{ github.run_id }}.json
          # Roll out the new image (standard rolling update)
          kubectl set image deployment/app app=${{ needs.build.outputs.image }} -n app
          kubectl rollout status deployment/app -n app --timeout=600s
      - name: Post-deployment verification
        run: |
          # Health check: fail the job if the endpoint never becomes healthy
          HEALTHY=false
          for i in {1..5}; do
            curl -sf https://app.example.com/health && HEALTHY=true && break
            sleep 10
          done
          $HEALTHY || { echo "::error::Health check failed"; exit 1; }
          # Smoke tests
          npm ci
          npm run test:smoke -- --env=production
      - name: Record successful deployment
        run: |
          # Environment approvals are captured by GitHub itself and can be pulled from
          # the run's deployment review history via the REST API for the audit trail
          echo '{
            "environment": "production",
            "version": "${{ needs.build.outputs.version }}",
            "image": "${{ needs.build.outputs.image }}",
            "status": "success",
            "timestamp": "'$(date -u +%Y-%m-%dT%H:%M:%SZ)'",
            "actor": "${{ github.actor }}",
            "run_id": "${{ github.run_id }}"
          }' | aws s3 cp - s3://deployments-bucket/production/${{ github.run_id }}.json
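One gap worth closing in the Deploy workflow above: two rapid pushes to main could race each other through staging and production. A minimal sketch of a workflow-level concurrency block that would sit alongside the on: and permissions: keys (the group name is illustrative):
# Queue deployments instead of running them in parallel; never cancel a deploy mid-flight
concurrency:
  group: deploy-production-pipeline
  cancel-in-progress: false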
Step 3: Rollback Workflow
name: Rollback
on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to rollback'
        required: true
        type: choice
        options:
          - staging
          - production
      target_version:
        description: 'Version to rollback to (leave empty for previous)'
        required: false
        type: string
      reason:
        description: 'Reason for rollback'
        required: true
        type: string
permissions:
  contents: read
  id-token: write
  deployments: write
jobs:
  rollback:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - name: Configure AWS
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Determine rollback target
        id: target
        env:
          # Pass the free-text input through the environment to avoid script injection
          TARGET_VERSION: ${{ inputs.target_version }}
        run: |
          if [ -n "$TARGET_VERSION" ]; then
            echo "version=$TARGET_VERSION" >> $GITHUB_OUTPUT
          else
            # Get previous successful deployment from the history bucket
            PREV=$(aws s3 ls s3://deployments-bucket/${{ inputs.environment }}/ | \
              sort -r | head -2 | tail -1 | awk '{print $4}')
            VERSION=$(aws s3 cp s3://deployments-bucket/${{ inputs.environment }}/$PREV - | jq -r '.version')
            echo "version=$VERSION" >> $GITHUB_OUTPUT
          fi
      - name: Perform rollback
        run: |
          aws eks update-kubeconfig --name ${{ inputs.environment }}-cluster
          IMAGE="ghcr.io/${{ github.repository }}:${{ steps.target.outputs.version }}"
          kubectl set image deployment/app app=$IMAGE -n app
          kubectl rollout status deployment/app -n app --timeout=300s
      - name: Verify rollback
        run: |
          curl -sf https://${{ inputs.environment == 'production' && 'app' || 'staging.app' }}.example.com/health
      - name: Record rollback
        env:
          REASON: ${{ inputs.reason }}
        run: |
          # Build the record with jq so the free-text reason cannot break the JSON or the shell
          jq -n \
            --arg environment "${{ inputs.environment }}" \
            --arg to_version "${{ steps.target.outputs.version }}" \
            --arg reason "$REASON" \
            --arg actor "${{ github.actor }}" \
            '{type: "rollback", environment: $environment, from_version: "current", to_version: $to_version, reason: $reason, actor: $actor, timestamp: (now | todate)}' \
            | aws s3 cp - s3://deployments-bucket/rollbacks/${{ github.run_id }}.json
      - name: Notify team
        uses: slackapi/slack-github-action@v2
        with:
          webhook: ${{ secrets.SLACK_WEBHOOK }}
          webhook-type: incoming-webhook
          payload: |
            {
              "text": "Rollback completed",
              "blocks": [
                {
                  "type": "section",
                  "text": {
                    "type": "mrkdwn",
                    "text": "*Rollback to ${{ inputs.environment }}*\nVersion: ${{ steps.target.outputs.version }}\nReason: ${{ inputs.reason }}\nBy: ${{ github.actor }}"
                  }
                }
              ]
            }
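If the deployment history bucket is unavailable, Kubernetes' own revision history offers a faster escape hatch. A minimal sketch using kubectl's built-in rollback, assuming the Deployment's revision history has not been pruned:
# Roll back to the immediately previous ReplicaSet revision
kubectl rollout undo deployment/app -n app
# Or inspect revisions and target a specific one
kubectl rollout history deployment/app -n app
kubectl rollout undo deployment/app -n app --to-revision=3
kubectl rollout status deployment/app -n app --timeout=300s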
Step 4: Blue-Green Deployment
name: Blue-Green Deploy
on:
  workflow_dispatch:
    inputs:
      environment:
        type: choice
        options: [staging, production]
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Get current deployment color
        id: current
        run: |
          aws eks update-kubeconfig --name ${{ inputs.environment }}-cluster
          CURRENT=$(kubectl get service/app -n app -o jsonpath='{.spec.selector.color}')
          if [ "$CURRENT" == "blue" ]; then
            echo "current=blue" >> $GITHUB_OUTPUT
            echo "target=green" >> $GITHUB_OUTPUT
          else
            echo "current=green" >> $GITHUB_OUTPUT
            echo "target=blue" >> $GITHUB_OUTPUT
          fi
      - name: Deploy to inactive color
        run: |
          IMAGE="ghcr.io/${{ github.repository }}:${{ github.sha }}"
          kubectl set image deployment/app-${{ steps.current.outputs.target }} \
            app=$IMAGE -n app
          kubectl rollout status deployment/app-${{ steps.current.outputs.target }} \
            -n app --timeout=300s
      - name: Test inactive deployment
        run: |
          # Test the inactive deployment directly, before it receives live traffic
          POD=$(kubectl get pod -n app -l color=${{ steps.current.outputs.target }} -o jsonpath='{.items[0].metadata.name}')
          kubectl port-forward $POD 8080:8080 -n app &
          sleep 5
          curl -sf http://localhost:8080/health
      - name: Switch traffic
        run: |
          kubectl patch service/app -n app \
            -p '{"spec":{"selector":{"color":"${{ steps.current.outputs.target }}"}}}'
      - name: Verify switch
        run: |
          sleep 10
          curl -sf https://app.example.com/health
          npm ci
          npm run test:smoke
      - name: Keep old deployment for quick rollback
        run: |
          echo "Previous deployment (${{ steps.current.outputs.current }}) kept for rollback"
          echo "To rollback, run: kubectl patch service/app -n app -p '{\"spec\":{\"selector\":{\"color\":\"${{ steps.current.outputs.current }}\"}}}'"
Step 5: Deployment Dashboard Data
name: Update Deployment Dashboard
on:
  deployment_status:
jobs:
  update-dashboard:
    runs-on: ubuntu-latest
    steps:
      - name: Update deployment metrics
        run: |
          # Send metrics to monitoring system
          curl -X POST https://metrics.example.com/deployments \
            -H "Content-Type: application/json" \
            -d '{
              "repository": "${{ github.repository }}",
              "environment": "${{ github.event.deployment.environment }}",
              "status": "${{ github.event.deployment_status.state }}",
              "sha": "${{ github.event.deployment.sha }}",
              "creator": "${{ github.event.deployment.creator.login }}",
              "timestamp": "${{ github.event.deployment_status.created_at }}"
            }'
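Because the environment-scoped jobs above create GitHub deployment records automatically, the who/what/when audit trail can also be pulled straight from the API rather than only from S3. A sketch with the gh CLI; OWNER and REPO are placeholders and the jq projection is illustrative:
# List recent production deployments with creator and timestamp for the audit trail
gh api "repos/$OWNER/$REPO/deployments?environment=production&per_page=20" \
  --jq '.[] | {sha: .sha, creator: .creator.login, created_at: .created_at}'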
Deployment Pattern Comparison
| Pattern | Downtime | Rollback Speed | Resource Usage |
|---|---|---|---|
| Rolling | Zero | Minutes | 1x |
| Blue-Green | Zero | Seconds | 2x |
| Canary | Zero | Seconds | 1.1x |
| Recreate | Yes | Minutes | 1x |
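The production job in Step 2 performs a rolling update; a true canary keeps a small second Deployment on the new version behind the same Service before promoting. A minimal kubectl sketch, assuming a Deployment named app-canary that shares the app Service's selector; the image tag and replica split are placeholders:
# New image reference (placeholder tag)
IMAGE="ghcr.io/OWNER/REPO:NEW_VERSION"
# Ship the new image to the canary only (e.g., 1 canary replica vs 9 stable)
kubectl set image deployment/app-canary app=$IMAGE -n app
kubectl rollout status deployment/app-canary -n app --timeout=300s
# Watch error rates and smoke tests against the canary, then promote to the stable Deployment
kubectl set image deployment/app app=$IMAGE -n app
kubectl rollout status deployment/app -n app --timeout=600s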
Practice Question
What is the purpose of the 'wait timer' protection rule in GitHub Environments?