Every workflow run downloads the same dependencies. Implement an effective caching strategy.

The Scenario

Your CI workflows are slow and expensive:

Workflow timing breakdown:
- Checkout: 10s
- npm install: 4m 30s (downloading 800MB)
- Build: 2m
- Test: 1m 30s
- Docker build: 3m (no layer caching)
Total: ~11 minutes

Issues:
- Same dependencies downloaded every run
- Build cache not preserved between runs
- Docker builds start from scratch
- Playwright browsers downloaded each time

The Challenge

Implement comprehensive caching that reduces workflow time by 60%+ while handling cache invalidation correctly.

Wrong Approach

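A junior engineer might cache everything under a static key, cache node_modules directly without tying the key to the lock file, or assume cache entries live as long as they like. GitHub Actions has no configurable TTL: entries not accessed for 7 days are evicted, and a repository is limited to 10 GB of cache by default. These approaches lead to stale caches, wasted storage, and inconsistent builds.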
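
To make the failure mode concrete, here is a minimal sketch of the static-key anti-pattern described above. The key name `node-cache` is hypothetical; the point is only that nothing in the key changes when dependencies change.

```yaml
# Anti-pattern sketch: do not copy.
- name: Cache node_modules (static key)
  uses: actions/cache@v4
  with:
    path: node_modules
    # The key never changes, so the first cache ever saved is restored on
    # every run. actions/cache does not overwrite an existing key, so later
    # changes to package-lock.json never reach the cache.
    key: node-cache
```

Adding `hashFiles('**/package-lock.json')` to the key, as the steps below do, is what ties invalidation to the dependency set.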

Right Approach

A senior engineer designs a cache hierarchy with proper keys based on content hashes, caches the right artifacts (npm cache, not node_modules), implements fallback keys for partial cache hits, and manages cache lifecycle.

Step 1: Cache npm Dependencies Properly

name: CI with Optimal Caching

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Option 1: Built-in caching with setup-node (recommended)
      - name: Setup Node.js with cache
        uses: actions/setup-node@v4
        with:
          node-version: 18
          cache: 'npm'  # Automatically caches the npm cache directory (~/.npm)

      # Option 2: Manual cache control (more flexibility).
      # Use instead of `cache: 'npm'` above, not alongside it; both cache ~/.npm.
      - name: Cache npm packages
        uses: actions/cache@v4
        id: npm-cache
        with:
          path: ~/.npm
          key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            npm-${{ runner.os }}-

      # Cache node_modules for monorepos (use carefully).
      # Restore before installing so an exact hit can skip npm ci entirely.
      - name: Cache node_modules
        uses: actions/cache@v4
        id: modules-cache
        with:
          path: node_modules
          key: modules-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
          # No restore-keys - we want exact match only for node_modules

      # npm ci deletes node_modules first, so only run it on a cache miss
      - name: Install dependencies
        if: steps.modules-cache.outputs.cache-hit != 'true'
        run: npm ci  # Pulls tarballs from ~/.npm when they are cached
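
For a monorepo with several lock files, the built-in setup-node cache can be pointed at them through its `cache-dependency-path` input. The following is a sketch only; the `packages/*/` layout is an assumed directory structure, not something taken from the scenario.

```yaml
- name: Setup Node.js with cache (monorepo)
  uses: actions/setup-node@v4
  with:
    node-version: 18
    cache: 'npm'
    # Hypothetical layout: one lock file per package under packages/.
    # The cache key is derived from the hash of every matching file.
    cache-dependency-path: 'packages/*/package-lock.json'
```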

Step 2: Cache Build Outputs

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 18
          cache: 'npm'

      - run: npm ci

      # Cache Next.js build
      - name: Cache Next.js build
        uses: actions/cache@v4
        with:
          path: |
            .next/cache
          key: nextjs-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}-${{ hashFiles('src/**', 'app/**') }}
          restore-keys: |
            nextjs-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}-
            nextjs-${{ runner.os }}-

      # Cache TypeScript build info (restore before the build so tsc can reuse it)
      - name: Cache TypeScript
        uses: actions/cache@v4
        with:
          path: |
            *.tsbuildinfo
            dist/**/*.tsbuildinfo
          key: tsc-${{ runner.os }}-${{ hashFiles('src/**/*.ts', 'tsconfig.json') }}
          # Fall back to the most recent build info; an exact hit means nothing changed
          restore-keys: |
            tsc-${{ runner.os }}-

      - name: Build
        run: npm run build

Step 3: Cache Docker Builds

jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Set up Docker Buildx for advanced caching
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      # Option 1: GitHub Actions cache backend
      - name: Build with GHA cache
        uses: docker/build-push-action@v5
        with:
          context: .
          push: false
          tags: myapp:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

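      # Option 2: Registry cache (better for large images); use instead of Option 1.
      # push: true requires an earlier docker/login-action step for the registry.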
      - name: Build with registry cache
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: registry.example.com/myapp:${{ github.sha }}
          cache-from: type=registry,ref=registry.example.com/myapp:buildcache
          cache-to: type=registry,ref=registry.example.com/myapp:buildcache,mode=max
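
When one repository builds more than one image, the GHA cache backend's `scope` parameter keeps each image's layers from overwriting the others, since all builds otherwise share the default scope. A minimal sketch, assuming a hypothetical `./api` build context and an `api` scope name:

```yaml
- name: Build API image with its own cache scope
  uses: docker/build-push-action@v5
  with:
    context: ./api              # hypothetical path
    push: false
    tags: api:${{ github.sha }}
    # scope names are arbitrary; use one per image
    cache-from: type=gha,scope=api
    cache-to: type=gha,mode=max,scope=api
```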

Step 4: Cache Playwright Browsers

jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 18
          cache: 'npm'

      - run: npm ci

      # Cache Playwright browsers
      - name: Get Playwright version
        id: playwright-version
        run: echo "version=$(npm ls @playwright/test --json | jq -r '.dependencies["@playwright/test"].version')" >> "$GITHUB_OUTPUT"

      - name: Cache Playwright browsers
        uses: actions/cache@v4
        id: playwright-cache
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ steps.playwright-version.outputs.version }}

      - name: Install Playwright browsers
        if: steps.playwright-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps

      - name: Install Playwright deps only
        if: steps.playwright-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps

      - name: Run E2E tests
        run: npm run test:e2e

Step 5: Multi-Language Caching

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Python - pip cache
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      # Go - module cache
      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.21'
          cache: true

      # Rust - cargo cache
      - name: Cache Rust
        uses: actions/cache@v4
        with:
          path: |
            ~/.cargo/bin/
            ~/.cargo/registry/index/
            ~/.cargo/registry/cache/
            ~/.cargo/git/db/
            target/
          key: cargo-${{ runner.os }}-${{ hashFiles('**/Cargo.lock') }}

      # Java - Maven cache
      - name: Set up JDK
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
          cache: 'maven'

      # Ruby - Bundler cache
      - name: Set up Ruby
        uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.2'
          bundler-cache: true

Step 6: Advanced Cache Patterns

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # A default shallow checkout is enough: hashFiles() reads the working tree, not Git history
      - uses: actions/checkout@v4

      # Cache with branch awareness
      - name: Cache with branch fallback
        uses: actions/cache@v4
        with:
          path: |
            ~/.npm
            node_modules
          key: deps-${{ runner.os }}-${{ github.ref }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            deps-${{ runner.os }}-${{ github.ref }}-
            deps-${{ runner.os }}-refs/heads/main-
            deps-${{ runner.os }}-

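      # Conditional cache save: restore on every branch, save only on main
      - name: Restore test results cache
        uses: actions/cache/restore@v4
        with:
          path: .test-cache
          key: test-${{ runner.os }}-${{ github.sha }}
          restore-keys: |
            test-${{ runner.os }}-

      # ... steps that populate .test-cache run here ...

      # Only save on main branch to keep cache size manageable
      - name: Save test results cache
        if: github.ref == 'refs/heads/main'
        uses: actions/cache/save@v4
        with:
          path: .test-cache
          key: test-${{ runner.os }}-${{ github.sha }}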

      # Cache with TTL simulation (using date in key)
      # The week must be computed before the cache step that reads it
      - name: Get week number
        id: date
        run: echo "week=$(date +%Y-%W)" >> "$GITHUB_OUTPUT"

      - name: Cache with weekly refresh
        uses: actions/cache@v4
        with:
          path: ~/.cache/heavy-deps
          key: heavy-${{ runner.os }}-week-${{ steps.date.outputs.week }}
          # No restore-keys: falling back to an older week's entry would defeat the refresh
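
Managing cache lifecycle also means removing entries that will never be read again. The separate workflow below is a sketch of one way to do that: it deletes caches scoped to a pull request once the PR closes, using the `gh cache` commands of the GitHub CLI preinstalled on hosted runners. The workflow and job names are illustrative.

```yaml
name: Cleanup PR caches

on:
  pull_request:
    types: [closed]

jobs:
  cleanup:
    runs-on: ubuntu-latest
    permissions:
      actions: write  # needed to delete cache entries
    steps:
      - name: Delete caches created for this pull request
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GH_REPO: ${{ github.repository }}
          BRANCH: refs/pull/${{ github.event.pull_request.number }}/merge
        run: |
          # List cache IDs scoped to the PR's merge ref
          cache_ids=$(gh cache list --ref "$BRANCH" --limit 100 --json id --jq '.[].id')
          # Keep going if a single deletion fails
          set +e
          for id in $cache_ids; do
            gh cache delete "$id"
          done
```

Caches created on a PR ref cannot be restored by other branches, so removing them on close frees space for the main-branch entries that later runs actually use.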

Step 7: Complete Optimized Workflow

name: Optimized CI

on:
  push:
    branches: [main]
  pull_request:

env:
  NODE_VERSION: 18
  CACHE_VERSION: v1  # Increment to invalidate all caches

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'

      # Dependency cache
      - name: Cache dependencies
        uses: actions/cache@v4
        id: deps-cache
        with:
          path: node_modules
          key: ${{ env.CACHE_VERSION }}-deps-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install dependencies
        if: steps.deps-cache.outputs.cache-hit != 'true'
        run: npm ci

      # Build cache
      - name: Cache build
        uses: actions/cache@v4
        with:
          path: |
            .next/cache
            dist
          key: ${{ env.CACHE_VERSION }}-build-${{ runner.os }}-${{ hashFiles('src/**', 'package-lock.json') }}
          restore-keys: |
            ${{ env.CACHE_VERSION }}-build-${{ runner.os }}-

      - name: Build
        run: npm run build

      - name: Test
        run: npm test

  docker:
    needs: build-and-test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: docker/setup-buildx-action@v3

      - uses: docker/build-push-action@v5
        with:
          context: .
          push: false
          tags: app:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
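
To check whether the caches are actually paying off, a reporting step can be appended to the build-and-test job. This is a sketch, not part of the original workflow: it writes the `cache-hit` output of the existing `deps-cache` step to the job summary, treating an empty output (a miss) as `false`.

```yaml
- name: Report cache status
  if: always()
  run: |
    echo "### Cache status" >> "$GITHUB_STEP_SUMMARY"
    echo "- node_modules cache hit: ${{ steps.deps-cache.outputs.cache-hit || 'false' }}" >> "$GITHUB_STEP_SUMMARY"
```

Comparing run durations on hit and miss runs then shows how close the workflow is to the 60% target.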

Cache Strategy Comparison

| Cache Target | Key Strategy | Restore Keys | Size |
|---|---|---|---|
| npm packages | hashFiles('package-lock.json') | OS prefix | ~500MB |
| node_modules | Exact lock hash only | None (exact match) | ~800MB |
| Build output | Source + deps hash | Deps hash, OS | ~100MB |
| Docker layers | Content hash | Registry ref | ~2GB |
| Playwright | Playwright version | None | ~400MB |

Practice Question

Why should you cache ~/.npm (the npm cache directory) instead of node_modules directly?