DeployU

Docker builds are taking 10 minutes. Optimize the Dockerfile to use layer caching effectively.

The Scenario

Your CI/CD pipeline builds this Docker image on every commit:

FROM node:18
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
CMD ["node", "dist/server.js"]

Current build metrics:

  • Average build time: 10 minutes
  • npm install runs every build (5 minutes)
  • Any file change invalidates all layers
  • CI costs are increasing rapidly

The Challenge

Optimize this Dockerfile to take advantage of Docker’s layer caching. Explain how layer caching works and why the order of instructions matters.

Wrong Approach

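A junior engineer might add the --cache-from flag and hope it helps, merge every RUN command into a single layer, or overlook why the order of COPY instructions matters. None of these fixes the problem: --cache-from only tells the builder where to import cache from, and an imported cache is still invalidated by COPY . .; merging RUN commands means a change to any one command reruns all of them; and neither addresses the key point, which is that dependency installation must be cached separately from source code.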

Right Approach

A senior engineer understands that Docker caches the result of each instruction as a layer, and that a cache miss on one layer forces every subsequent layer to be rebuilt. The optimization strategy is to order instructions from least frequently changed to most frequently changed: 1) copy the dependency manifests first, 2) install dependencies, 3) then copy the source code. This way, source code changes don't invalidate the expensive dependency installation layer.

Understanding Docker Layer Caching

Each instruction creates a layer:
FROM node:18           → Layer 1 (base image)
WORKDIR /app           → Layer 2 (cached unless base changes)
COPY . .               → Layer 3 (invalidated on ANY file change!)
RUN npm install        → Layer 4 (rebuilt because layer 3 changed)
RUN npm run build      → Layer 5 (rebuilt because layer 4 changed)

The problem: COPY . . invalidates the cache whenever ANY file changes, forcing npm install to run every time.
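
The chain reaction above can be sketched with a small Python model. This is a simplified illustration, not Docker's actual implementation: it assumes each layer's cache key is a hash of the parent key, the instruction text, and the contents of any files the instruction copies. The names layer_keys and rebuilt, and the three-file project, are made up for this example.

```python
import hashlib

def layer_keys(instructions, files):
    """Return one cache key per instruction.

    instructions: list of (text, copied_paths); copied_paths names the files
    a COPY reads (empty for FROM, WORKDIR, RUN).
    files: dict mapping file name -> file content.
    """
    keys = []
    parent = ""
    for text, copied in instructions:
        h = hashlib.sha256()
        h.update(parent.encode())   # a layer's key depends on its parent's key
        h.update(text.encode())     # ...and on the instruction text
        for name in sorted(copied): # ...and on the content of copied files
            h.update(name.encode())
            h.update(files[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys

def rebuilt(instructions, before, after):
    """List the instructions whose cache key differs between two file sets."""
    old = layer_keys(instructions, before)
    new = layer_keys(instructions, after)
    return [text for (text, _), a, b in zip(instructions, old, new) if a != b]

ALL = ["package.json", "package-lock.json", "src/server.js"]
MANIFESTS = ["package.json", "package-lock.json"]

naive = [
    ("FROM node:18", []),
    ("WORKDIR /app", []),
    ("COPY . .", ALL),
    ("RUN npm install", []),
    ("RUN npm run build", []),
]
ordered = [
    ("FROM node:18", []),
    ("WORKDIR /app", []),
    ("COPY package.json package-lock.json ./", MANIFESTS),
    ("RUN npm ci", []),
    ("COPY . .", ALL),
    ("RUN npm run build", []),
]

before = {"package.json": "{}", "package-lock.json": "{}", "src/server.js": "v1"}
after = {**before, "src/server.js": "v2"}  # only the source file changes

print(rebuilt(naive, before, after))
# ['COPY . .', 'RUN npm install', 'RUN npm run build']
print(rebuilt(ordered, before, after))
# ['COPY . .', 'RUN npm run build']
```

In the model, the reordered instruction list leaves the install step's key untouched after a source-only change, which is the behaviour the rest of this answer relies on.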

Step 1: Separate Dependencies from Source Code

FROM node:18-alpine
WORKDIR /app

# Step 1: Copy ONLY package files (changes rarely)
COPY package.json package-lock.json ./

# Step 2: Install dependencies (cached if package*.json unchanged)
RUN npm ci

# Step 3: Copy source code (changes frequently)
COPY . .

# Step 4: Build (reruns only when the source or the dependencies change)
RUN npm run build

CMD ["node", "dist/server.js"]

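Result: if only source code changes, Docker reuses the cached npm ci layer and skips the 5-minute install. npm ci is used instead of npm install because it installs exactly what package-lock.json specifies, which keeps builds reproducible.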
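
As a rough back-of-the-envelope estimate, the saving can be computed from the scenario's figures (10-minute build, 5-minute install). The share of commits that touch the package manifests is not given in the scenario; the 10% used below is a purely hypothetical value, and the function name is made up for this sketch.

```python
def expected_build_minutes(total, install, manifest_change_rate):
    """Average build time when the install layer is reused on every commit
    that leaves the package manifests unchanged."""
    return total - install * (1 - manifest_change_rate)

# Hypothetical assumption: 1 commit in 10 changes package.json or the lockfile.
print(expected_build_minutes(total=10, install=5, manifest_change_rate=0.1))
```

Under that assumption the average build drops to roughly 5.5 minutes; the remaining time is spent in the steps that still run on every commit.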

Step 2: Add .dockerignore

# .dockerignore - prevent unnecessary cache invalidation
node_modules
npm-debug.log
.git
.gitignore
*.md
Dockerfile*
.dockerignore
coverage
.env*
dist
.nyc_output
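
To check which files the patterns above would keep out of the build context, a rough approximation can be written in Python. This is a sketch under stated assumptions, not Docker's matcher: it anchors each pattern at the context root, compares path components with fnmatch, treats a matching directory as excluding everything below it, and ignores ** wildcards and ! exceptions. The function name is_ignored and the sample paths are invented for the example.

```python
from fnmatch import fnmatchcase

PATTERNS = [
    "node_modules", "npm-debug.log", ".git", ".gitignore", "*.md",
    "Dockerfile*", ".dockerignore", "coverage", ".env*", "dist", ".nyc_output",
]

def is_ignored(path, patterns=PATTERNS):
    """Return True if a context-relative path matches any pattern.

    A pattern matches when its components match the leading components of
    the path, so excluding a directory also excludes its contents.
    """
    parts = path.split("/")
    for pattern in patterns:
        pat_parts = pattern.split("/")
        if len(parts) < len(pat_parts):
            continue
        # zip stops at the end of the pattern, so only the path prefix is compared
        if all(fnmatchcase(p, q) for p, q in zip(parts, pat_parts)):
            return True
    return False

for path in ["node_modules/express/index.js", "README.md",
             "docs/guide.md", ".env.local", "src/server.js"]:
    print(path, is_ignored(path))
```

Note that in this model, as in .dockerignore itself, *.md only matches Markdown files at the context root; nested files such as docs/guide.md would need a pattern like **/*.md.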

Step 3: Use BuildKit for Parallel Builds

# syntax=docker/dockerfile:1.4
FROM node:18-alpine AS base
WORKDIR /app

# Dependencies stage
FROM base AS deps
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# Build stage
FROM base AS builder
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Production stage
FROM base AS runner
COPY --from=builder /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]

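Enable BuildKit (it is already the default builder in Docker Engine 23.0 and later, so the variable is only needed on older versions):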

DOCKER_BUILDKIT=1 docker build -t myapp .
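
Whether BuildKit can run stages in parallel depends on how the stages reference each other. The sketch below writes the stage dependencies of the Dockerfile above by hand (FROM and COPY --from references) and groups stages into "waves" that could run concurrently. It is only a model of the scheduling idea; the alternative layout with an assets stage is hypothetical and does not appear in the Dockerfile above.

```python
from graphlib import TopologicalSorter

# stage -> stages it depends on (via FROM or COPY --from)
STAGES = {
    "base": set(),
    "deps": {"base"},
    "builder": {"base", "deps"},
    "runner": {"base", "builder", "deps"},
}

def build_waves(stages):
    """Group stages into waves; stages in the same wave have no unmet
    dependencies on each other and could be built at the same time."""
    sorter = TopologicalSorter(stages)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

print(build_waves(STAGES))
# [['base'], ['deps'], ['builder'], ['runner']]

# Hypothetical layout: an independent "assets" stage next to "deps".
PARALLEL = {
    "base": set(),
    "deps": {"base"},
    "assets": {"base"},
    "runner": {"deps", "assets"},
}
print(build_waves(PARALLEL))
# [['base'], ['assets', 'deps'], ['runner']]
```

For the Dockerfile above the stages form a chain, so the benefit comes mainly from caching and a smaller final image; parallelism only appears once two stages are independent of each other.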

Step 4: Cache npm Downloads

# syntax=docker/dockerfile:1.4
FROM node:18-alpine
WORKDIR /app

COPY package.json package-lock.json ./

# Mount cache for npm packages
RUN --mount=type=cache,target=/root/.npm \
    npm ci --prefer-offline

COPY . .
RUN npm run build

CMD ["node", "dist/server.js"]

BuildKit cache mounts:

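  • --mount=type=cache persists a directory (here npm's download cache at /root/.npm) across builds on the same builder
  • npm re-downloads only the packages missing from that cache; the same technique works for yarn with its own cache directory
  • Dependency installation speeds up significantly even when package-lock.json changes and the layer cache misses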
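
The difference from layer caching is that a cache mount is incremental: the layer cache is all-or-nothing, while the mounted directory lets the installer skip individual downloads. A minimal Python model of that idea follows. The install function and the set standing in for /root/.npm are invented for this sketch, and the package names and versions are illustrative only.

```python
def install(lockfile, download_cache):
    """Simulate `npm ci` with a persistent download cache.

    lockfile: dict of package name -> version.
    download_cache: set of (name, version) pairs kept between builds,
    standing in for the directory mounted at /root/.npm.
    Returns the packages that had to be fetched from the network.
    """
    fetched = []
    for name, version in sorted(lockfile.items()):
        if (name, version) not in download_cache:
            fetched.append(f"{name}@{version}")
            download_cache.add((name, version))
    return fetched

cache = set()  # survives across builds, like the cache mount

first = install({"express": "4.18.2", "lodash": "4.17.21"}, cache)
# One dependency is added, so the layer cache for RUN npm ci misses...
second = install({"express": "4.18.2", "lodash": "4.17.21", "zod": "3.22.4"}, cache)

print(first)   # ['express@4.18.2', 'lodash@4.17.21']
print(second)  # ['zod@3.22.4']  ...but only the new package is downloaded
```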

Layer Caching Best Practices

Instruction        | Cache Invalidation             | Optimization
COPY . .           | Any file change                | Split into multiple COPYs
RUN npm install    | package.json change            | Copy package*.json first
RUN apt-get update | Stays cached; index goes stale | Combine with install in one layer
ARG VERSION        | Value change                   | Put after static layers
ENV                | Value change                   | Put late if dynamic

Optimized Build Order

# syntax=docker/dockerfile:1.4
FROM node:18-alpine AS builder

# 1. Install system dependencies (rarely changes)
RUN apk add --no-cache python3 make g++

WORKDIR /app

# 2. Copy dependency manifests (changes weekly)
COPY package.json package-lock.json ./

# 3. Install dependencies with cache mount
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# 4. Copy configuration files (changes monthly)
COPY tsconfig.json ./

# 5. Copy source code (changes on every commit)
COPY src ./src

# 6. Build application
RUN npm run build

# Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]

CI/CD Cache Configuration

# GitHub Actions example
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3

- name: Build and push
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: myapp:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max

Measuring Build Performance

# Time the build
time docker build -t myapp .

# Show layer sizes
docker history myapp

# Analyze with dive
dive myapp

Practice Question

Why should you COPY package.json before COPY . . in a Node.js Dockerfile?