Questions
A container keeps exiting immediately after starting. How do you debug it?
The Scenario
You deploy a new version of your API service and the container keeps crashing:
$ docker ps -a
CONTAINER ID   IMAGE        STATUS                     NAMES
a1b2c3d4e5f6   api:v2.1.0   Exited (1) 2 seconds ago   api-service
Every time you start it, it exits within seconds. The previous version (v2.0.0) works fine. Production is partially down and you need to fix this quickly.
The Challenge
Walk through your systematic debugging process. What commands would you run, in what order, and why? How do you identify whether this is a code issue, configuration problem, or missing dependency?
A junior engineer might immediately re-run with docker run -it hoping to see output, rebuild the image on the assumption the build failed, check whether the port is already in use, or simply roll back without understanding the issue. None of these work: -it shows nothing if the process crashes before producing any output, rebuilding without a diagnosis just reproduces the problem, a port conflict would surface a different error, and rolling back leaves the issue waiting to recur.
A senior engineer follows a systematic approach: first check the exit code and container logs, then inspect the container configuration, compare it with the working version, and, if needed, override the entrypoint to get shell access for live debugging. The exit code reveals the failure type, and the logs usually contain the actual error message.
Step 1: Check Exit Code and Logs (First 30 seconds)
# Check the exit code
docker inspect api-service --format='{{.State.ExitCode}}'
# Exit code 1 = application error
# Exit code 137 = SIGKILL, usually OOM kill (128 + 9)
# Exit code 139 = Segmentation fault
# Exit code 143 = SIGTERM (graceful shutdown)
# Get container logs
docker logs api-service
# If container restarts too fast, get logs from last run
docker logs api-service 2>&1 | tail -100
Common log patterns:
- Error: Cannot find module 'express' → Missing dependency
- EADDRINUSE → Port already in use
- permission denied → File/directory access issue
- ECONNREFUSED → Can't reach database/dependency
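The "128 + signal number" convention above can be decoded mechanically. A minimal sketch (the decode_exit helper is illustrative, not a Docker command; kill -l translates a signal number into its name):

```shell
# decode_exit: rough classification of a container exit code, following
# the convention above (128 + N means "killed by signal N").
decode_exit() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "clean exit"
  elif [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    # kill -l maps a signal number to its name (e.g. 9 -> KILL)
    echo "killed by signal $sig ($(kill -l "$sig"))"
  else
    echo "application error (code $code)"
  fi
}

decode_exit 137   # signal 9 (KILL), typically an OOM kill
decode_exit 1     # generic application error
```

This is handy when you see an unfamiliar code like 139 and want to know the signal behind it without memorizing the table.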
Step 2: Inspect Container Configuration
# Check full container details
docker inspect api-service
# Key things to look for:
docker inspect api-service --format='
CMD: {{.Config.Cmd}}
Entrypoint: {{.Config.Entrypoint}}
Env: {{range .Config.Env}}{{.}} {{end}}
WorkingDir: {{.Config.WorkingDir}}
'
# Compare with working version
docker inspect api:v2.0.0 --format='{{.Config.Cmd}}'
docker inspect api:v2.1.0 --format='{{.Config.Cmd}}'
Step 3: Get Shell Access for Live Debugging
# Override entrypoint to get shell access
docker run -it --entrypoint /bin/sh api:v2.1.0
# Inside the container, manually run the command
$ node server.js
# Now you'll see the actual error in real-time
# Check if files exist
$ ls -la /app
$ cat /app/package.json
# Check environment variables
$ env | grep -i database
Step 4: Check Resource Constraints
# Was it killed due to memory limits?
docker inspect api-service --format='{{.State.OOMKilled}}'
# Check memory limit vs usage (only meaningful while the container is running)
docker stats api-service --no-stream
# Check if there are resource limits
docker inspect api-service --format='
Memory Limit: {{.HostConfig.Memory}}
CPU Shares: {{.HostConfig.CpuShares}}
'
Step 5: Compare Image Layers
# Check what changed between versions
docker history api:v2.0.0
docker history api:v2.1.0
# Use dive tool for detailed analysis
dive api:v2.1.0
Common Root Causes and Fixes
| Exit Code | Meaning | Common Causes | Fix |
|---|---|---|---|
| 0 | Success | CMD finished (not a daemon) | Ensure process runs in foreground |
| 1 | General error | Application crash, missing config | Check logs for specific error |
| 126 | Permission denied | Script not executable | chmod +x script.sh |
| 127 | Command not found | Wrong CMD/ENTRYPOINT path | Verify binary exists in image |
| 137 | SIGKILL (OOM) | Memory limit exceeded | Increase memory or fix leak |
| 139 | SIGSEGV | Segmentation fault | Debug application code |
| 143 | SIGTERM | Graceful shutdown | Check why container was stopped |
Real-World Example: Missing Environment Variable
Logs show:
Error: DATABASE_URL environment variable is required
at validateConfig (/app/dist/config.js:15:11)
at Object.<anonymous> (/app/dist/server.js:3:1)
Root Cause: New version added database URL validation, but the environment variable wasn’t set in the docker run command.
Fix:
# Add the missing environment variable
docker run -d \
-e DATABASE_URL=postgres://user:pass@db:5432/myapp \
--name api-service \
api:v2.1.0
Debugging Quick Reference
# Full debugging workflow
docker logs <container> # Check logs
docker inspect <container> --format='{{.State}}' # Check state
docker diff <container> # Check filesystem changes
docker top <container> # Check running processes
docker exec -it <container> /bin/sh # Get shell (if running)
docker run -it --entrypoint /bin/sh <image> # Override entrypoint
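The workflow above can be wrapped into a small first-pass triage script. This is a sketch, not an official tool: the DOCKER_CMD indirection is an assumption added only so the function can be exercised against a stub when no daemon is available.

```shell
# triage: run the first-pass checks from the quick reference in order.
# DOCKER_CMD defaults to the real CLI but can be pointed at a stub.
DOCKER_CMD="${DOCKER_CMD:-docker}"

triage() {
  name="$1"
  echo "exit code: $("$DOCKER_CMD" inspect "$name" --format='{{.State.ExitCode}}')"
  echo "oom killed: $("$DOCKER_CMD" inspect "$name" --format='{{.State.OOMKilled}}')"
  echo "last log lines:"
  "$DOCKER_CMD" logs "$name" 2>&1 | tail -20
}

# Usage: triage api-service
```

Running one command that prints exit code, OOM status, and recent logs together covers the first minute of the process described in Steps 1 and 4.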
Practice Question
A container exits with code 137. What is the most likely cause?