Healthchecks + signals cheatsheet.
HEALTHCHECK
HEALTHCHECK --interval=30s \
    --timeout=5s \
    --start-period=30s \
    --retries=3 \
    CMD curl -fsS http://localhost:8000/health || exit 1
docker inspect --format='{{.State.Health.Status}}' web
# starting, healthy, unhealthy
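States: starting (during start-period) → healthy / unhealthy. A container turns unhealthy after --retries consecutive failed probes; failures during the start period do not count.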
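The state transitions can be sketched as a small model. This is a simplified illustration of the documented behaviour, not Docker's own code; the class name HealthModel and its fields are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class HealthModel:
    """Simplified model of the healthcheck state machine (not Docker's code)."""

    retries: int = 3
    start_period: float = 30.0
    status: str = "starting"
    failures: int = 0

    def probe(self, elapsed: float, ok: bool) -> str:
        """Record one probe result taken `elapsed` seconds after container start."""
        if ok:
            # Any successful probe resets the failure streak.
            self.failures = 0
            self.status = "healthy"
        elif self.status == "starting" and elapsed < self.start_period:
            # Failures inside the start period are not counted,
            # as long as no probe has succeeded yet.
            pass
        else:
            self.failures += 1
            if self.failures >= self.retries:
                self.status = "unhealthy"
        return self.status
```

Feeding it one failure at 5s, a success at 10s, then three failures in a row walks through starting → healthy → unhealthy.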
Without curl
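# Pick one: only the last HEALTHCHECK in a Dockerfile takes effect.
HEALTHCHECK CMD wget --spider -q http://localhost:8000/health || exit 1
# Or Python:
HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=3)" || exit 1
# Or netcat (only checks that the port accepts connections):
HEALTHCHECK CMD nc -z localhost 8000 || exit 1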
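If the Python one-liner gets hard to read, the same probe fits in a small script. A minimal sketch using only the standard library; the function name check and the 3-second timeout are assumptions.

```python
import urllib.request


def check(url: str, timeout: float = 3.0) -> int:
    """Return 0 if `url` answers with a 2xx status within `timeout`, else 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if 200 <= resp.status < 300 else 1
    except Exception:
        # Connection refused, timeout, HTTP 4xx/5xx, or a malformed URL:
        # all of them mean "not healthy" for a probe.
        return 1
```

Saved as a hypothetical /healthcheck.py with sys.exit(check("http://localhost:8000/health")) as its last line, it can be wired in with HEALTHCHECK CMD ["python", "/healthcheck.py"].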
Compose
services:
  web:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 30s
test: a list starting with CMD (exec, no shell) or CMD-SHELL (run through the container's shell); a plain string is treated as CMD-SHELL.
depends_on with healthcheck
services:
  web:
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5
Compose starts web only after db reports healthy.
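depends_on only gates startup order. If the database restarts later, the app still needs its own retry. A minimal sketch of an app-side wait, assuming a plain TCP connect is a good enough signal; wait_for_port is a hypothetical helper, not part of any library.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Retry a TCP connect until it succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the Compose file above that would be wait_for_port("db", 5432), assuming Postgres listens on its default port.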
Disable healthcheck
HEALTHCHECK NONE
Signals: SIGTERM → SIGKILL
When you docker stop:
- Docker sends SIGTERM to PID 1.
- Waits 10s (default, --time).
- Sends SIGKILL.
For graceful shutdown, your app must:
- Receive SIGTERM.
- Stop accepting new requests.
- Finish in-flight work.
- Exit cleanly.
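The four steps above, sketched for a plain Python worker loop. The get_job and process callables are hypothetical stand-ins for whatever the app actually does; the handler only sets a flag so the real work stays in the main loop.

```python
import signal
import threading

stop_requested = threading.Event()


def handle_stop(signum, frame):
    """Signal handler: only set a flag; the main loop does the real work."""
    stop_requested.set()


def install_handlers():
    # Must be called from the main thread; Python only allows signal.signal() there.
    signal.signal(signal.SIGTERM, handle_stop)
    signal.signal(signal.SIGINT, handle_stop)


def run(get_job, process):
    """Process jobs until a stop is requested; return how many were processed.

    `get_job` returns the next job or None when idle; `process` handles one job.
    Both are hypothetical callables supplied by the app.
    """
    done = 0
    while not stop_requested.is_set():  # no new work once SIGTERM arrived
        job = get_job()
        if job is None:
            stop_requested.wait(0.1)    # idle; wakes early on shutdown
            continue
        process(job)                    # an in-flight job runs to completion
        done += 1
    return done
```

Call install_handlers() at startup, then run(...), and finish with sys.exit(0) once run returns so the container exits before the SIGKILL deadline.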
exec form vs shell form
# Shell form: CMD wraps in `sh -c`, so app is NOT PID 1
CMD python app.py
# Exec form: app IS PID 1
CMD ["python", "app.py"]
Always prefer exec form. In shell form, sh is PID 1 and typically does not forward SIGTERM to the app.
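A quick way to see which form you ended up with is to log the PID at startup. A small sketch; describe_pid and startup_banner are made-up names.

```python
import os


def describe_pid(pid: int) -> str:
    """Explain what the given PID implies for signal delivery in a container."""
    if pid == 1:
        return "PID 1: docker stop delivers SIGTERM directly to this process"
    return f"PID {pid}: a parent process (shell or init) receives SIGTERM first"


def startup_banner() -> str:
    # os.getpid() is the PID as seen inside the container's PID namespace.
    return describe_pid(os.getpid())
```

Printing startup_banner() once at boot makes a shell-form CMD easy to spot in the logs.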
ENTRYPOINT script
ENTRYPOINT ["/entrypoint.sh"]
#!/bin/sh
set -e
# Setup
echo "starting…"
# Exec to make app PID 1
exec "$@"
CMD args are appended to ENTRYPOINT, so the script receives them as "$@" and exec replaces the shell with the app.
tini (init for PID 1)
If your app doesn’t reap zombies or handle signals well:
RUN apt-get update && apt-get install -y --no-install-recommends tini && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["python", "app.py"]
Or Docker’s built-in:
docker run --init app
services:
  web:
    init: true
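For intuition, this is roughly the reaping half of what an init process does. A sketch only, not tini's implementation: a real init also forwards signals to its child and runs this whenever SIGCHLD arrives.

```python
import os


def reap_children() -> list[int]:
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

Until something calls waitpid for them, exited children stay in the process table as zombies, which is why an app that spawns subprocesses as PID 1 benefits from tini or --init.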
Stop timeout
docker stop --time=30 web
# Default is SIGTERM. Dockerfile comments must sit on their own line.
STOPSIGNAL SIGINT
services:
  web:
    stop_grace_period: 30s
    stop_signal: SIGINT
Graceful shutdown patterns
Python (gunicorn)
gunicorn config.wsgi --graceful-timeout 30
gunicorn drains workers on SIGTERM.
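The same setting can live in a config file, which gunicorn reads as plain Python. A minimal sketch; the bind address, worker count, and 60s timeout are assumptions to adjust.

```python
# gunicorn.conf.py: loaded with `gunicorn -c gunicorn.conf.py config.wsgi`
import multiprocessing

# Listen address; port 8000 matches the healthcheck examples (an assumption).
bind = "0.0.0.0:8000"

# Worker count heuristic; tune for the workload.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds workers get to finish in-flight requests after SIGTERM.
graceful_timeout = 30

# Seconds a silent worker may run before it is killed and restarted.
timeout = 60
```

Keep the container's stop timeout at or above graceful_timeout, otherwise SIGKILL arrives before the workers finish draining.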
Node.js
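const server = app.listen(3000);

function shutdown(signal) {
  console.log(`${signal} received, shutting down`);
  // Stop accepting new connections; the callback runs once open ones finish.
  server.close(() => {
    console.log("server closed");
    process.exit(0);
  });
  // Force exit if draining takes longer than 30s; unref() keeps this timer
  // from holding the event loop open by itself.
  setTimeout(() => process.exit(1), 30000).unref();
}

process.on("SIGTERM", shutdown);
process.on("SIGINT", shutdown);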
Go
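ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
defer stop()
server := &http.Server{...}
go func() {
	// ErrServerClosed is the normal result of Shutdown, not a failure.
	if err := server.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
		log.Fatal(err)
	}
}()
<-ctx.Done() // SIGTERM or SIGINT received
// Give in-flight requests up to 30s to finish.
shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := server.Shutdown(shutdownCtx); err != nil {
	log.Printf("shutdown did not finish cleanly: %v", err)
}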
Liveness vs readiness
Docker has one healthcheck. K8s distinguishes:
- Liveness: am I alive? Failed → restart container.
- Readiness: am I ready to serve? Failed → remove from service.
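Docker’s HEALTHCHECK is more like liveness, with one difference: plain Docker only reports the status and does not restart an unhealthy container by itself (Swarm replaces unhealthy tasks).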
Common healthcheck endpoint
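# FastAPI shown; Flask and Django need their own routing and error types
from fastapi import FastAPI, HTTPException

app = FastAPI()

@app.get("/health")
def health():
    return {"ok": True}

# Deeper: check DB / cache
@app.get("/health/ready")
def ready():
    try:
        db.execute("SELECT 1")  # db: your app's database handle (not defined here)
        return {"ok": True}
    except Exception:
        raise HTTPException(status_code=503)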
/health = liveness (cheap), /health/ready = readiness (more checks).
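When readiness depends on more than one backend, a framework-free helper keeps the endpoint body small. A sketch under the assumption that each check is a zero-argument callable that raises on failure; the name readiness and the response shape are made up.

```python
from typing import Callable, Dict, Tuple


def readiness(checks: Dict[str, Callable[[], None]]) -> Tuple[int, dict]:
    """Run each dependency check; return (HTTP status, JSON-able body).

    A check passes if it returns without raising.
    """
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:
            # Report only the exception type to avoid leaking details.
            results[name] = f"error: {type(exc).__name__}"
    status = 200 if all(v == "ok" for v in results.values()) else 503
    return status, {"ok": status == 200, "checks": results}
```

The /health/ready handler can then return the body with the given status code, while /health stays a constant response.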
Logging health probes
Skip them in access logs:
location = /health {
    access_log off;
    return 200 "ok";
}
Don’t fill logs with periodic probes.
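If the app server writes its own access log through Python's logging module, a filter does the same job as the nginx rule. A minimal sketch; the class name SkipHealthProbes is made up and the match is a simple substring test on the formatted message.

```python
import logging


class SkipHealthProbes(logging.Filter):
    """Drop access-log records whose message mentions a health endpoint."""

    def __init__(self, paths=("/health",)):
        super().__init__()
        self.paths = tuple(paths)

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        # Returning False tells the logging module to discard the record.
        # The leading space matches the path as it appears after the HTTP method.
        return not any(f" {path}" in message for path in self.paths)
```

Attach it with addFilter on the access logger, for example the one named uvicorn.access or gunicorn.access depending on the server. Because it is a prefix match, /health/ready is skipped too.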
Common mistakes
- Shell form CMD → SIGTERM doesn’t reach app.
- 10s default stop time too short for batch jobs.
- HEALTHCHECK calls expensive endpoint → app load.
- HEALTHCHECK without start_period → unhealthy during slow boot.
- App doesn’t handle SIGTERM → 10s wait then SIGKILL → dropped requests.
Read this next
If you want my graceful shutdown templates (Python/Node/Go), they’re at rajpoot.dev.