Cheatsheet for shipping it. Long-form version: Textbook Ch 12.
Dockerfile (uv + multi-stage + slim runtime)
# syntax=docker/dockerfile:1.7
FROM python:3.13-slim AS builder
WORKDIR /app
RUN pip install --no-cache-dir uv
COPY pyproject.toml uv.lock ./
# Install locked dependencies only; the project source is copied in the final stage.
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev --no-install-project

FROM python:3.13-slim
WORKDIR /app
COPY --from=builder /app/.venv ./.venv
COPY src ./src
ENV PATH="/app/.venv/bin:$PATH" \
    PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1
USER 1000:1000
EXPOSE 8000
CMD ["uvicorn", "src.myapp.main:app", "--host", "0.0.0.0", "--port", "8000", "--proxy-headers", "--forwarded-allow-ips=*"]
.dockerignore
.git
.venv
__pycache__
*.pyc
.env
.env.local
.pytest_cache
htmlcov
dist
.mypy_cache
.ruff_cache
node_modules
Uvicorn flags
uvicorn src.myapp.main:app \
  --host 0.0.0.0 --port 8000 \
  --workers 4 \
  --proxy-headers --forwarded-allow-ips='*' \
  --timeout-keep-alive 5 \
  --timeout-graceful-shutdown 30 \
  --limit-concurrency 500 \
  --backlog 1024 \
  --access-log
On K8s, usually run 1 worker per container (omit --workers) and let K8s scale horizontally by adding replicas.
Gunicorn + Uvicorn workers
gunicorn src.myapp.main:app \
  -k uvicorn.workers.UvicornWorker \
  -w 4 \
  -b 0.0.0.0:8000 \
  --timeout 60 \
  --graceful-timeout 30 \
  --keep-alive 5 \
  --access-logfile -
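The same settings can live in a Gunicorn config file, which is plain Python and keeps the container command short. A minimal sketch mirroring the flags above; the filename gunicorn.conf.py and the WEB_CONCURRENCY override are assumptions for illustration, not part of the original setup.

```python
# gunicorn.conf.py (assumed filename)
# Load with: gunicorn -c gunicorn.conf.py src.myapp.main:app
import os

bind = "0.0.0.0:8000"                           # same as -b
worker_class = "uvicorn.workers.UvicornWorker"  # same as -k
# Assumption: allow an env override, defaulting to the 4 workers used above.
workers = int(os.environ.get("WEB_CONCURRENCY", "4"))
timeout = 60            # seconds before a silent worker is killed and restarted
graceful_timeout = 30   # seconds a worker gets to finish requests on shutdown
keepalive = 5           # seconds to hold idle keep-alive connections
accesslog = "-"         # "-" sends the access log to stdout
```

Keeping graceful_timeout below the pod's terminationGracePeriodSeconds leaves room for in-flight requests to finish before K8s sends SIGKILL.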
K8s deployment
apiVersion: apps/v1
kind: Deployment
metadata: { name: myapi }
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate: { maxSurge: 1, maxUnavailable: 0 }
  selector: { matchLabels: { app: myapi } }
  template:
    metadata: { labels: { app: myapi } }
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: api
          image: ghcr.io/me/myapi:1.2.3
          ports: [{ containerPort: 8000 }]
          resources:
            requests: { cpu: "200m", memory: "256Mi" }
            limits: { memory: "512Mi" }
          startupProbe:
            httpGet: { path: /healthz, port: 8000 }
            periodSeconds: 5
            failureThreshold: 30
          readinessProbe:
            httpGet: { path: /ready, port: 8000 }
            periodSeconds: 5
          livenessProbe:
            httpGet: { path: /healthz, port: 8000 }
            periodSeconds: 10
          envFrom:
            - secretRef: { name: myapi-secrets }
            - configMapRef: { name: myapi-config }
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 10"]
---
apiVersion: v1
kind: Service
metadata: { name: myapi }
spec:
  selector: { app: myapi }
  ports: [{ port: 80, targetPort: 8000 }]
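The probes above hit /healthz and /ready, but the manifest does not say what those endpoints check. A common split is to keep /healthz dependency-free (so a database outage does not restart every pod) and let /ready verify dependencies. Below is a framework-agnostic sketch of the readiness decision; run_readiness, db_ok and cache_down are hypothetical names, and the 2-second timeout is an assumption.

```python
import asyncio
from typing import Awaitable, Callable

# A check passes if it returns without raising.
Check = Callable[[], Awaitable[None]]


async def run_readiness(
    checks: dict[str, Check], timeout: float = 2.0
) -> tuple[int, dict[str, str]]:
    """Run each dependency check with a timeout; return (HTTP status, per-check result)."""
    results: dict[str, str] = {}
    for name, check in checks.items():
        try:
            await asyncio.wait_for(check(), timeout=timeout)
            results[name] = "ok"
        except Exception as exc:  # a timeout also lands here
            results[name] = f"fail: {type(exc).__name__}"
    status = 200 if all(v == "ok" for v in results.values()) else 503
    return status, results


# Hypothetical checks standing in for "SELECT 1" on the DB pool, a cache ping, etc.
async def db_ok() -> None:
    await asyncio.sleep(0)


async def cache_down() -> None:
    raise ConnectionError("cache unreachable")


if __name__ == "__main__":
    print(asyncio.run(run_readiness({"db": db_ok})))
    print(asyncio.run(run_readiness({"db": db_ok, "cache": cache_down})))
```

In a FastAPI route, the returned status and dict could be passed to a JSONResponse. A 503 from /ready removes the pod from the Service endpoints without restarting it.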
HPA (CPU)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: { name: myapi }
spec:
  scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: myapi }
  minReplicas: 3
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target: { type: Utilization, averageUtilization: 70 }
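The 70% target is measured against each pod's CPU request (200m in the Deployment, so roughly 140m per pod on average). To get a feel for how the HPA reacts, here is a rough sketch of the replica calculation described in the Kubernetes HPA documentation. It ignores stabilization windows, scaling policies and pod readiness handling; the function name is hypothetical and the 0.1 tolerance assumes a default cluster configuration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,      # average CPU as % of the request
    target_utilization: float = 70,  # averageUtilization from the HPA
    min_replicas: int = 3,
    max_replicas: int = 30,
    tolerance: float = 0.1,          # assumed default; skips small deviations
) -> int:
    """Approximate the HPA result: ceil(current * currentMetric / targetMetric), clamped."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(3, 140))   # 6: load is double the target
    print(desired_replicas(3, 72))    # 3: within tolerance
    print(desired_replicas(10, 35))   # 5: scale down
    print(desired_replicas(20, 210))  # 30: capped at maxReplicas
```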
Migration as Job (Helm)
apiVersion: batch/v1
kind: Job
metadata:
  name: migrate-{{ .Values.image.tag }}
  annotations:
    helm.sh/hook: pre-install,pre-upgrade
    helm.sh/hook-weight: "0"
spec:
  template:
    spec:
      containers:
        - name: migrate
          image: ghcr.io/me/myapi:{{ .Values.image.tag }}
          command: ["alembic", "upgrade", "head"]
          envFrom: [{ secretRef: { name: myapi-secrets } }]
      restartPolicy: OnFailure
  backoffLimit: 3
Lifespan (graceful shutdown of pools)
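import os
from contextlib import asynccontextmanager

import httpx
from fastapi import FastAPI
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

URL = os.environ["DATABASE_URL"]  # assumption: the async DB URL comes from the environment

@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.engine = create_async_engine(URL, pool_size=10, max_overflow=10, pool_pre_ping=True, pool_recycle=300)
    app.state.sm = async_sessionmaker(app.state.engine, expire_on_commit=False)
    app.state.http = httpx.AsyncClient(timeout=10)
    yield
    # Shutdown: close the HTTP client, then release pooled DB connections.
    await app.state.http.aclose()
    await app.state.engine.dispose()

app = FastAPI(lifespan=lifespan)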
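Because the lifespan runs once per process, every worker in every pod gets its own pool of up to pool_size + max_overflow connections. It is worth checking that total against the database's connection limit before raising replicas. A back-of-the-envelope sketch using the numbers from this cheatsheet; the function name is hypothetical, and the figure of 100 is PostgreSQL's default max_connections, used here only as an example limit.

```python
def max_db_connections(
    replicas: int,
    workers_per_pod: int = 1,
    pool_size: int = 10,
    max_overflow: int = 10,
) -> int:
    """Upper bound on connections opened by all API processes combined."""
    return replicas * workers_per_pod * (pool_size + max_overflow)


if __name__ == "__main__":
    print(max_db_connections(replicas=3))                      # 60
    print(max_db_connections(replicas=30))                     # 600
    print(max_db_connections(replicas=30, workers_per_pod=4))  # 2400
```

At the HPA's maxReplicas of 30 the bound is 600 connections, well past a limit of 100, so either the pool sizes shrink or a pooler such as PgBouncer sits in between.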
Behind a proxy (X-Forwarded-*)
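uvicorn ... --proxy-headers --forwarded-allow-ips='*'
Trusting '*' is only safe when the pod is reachable solely through the proxy; otherwise list the proxy's IPs.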
TLS
Terminate TLS at the load balancer or ingress. On K8s, use cert-manager with Let’s Encrypt to issue and renew certificates.
Connection limits
uvicorn ... --limit-concurrency 500 --backlog 1024
Production checklist
- Non-root USER in Dockerfile.
- Pinned image digest.
- Health/readiness/startup probes.
- Memory limit set (CPU usually no limit).
- Resource requests sized via Goldilocks/VPA data.
- HPA configured.
- Secrets from secret manager (ESO + Vault) or platform secret.
- TLS at ingress.
- Structured JSON logs to stdout.
- OTEL traces exported.
- Prometheus metrics exposed.
- Sentry configured.
- Pre-stop hook for clean rollout.
- Migration Job in pre-install/upgrade.
- Backups (DB).
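For the "Structured JSON logs to stdout" item, the standard library is enough to get started. A minimal sketch, assuming one JSON object per line is what the log collector expects; JsonFormatter, configure_logging and the field names (ts, level, logger, message) are illustrative choices, not a fixed schema.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level: int = logging.INFO) -> None:
    """Replace the root logger's handlers with one JSON handler on stdout."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers[:] = [handler]
    root.setLevel(level)


if __name__ == "__main__":
    configure_logging()
    logging.getLogger("myapi").info("started on port %d", 8000)
```

Calling configure_logging() once at startup, before the app object is created, keeps library loggers on the same format.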
Read this next
If you want my full FastAPI Helm chart, it’s at rajpoot.dev.