Cheatsheet for deploying a FastAPI + PostgreSQL app on Kubernetes with Helm.

Dockerfile

# syntax=docker/dockerfile:1.7
FROM python:3.13-slim AS builder
WORKDIR /app
RUN pip install uv
COPY pyproject.toml uv.lock ./
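# Only pyproject.toml and uv.lock are present here, so install dependencies without building the project itself.
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync --frozen --no-dev --no-install-project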

FROM python:3.13-slim
WORKDIR /app
COPY --from=builder /app/.venv ./.venv
COPY src ./src
COPY alembic.ini ./
COPY migrations ./migrations
ENV PATH=/app/.venv/bin:$PATH
USER 1000:1000
EXPOSE 8000
CMD ["uvicorn", "src.myapp.main:app", "--host", "0.0.0.0", "--port", "8000"]

The image includes alembic.ini and migrations/ so the same image can run migrations as well as the app.

Migration as K8s Job

apiVersion: batch/v1
kind: Job
metadata:
  name: migrate-{{ .Values.image.tag }}
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  template:
    spec:
      containers:
        - name: migrate
          image: ghcr.io/me/myapp:{{ .Values.image.tag }}
          command: ["alembic", "upgrade", "head"]
          envFrom:
            - secretRef: { name: myapp-secrets }
            - configMapRef: { name: myapp-config }
      restartPolicy: OnFailure
  backoffLimit: 3
  ttlSecondsAfterFinished: 300

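As a pre-install,pre-upgrade hook, the Job must complete before Helm rolls out the app pods. On a first install, hooks run before the chart's regular resources exist, so myapp-secrets and myapp-config must be created beforehand or as hooks with a lower weight.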

Deployment

apiVersion: apps/v1
kind: Deployment
metadata: { name: myapp }
spec:
  replicas: 3
  selector: { matchLabels: { app: myapp } }
  template:
    metadata: { labels: { app: myapp } }
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: api
          image: ghcr.io/me/myapp:{{ .Values.image.tag }}
          ports: [{ containerPort: 8000 }]
          envFrom:
            - secretRef: { name: myapp-secrets }
            - configMapRef: { name: myapp-config }
          resources:
            requests: { cpu: "200m", memory: "256Mi" }
            limits: { memory: "512Mi" }
          startupProbe:
            httpGet: { path: /healthz, port: 8000 }
            periodSeconds: 5
            failureThreshold: 30
          readinessProbe:
            httpGet: { path: /ready, port: 8000 }
            periodSeconds: 5
          livenessProbe:
            httpGet: { path: /healthz, port: 8000 }
            periodSeconds: 10
          lifecycle:
            preStop:
              exec: { command: ["sh", "-c", "sleep 10"] }
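
The probe settings above imply rough time budgets. A minimal sketch of the arithmetic, assuming the Kubernetes default failureThreshold of 3 wherever the manifest does not set one, and ignoring probe timeouts and initial delays (failure_window is a hypothetical helper, not a Kubernetes API):

```python
def failure_window(period_seconds: int, failure_threshold: int = 3) -> int:
    """Seconds of consecutive failed probes before Kubernetes acts."""
    return period_seconds * failure_threshold


# startupProbe: periodSeconds 5, failureThreshold 30
startup_budget = failure_window(5, 30)
# livenessProbe: periodSeconds 10, failureThreshold left at the default (3)
liveness_window = failure_window(10)
# readinessProbe: periodSeconds 5, failureThreshold left at the default (3)
readiness_window = failure_window(5)

print(startup_budget, liveness_window, readiness_window)  # 150 30 15
```

So the app has about 150 s to start, a hung process is restarted after about 30 s, and a pod that loses the DB stops receiving traffic after about 15 s.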

Service

apiVersion: v1
kind: Service
metadata: { name: myapp }
spec:
  selector: { app: myapp }
  ports: [{ port: 80, targetPort: 8000 }]

Secrets

apiVersion: v1
kind: Secret
metadata: { name: myapp-secrets }
stringData:
  MYAPP_DATABASE_URL: postgresql+asyncpg://...
  MYAPP_SECRET_KEY: ...

Or use External Secrets Operator + Vault / AWS Secrets Manager.

ConfigMap

apiVersion: v1
kind: ConfigMap
metadata: { name: myapp-config }
data:
  MYAPP_ENV: prod
  MYAPP_LOG_LEVEL: INFO

HPA

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: { name: myapp }
spec:
  scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: myapp }
  minReplicas: 3
  maxReplicas: 30
  metrics:
    - type: Resource
      resource: { name: cpu, target: { type: Utilization, averageUtilization: 70 } }
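
To see what this HPA would do, here is a sketch of the scaling rule from the Kubernetes docs: desired = ceil(current × currentUtilization / targetUtilization), clamped to min/max, with no change while the ratio stays within the default 10% tolerance. Utilization is measured against the CPU request (200m above). The function name and defaults are illustrative, and real controllers also apply stabilization windows that this ignores.

```python
import math


def desired_replicas(current: int, current_util: float, target_util: float = 70,
                     min_replicas: int = 3, max_replicas: int = 30,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA replica calculation for a single CPU metric."""
    ratio = current_util / target_util
    # Within tolerance of the target: the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(3, 140))   # 6: double the target utilization doubles replicas
print(desired_replicas(3, 72))    # 3: inside the 10% tolerance band
print(desired_replicas(20, 140))  # 30: clamped to maxReplicas
```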

Lifespan

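from contextlib import asynccontextmanager
from fastapi import FastAPI
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

@asynccontextmanager
async def lifespan(app: FastAPI):
    s = get_settings()  # the app's own settings loader
    # Add pool options here; see "DB pool sizing".
    app.state.engine = create_async_engine(s.database_url)
    app.state.sm = async_sessionmaker(app.state.engine, expire_on_commit=False)
    yield  # the app serves requests while suspended here
    await app.state.engine.dispose()  # shutdown: close pooled connections

app = FastAPI(lifespan=lifespan)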

SIGTERM → ASGI server stops accepting → existing requests finish → lifespan shutdown → exit.
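
The ordering can be checked without a server. This stdlib-only sketch drives a lifespan-style context manager by hand, roughly the way an ASGI server does; the events list and the serve function are stand-ins for illustration, not FastAPI or uvicorn APIs.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []


@asynccontextmanager
async def lifespan(app):
    events.append("startup")       # e.g. create the engine
    try:
        yield
    finally:
        events.append("shutdown")  # e.g. dispose the engine


async def serve() -> None:
    # Stand-in for the ASGI server: enter lifespan, serve, then exit on SIGTERM.
    async with lifespan(app=None):
        events.append("serving")


asyncio.run(serve())
print(events)  # ['startup', 'serving', 'shutdown']
```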

Graceful shutdown gotchas

  • terminationGracePeriodSeconds must exceed lifecycle preStop sleep + longest request.
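  • The 10-second preStop sleep keeps the pod serving while its removal from the Service endpoints propagates to kube-proxy and the ingress, so new requests stop arriving before the server shuts down.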
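
The first rule is simple arithmetic, since the grace-period countdown includes the time spent in the preStop hook. A small sketch using the values from the Deployment above (60 s grace, 10 s sleep); both helper names are hypothetical:

```python
def request_budget(grace_seconds: int, pre_stop_seconds: int) -> int:
    """Seconds left for in-flight requests after the preStop sleep."""
    return grace_seconds - pre_stop_seconds


def grace_period_ok(grace_seconds: int, pre_stop_seconds: int,
                    longest_request_seconds: int) -> bool:
    """True if the pod can drain its slowest request before SIGKILL."""
    return grace_seconds > pre_stop_seconds + longest_request_seconds


print(request_budget(60, 10))       # 50
print(grace_period_ok(60, 10, 30))  # True
print(grace_period_ok(60, 10, 55))  # False: a 55 s request would be killed
```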

Health endpoints

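from fastapi import Depends
from fastapi.responses import JSONResponse
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

@app.get("/healthz", include_in_schema=False)
async def healthz():
    # Liveness and startup: no dependencies, only proves the process responds.
    return {"status": "ok"}

@app.get("/ready", include_in_schema=False)
async def ready(db: AsyncSession = Depends(get_db)):
    # Readiness: a 503 takes the pod out of the Service endpoints.
    try:
        await db.execute(text("SELECT 1"))
    except Exception:
        return JSONResponse({"status": "not ready"}, status_code=503)
    return {"status": "ready"}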

Liveness: process up. Readiness: DB reachable.
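
For a quick local check of both endpoints, a stdlib-only sketch that mimics how the kubelet judges an httpGet probe: any status from 200 to 399 passes, anything else (or no response) fails. The function names and the localhost URLs in the usage note are assumptions for illustration.

```python
import urllib.error
import urllib.request


def probe_succeeds(status: int) -> bool:
    """Kubelet rule for httpGet probes: 200 <= status < 400 is a pass."""
    return 200 <= status < 400


def http_probe(url: str, timeout: float = 1.0) -> bool:
    """Return True if the endpoint would pass an httpGet probe."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return probe_succeeds(resp.status)
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses (such as the 503 from /ready) land here.
        return probe_succeeds(exc.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False
```

With the app running locally, http_probe("http://localhost:8000/healthz") should be True, and http_probe("http://localhost:8000/ready") should turn False when the database is stopped.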

Single uvicorn worker per container

CMD ["uvicorn", "src.myapp.main:app", "--host", "0.0.0.0", "--port", "8000"]

K8s scales horizontally (replicas). Don't pass --workers 4 inside a container; let K8s manage scale.

TLS

Terminate at ingress (ALB / nginx / Cloudflare). App speaks HTTP internally.

Behind a proxy

# main.py
from fastapi.middleware.trustedhost import TrustedHostMiddleware
app.add_middleware(TrustedHostMiddleware, allowed_hosts=["myapp.com", "*.myapp.com"])

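TrustedHostMiddleware only validates the Host header. To have uvicorn read X-Forwarded-* from the ingress, also run it with --proxy-headers --forwarded-allow-ips='*'; use '*' only when the pods are reachable solely through the proxy.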

DB pool sizing

engine = create_async_engine(
    url, pool_size=20, max_overflow=10,
    pool_pre_ping=True, pool_recycle=3600,
)
total_connections = replicas × (pool_size + max_overflow)
                  ≤ db_max_connections (with margin for admin + replication)

Put PgBouncer in transaction pooling mode in front of PostgreSQL when you need more client connections than the database allows.
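
Plugging the numbers from this cheatsheet into the formula shows why the margin matters. A sketch, assuming PostgreSQL's default max_connections of 100 and a reserve of 10 connections for admin and replication (the reserve and both helper names are illustrative assumptions):

```python
def total_connections(replicas: int, pool_size: int = 20,
                      max_overflow: int = 10) -> int:
    """Worst-case connections opened by all replicas together."""
    return replicas * (pool_size + max_overflow)


def max_pool_per_replica(db_max_connections: int, replicas: int,
                         reserved: int = 10) -> int:
    """Largest pool_size + max_overflow each replica can afford."""
    return (db_max_connections - reserved) // replicas


print(total_connections(3))           # 90: fits under 100, barely
print(total_connections(30))          # 900: the HPA maximum blows past it
print(max_pool_per_replica(100, 30))  # 3 connections per replica at full scale
```

Size the pool against maxReplicas from the HPA, not the initial replica count.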

Common mistakes

  • Migration Job missing the pre-install,pre-upgrade hook: it runs alongside or after the new app pods, which fail against the old schema.
  • terminationGracePeriodSeconds too short: in-flight requests are killed.
  • Missing pool_pre_ping: stale connections after a DB restart.
  • No PodDisruptionBudget: voluntary disruptions (node drains, cluster upgrades) can evict all replicas at once.
  • Plaintext secrets committed in YAML: use sealed-secrets or ESO.

Read this next

If you want my Helm chart for the whole stack, it’s at rajpoot.dev.

