Cheatsheet for health endpoints + lifespan.

Liveness vs readiness

Probe      | Question                  | On failure
Liveness   | Is the process alive?     | Restart the container
Readiness  | Can it serve traffic?     | Remove the pod from Service endpoints
Startup    | Has it finished starting? | Hold off the other probes until it passes

Liveness

@app.get("/healthz", include_in_schema=False)
async def health():
    return {"status": "ok"}

Keep it cheap and never let it depend on external systems: if the database is down, restarting the container will not fix it.

Readiness (with deps)

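import asyncio

from fastapi import Depends
from fastapi.responses import JSONResponse
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

@app.get("/ready", include_in_schema=False)
async def ready(db: AsyncSession = Depends(get_db)):
    try:
        # Bound the check so a hung connection cannot stall the probe.
        await asyncio.wait_for(db.execute(text("SELECT 1")), timeout=2)
    except Exception as e:
        # str(e) is empty for timeouts, so include the exception type.
        return JSONResponse({"status": "not ready", "err": f"{type(e).__name__}: {e}"}, status_code=503)
    return {"status": "ready"}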
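The timeout-to-503 mapping can be exercised without a database. This is a minimal sketch, assuming a hypothetical `ping` coroutine function standing in for `db.execute(text("SELECT 1"))`; `readiness`, `ok_ping`, `slow_ping`, and `broken_ping` are illustrative names, not part of FastAPI or SQLAlchemy.

```python
import asyncio


async def readiness(ping, timeout: float = 2.0) -> tuple[int, dict]:
    """Return (status_code, body) for a readiness probe.

    `ping` is a hypothetical zero-argument coroutine function that
    raises (or hangs) when the dependency is unavailable.
    """
    try:
        await asyncio.wait_for(ping(), timeout=timeout)
    except asyncio.TimeoutError:
        # str(TimeoutError()) is empty, so name the failure explicitly.
        return 503, {"status": "not ready", "err": "timeout"}
    except Exception as e:
        return 503, {"status": "not ready", "err": type(e).__name__}
    return 200, {"status": "ready"}


async def ok_ping() -> None:
    await asyncio.sleep(0)


async def slow_ping() -> None:
    await asyncio.sleep(10)  # simulates a hung connection


async def broken_ping() -> None:
    raise ConnectionError("db unreachable")
```

Returning only the exception type keeps connection strings and hostnames out of an unauthenticated response; the full error can go to the logs instead.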

Deep health (admin)

@app.get("/_/health", include_in_schema=False)
async def deep(db = Depends(get_db), redis = Depends(get_redis), _: User = Depends(get_admin)):
    out = {}
    out["db"] = await safe(check_db, db)
    out["redis"] = await safe(check_redis, redis)
    out["external"] = await safe(check_external)
    code = 200 if all(v["ok"] for v in out.values()) else 503
    return JSONResponse({"checks": out}, status_code=code)

async def safe(fn, *args):
    try:
        await fn(*args)
        return {"ok": True}
    except Exception as e:
        return {"ok": False, "err": str(e)[:200]}
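The checks above run one after another and have no time limit, so one hung dependency can stall the whole endpoint. A possible variant, sketched under assumptions: `safe` gains a per-check timeout and a hypothetical `run_checks` helper runs everything concurrently with `asyncio.gather`. The `check_ok`, `check_fail`, and `check_hang` functions are stand-ins for `check_db`, `check_redis`, and `check_external`, which the original leaves undefined.

```python
import asyncio


async def safe(fn, *args, timeout: float = 2.0) -> dict:
    """Run one check; convert exceptions and timeouts into a result dict."""
    try:
        await asyncio.wait_for(fn(*args), timeout=timeout)
        return {"ok": True}
    except asyncio.TimeoutError:
        return {"ok": False, "err": "timeout"}
    except Exception as e:
        return {"ok": False, "err": str(e)[:200]}


async def run_checks(checks: dict, timeout: float = 2.0) -> tuple[int, dict]:
    """Run all checks concurrently; `checks` maps name -> coroutine function."""
    results = await asyncio.gather(
        *(safe(fn, timeout=timeout) for fn in checks.values())
    )
    out = dict(zip(checks.keys(), results))
    code = 200 if all(v["ok"] for v in out.values()) else 503
    return code, out


# Hypothetical stand-ins for check_db / check_redis / check_external.
async def check_ok() -> None:
    await asyncio.sleep(0)


async def check_fail() -> None:
    raise RuntimeError("connection refused")


async def check_hang() -> None:
    await asyncio.sleep(10)
```

With this shape the endpoint's worst-case latency is roughly one timeout rather than the sum of all of them.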

Lifespan (resources)

from contextlib import asynccontextmanager

import httpx
from fastapi import FastAPI
from redis.asyncio import Redis
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

@asynccontextmanager
async def lifespan(app: FastAPI):
    # startup: create shared clients once and keep them on app.state
    app.state.engine = create_async_engine(URL, pool_size=20, pool_pre_ping=True, pool_recycle=300)
    app.state.sm = async_sessionmaker(app.state.engine, expire_on_commit=False)
    app.state.redis = await Redis.from_url(REDIS_URL)
    app.state.http = httpx.AsyncClient(timeout=10)
    log.info("startup_done")
    yield
    # shutdown: close in reverse order of creation
    log.info("shutdown_start")
    await app.state.http.aclose()
    await app.state.redis.aclose()  # aclose() needs redis-py 5+
    await app.state.engine.dispose()
    log.info("shutdown_done")

app = FastAPI(lifespan=lifespan)
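In the lifespan above, a startup step that raises leaves the earlier resources open, and a failing `aclose()` skips the ones after it. One way to make cleanup unconditional is `contextlib.AsyncExitStack`. The sketch below uses a hypothetical `FakeResource` in place of the engine, Redis client, and HTTP client; the extra `fail_last` parameter exists only for the demo, since a real FastAPI lifespan takes just `app`.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

closed: list[str] = []  # records close order for the demo


class FakeResource:
    """Hypothetical stand-in for an engine / Redis / HTTP client."""

    def __init__(self, name: str, fail_on_open: bool = False):
        if fail_on_open:
            raise ConnectionError(f"{name} unavailable")
        self.name = name

    async def aclose(self) -> None:
        closed.append(self.name)


@asynccontextmanager
async def lifespan(app, fail_last: bool = False):
    async with AsyncExitStack() as stack:
        # Register each cleanup as soon as its resource exists.
        engine = FakeResource("engine")
        stack.push_async_callback(engine.aclose)
        redis = FakeResource("redis")
        stack.push_async_callback(redis.aclose)
        http = FakeResource("http", fail_on_open=fail_last)
        stack.push_async_callback(http.aclose)
        yield
        # Leaving the stack runs the callbacks in reverse order.


async def run(fail_last: bool) -> None:
    async with lifespan(app=None, fail_last=fail_last):
        pass
```

On a clean run the resources close as http, redis, engine. When the third resource fails to open, the first two are still closed before the exception propagates.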

Lifespan errors

@asynccontextmanager
async def lifespan(app):
    try:
        # startup checks
        await check_db_connection()
    except Exception as e:
        log.error("startup_failed", err=str(e))
        raise        # uvicorn exits non-zero
    yield

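Fail fast: a process that exits at startup is restarted and flagged by the orchestrator, while one that starts half-initialized fails later on live traffic.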

Multiple lifespans

from contextlib import AsyncExitStack, asynccontextmanager

def combine_lifespans(*lifespans):
    @asynccontextmanager
    async def combined(app):
        async with AsyncExitStack() as stack:
            for ls in lifespans:
                await stack.enter_async_context(ls(app))
            yield
    return combined

app = FastAPI(lifespan=combine_lifespans(db_lifespan, redis_lifespan, telemetry_lifespan))
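Ordering matters when lifespans depend on each other: the exit stack starts them in the order given and stops them in reverse. A small sketch that makes the order visible, using a hypothetical `make_lifespan` factory that only records events:

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events: list[str] = []


def make_lifespan(name: str):
    """Build a hypothetical lifespan that records its start and stop."""
    @asynccontextmanager
    async def ls(app):
        events.append(f"{name}:start")
        yield
        events.append(f"{name}:stop")
    return ls


def combine_lifespans(*lifespans):
    @asynccontextmanager
    async def combined(app):
        async with AsyncExitStack() as stack:
            for ls in lifespans:
                await stack.enter_async_context(ls(app))
            yield
    return combined


async def demo() -> list[str]:
    events.clear()
    combined = combine_lifespans(make_lifespan("db"), make_lifespan("redis"))
    async with combined(app=None):
        events.append("serving")
    return list(events)
```

So list the lifespans that others depend on first: they come up first and go down last.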

Pre-stop hook (K8s)

lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 10"]   # let LB drain

Also set terminationGracePeriodSeconds: 60 on the pod spec; the preStop sleep counts against it.
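A quick sanity check of the timing budget, as a sketch: the grace period has to cover the preStop sleep plus the server's own drain time (30 s with the Uvicorn flag shown in the next section). The function name and the 5-second margin for lifespan cleanup are assumptions for illustration.

```python
def shutdown_budget_ok(
    grace_period_s: int,
    pre_stop_sleep_s: int,
    graceful_timeout_s: int,
    margin_s: int = 5,  # assumed allowance for lifespan cleanup
) -> bool:
    """True if preStop sleep + server drain + margin fit in the grace period."""
    return pre_stop_sleep_s + graceful_timeout_s + margin_s <= grace_period_s
```

With the values used here, 10 + 30 + 5 = 45 fits inside 60. If the grace period runs out first, the container is killed with SIGKILL and the lifespan shutdown may never run.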

Graceful Uvicorn shutdown

uvicorn ... --timeout-graceful-shutdown 30

Uvicorn:

  1. Stops accepting new connections.
  2. Waits up to 30s for in-flight requests to finish.
  3. Cancels stragglers.
  4. Runs lifespan shutdown.
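Steps 2 and 3 follow a general "wait with a deadline, then cancel" pattern. The sketch below illustrates that pattern with plain asyncio; it is not Uvicorn's implementation, and `drain` is a hypothetical helper.

```python
import asyncio


async def drain(tasks: set, timeout: float) -> tuple[int, int]:
    """Wait up to `timeout` seconds for tasks, then cancel the rest.

    Returns (finished, cancelled) counts.
    """
    if not tasks:
        return 0, 0
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    # Let cancelled tasks unwind so their finally-blocks run.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)


async def demo() -> tuple[int, int]:
    fast = asyncio.create_task(asyncio.sleep(0.01))  # finishes in time
    slow = asyncio.create_task(asyncio.sleep(10))    # straggler
    return await drain({fast, slow}, timeout=0.1)
```

Cancelled handlers see `asyncio.CancelledError`, so any cleanup they need belongs in `try`/`finally` blocks.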

Connection draining

For long requests / streams:

# In handler
async def gen():
    while not shutdown_event.is_set() and not await request.is_disconnected():
        yield f"data: ...\n\n"
        await asyncio.sleep(1)

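Set shutdown_event from a signal handler if you need finer control. Lifespan shutdown runs only after in-flight requests have finished, so it is too late to set the event there. Uvicorn installs its own signal handlers, so chain to the existing handler rather than replacing it.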
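A self-contained sketch of the same idea, without FastAPI: the generator stops as soon as the event is set, and it waits on the event instead of sleeping so it does not linger for a full interval. The `stream` and `demo` names are illustrative, and the demo sets the event directly where a signal handler would.

```python
import asyncio


async def stream(shutdown_event: asyncio.Event, interval: float = 1.0):
    """Yield SSE-style chunks until shutdown is requested."""
    n = 0
    while not shutdown_event.is_set():
        yield f"data: tick {n}\n\n"
        n += 1
        try:
            # Sleep, but wake immediately if shutdown is signalled.
            await asyncio.wait_for(shutdown_event.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass


async def demo() -> list[str]:
    shutdown_event = asyncio.Event()
    chunks: list[str] = []

    async def consume() -> None:
        async for chunk in stream(shutdown_event, interval=0.01):
            chunks.append(chunk)

    consumer = asyncio.create_task(consume())
    await asyncio.sleep(0.05)
    shutdown_event.set()  # stands in for the signal handler
    await asyncio.wait_for(consumer, timeout=1)
    return chunks
```

Because the stream ends on its own, the server can finish draining well before the graceful shutdown timeout cancels it.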

Read this next

If you want my K8s probes + lifespan starter, it's at rajpoot.dev.

