Logging is the difference between “we’ll figure it out from production behavior” and “I can’t debug this.” Most teams half-do it. This post is the working set for Python in 2026.

Why structured logging

Unstructured:

INFO:checkout:Processing order 1234 for user 567 amount $99.50

Structured:

{"level":"info","logger":"checkout","msg":"order_processed","order_id":1234,"user_id":567,"amount_cents":9950}

The structured form is queryable. “Show all orders > $100 from user 567 yesterday” is a Loki / Datadog filter, not a regex.
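
To make "queryable" concrete, here is a minimal sketch of that filter in plain Python over JSON lines. The sample records and the orders_over helper are made up for illustration; in practice your log backend runs this query for you.

```python
import json

# Hypothetical sample: one JSON object per line, as a log shipper would see it.
LINES = [
    '{"level":"info","logger":"checkout","msg":"order_processed","order_id":1234,"user_id":567,"amount_cents":9950}',
    '{"level":"info","logger":"checkout","msg":"order_processed","order_id":1235,"user_id":567,"amount_cents":12500}',
    '{"level":"info","logger":"checkout","msg":"order_processed","order_id":1236,"user_id":890,"amount_cents":30000}',
]


def orders_over(lines, user_id, min_cents):
    """Return order_ids for one user whose amount exceeds min_cents."""
    records = (json.loads(line) for line in lines)
    return [
        r["order_id"]
        for r in records
        if r.get("msg") == "order_processed"
        and r.get("user_id") == user_id
        and r.get("amount_cents", 0) > min_cents
    ]


matches = orders_over(LINES, user_id=567, min_cents=10_000)
```

Every condition is a field comparison. Doing the same against the unstructured line means parsing "$99.50" out of free text first.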

structlog setup

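import logging
import os
import sys

import structlog

# Assumption: an APP_ENV environment variable marks production; use whatever flag your app already has.
PROD = os.environ.get("APP_ENV") == "production"

# structlog renders the full line, so the stdlib handler only passes it through.
logging.basicConfig(format="%(message)s", stream=sys.stdout, level=logging.INFO)

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.stdlib.add_logger_name,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.JSONRenderer() if PROD else structlog.dev.ConsoleRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    # Route through stdlib loggers so basicConfig applies and loggers have names.
    logger_factory=structlog.stdlib.LoggerFactory(),
    cache_logger_on_first_use=True,
)

log = structlog.get_logger()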

JSON in prod; pretty colorful console in dev.

Logging context

log.info("order_created", order_id=order.id, amount=order.amount, user_id=order.user_id)

Each call: an event name + key-value attrs. Search by event name; filter by attrs.

Correlation via contextvars

import structlog
import uuid

# Middleware
async def request_id_middleware(request, call_next):
    rid = request.headers.get("x-request-id") or str(uuid.uuid4())
    structlog.contextvars.bind_contextvars(request_id=rid, path=request.url.path)
    try:
        return await call_next(request)
    finally:
        structlog.contextvars.clear_contextvars()

Now every log line in this request automatically carries request_id and path. No manual passing.

# Anywhere downstream, no plumbing
log.info("db_query_slow", duration_ms=520)
# JSON output includes request_id automatically

Trace correlation

If using OTEL:

from opentelemetry import trace

def add_trace_ids(_, __, event_dict):
    span = trace.get_current_span()
    ctx = span.get_span_context()
    if ctx.trace_id:
        event_dict["trace_id"] = format(ctx.trace_id, "032x")
        event_dict["span_id"] = format(ctx.span_id, "016x")
    return event_dict

# Add add_trace_ids to the processors list, before the renderer

Logs and traces correlated. Click from log → trace.

Log levels

Level      Use for
DEBUG      Verbose dev info; off in prod
INFO       Significant events; “what happened”
WARNING    Recoverable issues; degraded behavior
ERROR      Errors that affected this request/op
CRITICAL   System-wide problems

Most apps over-use INFO and under-use DEBUG. Be deliberate.
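
One way to keep DEBUG available without shipping it to prod is to read the level from the environment. A small sketch; the LOG_LEVEL variable name and the level_from_env helper are assumptions, not something structlog defines.

```python
import logging
import os


def level_from_env(default: str = "INFO") -> int:
    """Map the LOG_LEVEL environment variable to a logging constant."""
    name = os.environ.get("LOG_LEVEL", default).upper()
    level = getattr(logging, name, None)
    # Unknown or misspelled names fall back to INFO rather than crashing at startup.
    return level if isinstance(level, int) else logging.INFO
```

The result can be passed to structlog.make_filtering_bound_logger(level_from_env()) in place of a hard-coded logging.INFO.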

Errors with context

try:
    process_order(order_id)
except Exception:
    log.exception("order_processing_failed", order_id=order_id)
    raise

log.exception includes traceback + event. Don’t swallow exceptions silently.

PII redaction

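import re

EMAIL = re.compile(r"[\w\.-]+@[\w\.-]+")
PHONE = re.compile(r"\+?\d[\d\s\-\(\)]{7,}")

# Machine-generated fields that PHONE would otherwise mangle (ISO dates, long digit runs in IDs).
SKIP_KEYS = {"timestamp", "request_id", "trace_id", "span_id"}

def redact(_, __, event_dict):
    for k, v in list(event_dict.items()):
        if k in SKIP_KEYS or not isinstance(v, str):
            continue
        v = EMAIL.sub("[EMAIL]", v)
        v = PHONE.sub("[PHONE]", v)
        event_dict[k] = v
    return event_dict

Add as a processor before JSONRenderer. Cheap; catches accidental PII in log messages. The PHONE pattern is deliberately loose and also matches date-like strings such as 2026-01-15, which is why the machine-generated keys are skipped.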

For per-field redaction:

log.info("user_signup", user_id=u.id, email_hash=hash_email(u.email))
# Don't log raw email
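
hash_email is not defined in this post; one possible sketch uses a keyed HMAC from the standard library. The HASH_KEY placeholder and the 16-character truncation are assumptions for illustration. A keyed hash matters because plain, unkeyed hashes of emails can be reversed by hashing a list of known addresses.

```python
import hashlib
import hmac

# Assumption: the key comes from your secret store; this literal is a placeholder.
HASH_KEY = b"replace-with-a-real-secret"


def hash_email(email: str) -> str:
    """Return a stable, keyed pseudonym for an email address."""
    normalized = email.strip().lower().encode("utf-8")
    digest = hmac.new(HASH_KEY, normalized, hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability in log lines
```

The same address always maps to the same value, so you can still group log lines by user without storing the address itself.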

Log volume

Production logs add up:

  • DEBUG off in prod.
  • Sample noisy events: if random.random() < 0.01: log.info(...).
  • Aggregate: a counter metric instead of one-log-per-event for high-volume signals.
  • Rotate / retain intelligently (30d hot, 90d cold).

With Loki / Grafana, cost scales with cardinality. Keep high-cardinality attributes such as user_id or request_id in the log body rather than promoting them to labels.

Logger per module

log = structlog.get_logger(__name__)

With structlog.stdlib.LoggerFactory() and the structlog.stdlib.add_logger_name processor configured, logs include logger=mymodule. Filter / route by logger name.

Async-safe

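structlog + contextvars is asyncio-safe. Each task starts from a copy of the context that was current when it was spawned, so values bound inside one task don’t leak into its siblings.

import anyio

async def handle_all(users):
    async with anyio.create_task_group() as tg:
        for u in users:
            # each task starts from a copy of the current context
            tg.start_soon(handle_user, u)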

Common mistakes

1. f-string log messages

log.info(f"Processing order {order.id}")  # not searchable as structured event

vs

log.info("processing_order", order_id=order.id)

The latter is queryable; the former isn’t.

2. Inconsistent event names

order_processed, OrderProcessed, order processed. Pick a convention (snake_case is common).
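
A convention sticks better when something checks it. A sketch of a processor that marks events whose name is not snake_case; the bad_event_name field is a made-up marker, and you might prefer to raise in tests instead.

```python
import re

# Convention assumed here: lowercase snake_case, e.g. "order_processed".
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def flag_bad_event_names(_, __, event_dict):
    """structlog-style processor: mark events whose name breaks the convention."""
    name = str(event_dict.get("event", ""))
    if SNAKE_CASE.fullmatch(name) is None:
        event_dict["bad_event_name"] = True  # hypothetical marker field
    return event_dict
```

Querying for bad_event_name=true in dev or staging surfaces stragglers before they reach dashboards.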

3. PII in logs

Email, phone, full names. Even if your team is small now, GDPR doesn’t care about size.

4. Logging in tight loops

One log call per iteration can mean 100k log lines per request. Sample or aggregate.
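
A sketch of the aggregate option: count outcomes inside the loop and emit one summary event at the end. process_items, the outcome labels, and the log_info parameter (which would be log.info in real code) are illustrative assumptions.

```python
from collections import Counter


def process_items(items, log_info):
    """Process items, emitting one summary event instead of one per item."""
    outcomes = Counter()
    for item in items:
        # Stand-in for real work: negative values count as rejected.
        outcomes["ok" if item >= 0 else "rejected"] += 1
    # One log line regardless of how many items were processed.
    log_info("items_processed", total=len(items), **outcomes)
    return outcomes
```

Called as process_items(batch, log.info), this produces a single items_processed event with total, ok, and rejected counts.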

5. No timestamp

Logs without timestamps are useless when correlating across services. Always include one.

What I’d ship today

For new Python apps:

  • structlog with JSON in prod, console in dev.
  • contextvars for request_id, user_id, trace_id.
  • OTEL trace correlation if tracing.
  • PII redaction processor.
  • Loki + Grafana for storage / query (or Datadog).
  • Alert on error log rate spikes.
  • Documentation of standard event names.

Read this next

If you want my structlog + FastAPI + OTEL starter, it’s at rajpoot.dev.

