Observability

Integration Cheatsheet 10 — Observability Stack

Cheatsheet: structlog with trace IDs, OTEL auto-instrumentation, Prometheus, slow-query log.

FastAPI Cheatsheet 10 — Observability: structlog, OTEL, Prometheus

Cheatsheet: structlog with contextvars, OTEL auto-instrumentation, custom spans, Prometheus middleware, /metrics.

FastAPI Textbook Ch. 11 — Observability: Logging, Tracing, and Metrics

Production observability: structlog with contextvars, OTEL tracing, Prometheus metrics, log/trace correlation, and the patterns that pay off.

Kubernetes Cheatsheet 13 — Observability

Cheatsheet: kube-prometheus-stack, Loki, traces, alerts.

Observability Cost Control in 2026 — Cardinality, Sampling, and the Bills That Surprise You

Practical observability cost cuts: cardinality discipline, log sampling, trace tail-sampling, retention tiers, and self-hosting tradeoffs.

Distributed Tracing in 2026 — OpenTelemetry, Trace Context, and What Actually Helps Debugging

Practical distributed tracing: OTEL setup, span design, context propagation across services, head/tail sampling, and the operational realities.

LLM Observability in 2026 — Tracing, Evals, and the Things You Can't Skip

Practical LLM observability: tracing every call, eval harnesses, regression detection, prompt versioning, and how to debug the model in production.

Log Aggregation in 2026 — Loki, ClickHouse, OpenSearch, or Datadog

Picking a log aggregator in 2026: Loki for cheap storage, ClickHouse for query power, OpenSearch for full-text, Datadog when you can pay. Decision matrix and patterns.

Observability 2.0 — SLOs, Wide Events, and the End of Three Pillars

What changed in observability since 2020: wide events vs. the three pillars, SLOs as the unit of conversation, OTel’s role, and how to actually find problems in production.

LLM Observability in 2026 — LangSmith, Langfuse, Helicone, and OpenTelemetry

What to track in LLM apps, the tooling landscape (LangSmith, Langfuse, Helicone, Phoenix), the OTel GenAI conventions, and the metrics-and-traces playbook for production AI.