Observability has moved on from “three pillars: logs, metrics, traces.” The 2026 standard is wide events — every request emits one structured event with rich context — combined with SLOs as the unit of conversation. This post is the working knowledge.

Three pillars vs wide events

The three-pillars model assumes you know in advance what you’ll ask: counters for known dimensions, logs for narrative, traces for spans. Each signal is sampled or pre-aggregated differently, so a new question often needs new instrumentation.

A wide event is a single structured event per request:

{
  "timestamp": "2026-04-30T18:30:00Z",
  "trace_id": "abc",
  "service": "api",
  "endpoint": "/orders",
  "user_id": 42,
  "tenant_id": 7,
  "duration_ms": 124,
  "status": 200,
  "db_query_ms": 80,
  "cache_hit": true,
  "user_agent": "...",
  "feature_flags": ["new-checkout"],
  "error": null
}

One row per request. Persist all of them, or sample but keep a high fraction. Later you can query with WHERE and GROUP BY clauses you didn’t think of at instrumentation time.
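To make the "one row per request" idea concrete, here is a minimal sketch in plain Python. The names `wide_event` and `handle_order`, and the `sink` callable, are hypothetical and not part of any library; a real service would more likely attach these fields to an OTel span or a structured logger.

```python
import json
import time
from contextlib import contextmanager


@contextmanager
def wide_event(sink, **base_fields):
    """Collect fields for one request and emit a single event at the end."""
    event = dict(base_fields, error=None)
    start = time.monotonic()
    try:
        yield event  # handler code adds context as it learns it
    except Exception as exc:
        event["error"] = repr(exc)
        event["status"] = 500
        raise
    finally:
        event["duration_ms"] = round((time.monotonic() - start) * 1000, 3)
        sink(json.dumps(event))  # exactly one row per request


def handle_order(sink, user_id, tenant_id):
    # Hypothetical handler: every field added here becomes queryable later.
    with wide_event(sink, service="api", endpoint="/orders") as ev:
        ev["user_id"] = user_id
        ev["tenant_id"] = tenant_id
        ev["cache_hit"] = True
        ev["status"] = 200


lines = []
handle_order(lines.append, user_id=42, tenant_id=7)
print(lines[0])
```

The point of the design is that context accumulates in one place during the request, so adding a new dimension is one assignment rather than a new metric or log line.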

Honeycomb popularized this model. ClickHouse plus the OTel Collector lets you run it self-hosted.

SLO-driven observability

The conversation shifts from “what’s slow?” to “are we meeting our SLO?” See SLOs and Error Budgets.

A dashboard that shows:

  • Current SLO compliance.
  • Error budget remaining.
  • What’s burning the budget.
  • Trends over time.

…is more useful than 100 dashboards of arbitrary metrics.
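The first two items on that list reduce to simple arithmetic over request counts. The sketch below assumes a request-based SLO (good requests divided by total requests in a window); `slo_status` and `SloStatus` are hypothetical names, and the numbers in the usage line are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SloStatus:
    compliance: float        # fraction of good requests in the window
    budget_remaining: float  # fraction of the error budget left (negative = overspent)
    burn_rate: float         # 1.0 = spending budget exactly at the sustainable rate


def slo_status(total: int, bad: int, target: float) -> SloStatus:
    """Compute SLO compliance and error budget for one window."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    if total == 0:
        return SloStatus(1.0, 1.0, 0.0)
    allowed_bad = (1.0 - target) * total  # the error budget, in requests
    error_rate = bad / total
    return SloStatus(
        compliance=1.0 - error_rate,
        budget_remaining=1.0 - bad / allowed_bad,
        burn_rate=error_rate / (1.0 - target),
    )


# 99.9% target, 1M requests, 400 failures: roughly 60% of the budget is left.
status = slo_status(total=1_000_000, bad=400, target=0.999)
print(status)
```

With wide events, `total` and `bad` come from the same table you use for debugging, so "what's burning the budget" is the same query with a GROUP BY added.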

OpenTelemetry’s role

OTel is the instrumentation standard, and OTLP is its wire protocol. Wide events fit naturally as OTel spans and log records. The Collector batches, processes, and exports them.

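receivers:
  otlp:
    protocols: { grpc: {endpoint: 0.0.0.0:4317} }

processors:
  batch:
    timeout: 5s
  attributes/scrub:
    actions:
      # Drop raw SQL text so query literals never reach the backend.
      - { key: db.statement, action: delete }

exporters:
  otlphttp/honeycomb:
    endpoint: https://api.honeycomb.io
    headers: { x-honeycomb-team: "${env:TEAM}" }
  # The clickhouse exporter ships in the Collector contrib distribution.
  clickhouse:
    # 9000 is ClickHouse's native TCP port; use http://clickhouse:8123 for HTTP.
    endpoint: tcp://clickhouse:9000

# Components do nothing until a pipeline references them.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes/scrub, batch]
      exporters: [otlphttp/honeycomb, clickhouse]
    logs:
      receivers: [otlp]
      processors: [attributes/scrub, batch]
      exporters: [otlphttp/honeycomb, clickhouse]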

For deeper OTel, see OpenTelemetry End-to-End in 2026.

The investigation workflow

Old: page fires → look at metrics → look at logs → look at traces → guess.

New: page fires → SLO burning → query wide events for the slow tail → group by attributes → find the regression.

SELECT user_id, AVG(duration_ms) AS avg_ms, COUNT(*) AS errors
FROM events
WHERE status = 500 AND timestamp > now() - INTERVAL 15 MINUTE
GROUP BY user_id
ORDER BY errors DESC
LIMIT 20

Five minutes from page to diagnosis vs. an hour of dashboard hopping.
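The "group by attributes" step can also be automated: for each attribute value, compare its share of the bad events with its share of all events. The sketch below does this in memory over a list of event dicts; `suspicious_attributes` is a hypothetical helper, the sample data is invented, and in practice the same comparison would run as a query in the event store.

```python
from collections import Counter


def _hashable(value):
    # List-valued fields such as feature_flags are turned into tuples for counting.
    return tuple(value) if isinstance(value, list) else value


def suspicious_attributes(events, is_bad, keys, min_bad=5):
    """Rank (key, value) pairs by how over-represented they are among bad events."""
    bad = [e for e in events if is_bad(e)]
    if not bad:
        return []
    results = []
    for key in keys:
        all_counts = Counter(_hashable(e.get(key)) for e in events)
        bad_counts = Counter(_hashable(e.get(key)) for e in bad)
        for value, n_bad in bad_counts.items():
            if n_bad < min_bad:
                continue  # too few samples to say anything
            share_bad = n_bad / len(bad)
            share_all = all_counts[value] / len(events)
            results.append((key, value, round(share_bad / share_all, 2)))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Invented data: every slow request is a cache miss for tenant 7.
events = (
    [{"tenant_id": 7, "cache_hit": False, "duration_ms": 900}] * 10
    + [{"tenant_id": 7, "cache_hit": True, "duration_ms": 100}] * 10
    + [{"tenant_id": 3, "cache_hit": True, "duration_ms": 100}] * 80
)
ranked = suspicious_attributes(
    events, lambda e: e["duration_ms"] > 500, keys=["tenant_id", "cache_hit"]
)
print(ranked)
```

A ratio well above 1 means the value shows up in the slow tail far more often than in overall traffic, which is usually where to look first.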

Cardinality

The promise of wide events is high cardinality. Every user_id, every tenant_id, every feature flag — all queryable.

The cost: storage. ClickHouse handles billions of rows. Label-based metrics backends create one time series per distinct label combination, so a label like user_id quickly becomes unworkable.

Pick a backend that supports your cardinality.
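A quick back-of-the-envelope check shows why this matters. In a label-based metrics system, the worst-case number of time series for one metric is the product of the distinct values of each label. The sketch below uses made-up label counts and a hypothetical helper, `series_count`; real series counts are bounded by the combinations actually observed.

```python
from math import prod


def series_count(distinct_values: dict[str, int]) -> int:
    """Worst-case time series for one metric: product of distinct label values."""
    return prod(distinct_values.values())


# Illustrative numbers only.
low = series_count({"endpoint": 50, "status": 5})
high = series_count(
    {"endpoint": 50, "status": 5, "tenant_id": 2_000, "user_id": 100_000}
)
print(low)   # 250
print(high)  # 50000000000
```

A wide-event store sidesteps this because its row count grows with request volume, not with the number of distinct attribute values.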

Read this next

If you want a ClickHouse + OTel + Grafana wide-events stack, it’s at rajpoot.dev.

