Distributed tracing is the difference between guessing and knowing. When a request slows down, you want one view that shows where the time went across every service it touched. OpenTelemetry (OTEL) made that view a vendor-neutral standard. This post is the working playbook.

What you get

Trace: GET /checkout
├─ http server: 1.2s
│  ├─ db: SELECT user      40ms
│  ├─ http client: GET /inventory  300ms
│  ├─ http client: POST /payment   600ms
│  │  └─ http server: POST /payment 580ms
│  │     └─ stripe.PaymentIntent.create  450ms
│  └─ db: INSERT order      80ms

One span per operation. Time spent visible. Causes obvious. No more “the API is slow somewhere.”
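To make "time spent visible" concrete, here is a minimal sketch that rebuilds the trace above as plain Python objects and ranks spans by self time (a span's duration minus its children's). The `Span` class and helper names are made up for illustration, not an OTEL API, and the arithmetic assumes children run one after another.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical stand-in for a finished span; not an OTEL type.
    name: str
    duration_ms: int
    children: list[Span] = field(default_factory=list)


def self_time_ms(span: Span) -> int:
    """Time spent in the span itself, assuming sequential children."""
    return span.duration_ms - sum(c.duration_ms for c in span.children)


def walk(span: Span):
    """Yield the span and every descendant, depth first."""
    yield span
    for child in span.children:
        yield from walk(child)


# The GET /checkout trace from the diagram above.
stripe = Span("stripe.PaymentIntent.create", 450)
payment_server = Span("http server: POST /payment", 580, [stripe])
root = Span("http server: GET /checkout", 1200, [
    Span("db: SELECT user", 40),
    Span("http client: GET /inventory", 300),
    Span("http client: POST /payment", 600, [payment_server]),
    Span("db: INSERT order", 80),
])

ranked = sorted(walk(root), key=self_time_ms, reverse=True)
for s in ranked[:3]:
    print(f"{s.name}: {self_time_ms(s)} ms")
# stripe.PaymentIntent.create: 450 ms
# http client: GET /inventory: 300 ms
# http server: GET /checkout: 180 ms
```

The Stripe call and the inventory lookup account for most of the 1.2s; the checkout handler itself only burns 180ms.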

OpenTelemetry setup

# Python
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
from opentelemetry.instrumentation.asyncpg import AsyncPGInstrumentor

app = FastAPI()

# Set OTEL_SERVICE_NAME in the environment, or spans show up as unknown_service
provider = TracerProvider()
# OTLPSpanExporter() sends gRPC to localhost:4317 unless OTEL_EXPORTER_OTLP_ENDPOINT says otherwise
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

# Each instrumentor patches one library
FastAPIInstrumentor.instrument_app(app)
HTTPXClientInstrumentor().instrument()
AsyncPGInstrumentor().instrument()

Auto-instrumentation covers HTTP, DB, and queue clients, one instrumentor package per library. Add custom spans for business operations.

Custom spans

tracer = trace.get_tracer(__name__)

async def checkout(user_id):
    with tracer.start_as_current_span("checkout") as span:
        span.set_attribute("user.id", str(user_id))
        
        with tracer.start_as_current_span("validate_cart"):
            cart = await load_cart(user_id)
        
        with tracer.start_as_current_span("charge"):
            payment = await charge_user(user_id, cart.total)
            span.set_attribute("payment.id", payment.id)
        
        return payment

Each with block is a span. Attributes attach context. An exception that escapes a with block is recorded on that span automatically, and the span is marked as an error.

Context propagation

For tracing to span services, context must travel:

Service A: span starts → traceparent header set on outgoing HTTP
Service B: traceparent header → continues the trace
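The traceparent header in the diagram is a W3C Trace Context value: version, trace ID, parent span ID, and flags, all lowercase hex, joined by dashes. The sketch below parses and formats that shape by hand so you can see what travels on the wire. It only handles the version-00 layout, and the function names are mine; in a real service the OTEL propagator does this for you.

```python
import re
from typing import NamedTuple, Optional


class TraceParent(NamedTuple):
    trace_id: str        # 32 hex chars, shared by every span in the trace
    parent_span_id: str  # 16 hex chars, the caller's span
    sampled: bool        # lowest bit of the flags byte


_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the spec.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return TraceParent(trace_id, span_id, bool(int(flags, 16) & 0x01))


def format_traceparent(trace_id: str, span_id: str, sampled: bool) -> str:
    """Build a version-00 traceparent value."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
parsed = parse_traceparent(incoming)
print(parsed)
```

Service B keeps the trace ID, records the incoming span ID as its parent, and puts its own span ID into the header on any calls it makes in turn.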

OTEL handles this automatically for instrumented HTTP/gRPC clients. For custom transports (Kafka, queues): inject and extract manually.

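# Inject (sender side)
from opentelemetry.propagate import inject
headers = {}
inject(headers)  # writes the traceparent entry for the current span
# Some clients want headers as a list of (str, bytes) pairs; convert if yours does
producer.send(topic, payload, headers=headers)

# Extract (receiver side)
from opentelemetry.propagate import extract
ctx = extract(headers)  # headers: the mapping read back off the received message
with tracer.start_as_current_span("process_message", context=ctx):
    handle(payload)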

Without propagation, you get disconnected per-service traces — much less useful.
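One practical wrinkle with manual propagation: inject() fills a plain str-to-str dict, but some queue clients carry headers as a list of (str, bytes) pairs (kafka-python works this way, as far as I know; check your client). A small pair of converters, with names I made up, bridges the two shapes:

```python
from __future__ import annotations


def to_header_tuples(carrier: dict[str, str]) -> list[tuple[str, bytes]]:
    """Turn an injected carrier dict into (key, bytes) header pairs."""
    return [(key, value.encode("utf-8")) for key, value in carrier.items()]


def from_header_tuples(headers) -> dict[str, str]:
    """Turn received (key, bytes) pairs back into a dict that extract() can read."""
    return {
        key: value.decode("utf-8")
        for key, value in (headers or [])
        if value is not None
    }


carrier = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
wire = to_header_tuples(carrier)
print(wire)
print(from_header_tuples(wire) == carrier)
```

On the sender, convert right after inject(); on the receiver, convert right before extract().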

Sampling

Tracing every request is expensive. Sample.

Head sampling

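from opentelemetry.sdk.trace.sampling import TraceIdRatioBased
sampler = TraceIdRatioBased(0.1)  # 10% of traces; pass as TracerProvider(sampler=sampler)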

Decide at the start. Cheap; can miss interesting events (errors).
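Why is head sampling cheap? The decision can be a pure function of the trace ID, so every service reaches the same verdict without coordinating. Here is a rough sketch of that idea: compare the low 64 bits of the trace ID against a threshold derived from the ratio. This illustrates the technique and is not a copy of any SDK's sampler; `should_sample` is a name I picked.

```python
import random

TRACE_ID_MASK = (1 << 64) - 1  # look only at the low 64 bits


def should_sample(trace_id: int, ratio: float) -> bool:
    """Deterministic keep/drop decision based on the trace ID alone."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    bound = round(ratio * (1 << 64))
    return (trace_id & TRACE_ID_MASK) < bound


# Simulate 100k random 128-bit trace IDs and count how many survive.
rng = random.Random(42)
trace_ids = [rng.getrandbits(128) for _ in range(100_000)]
kept = sum(should_sample(t, 0.1) for t in trace_ids)
print(f"kept {kept / len(trace_ids):.1%} of traces")
```

Note what the function never sees: status or duration. That is exactly why a head sampler drops 90% of your errors along with everything else.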

Tail sampling (via OTEL Collector)

processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: errors
        type: status_code
        status_code: { status_codes: [ERROR] }
      - name: slow
        type: latency
        latency: { threshold_ms: 1000 }
      - name: random_10pct
        type: probabilistic
        probabilistic: { sampling_percentage: 10 }

Collector buffers spans, decides at end. Keeps all error traces, all slow traces, 10% of normal. Best of both.
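The decision logic behind that config, sketched in plain Python: wait until the trace is complete, then keep it if any policy matches. This is an illustration of the policy order above, not the Collector's actual code; `FinishedSpan`, `keep_trace`, and the hash-based 10% bucket are my own stand-ins.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class FinishedSpan:
    # Hypothetical minimal span record; real spans carry much more.
    trace_id: str
    start_ms: int
    end_ms: int
    is_error: bool = False


def keep_trace(
    spans: list[FinishedSpan],
    latency_threshold_ms: int = 1000,
    sampling_percentage: float = 10.0,
) -> Optional[str]:
    """Return the name of the first matching policy, or None to drop the trace."""
    if any(s.is_error for s in spans):
        return "errors"
    duration = max(s.end_ms for s in spans) - min(s.start_ms for s in spans)
    if duration > latency_threshold_ms:
        return "slow"
    # Stable pseudo-random bucket in [0, 10000) derived from the trace ID.
    digest = hashlib.sha256(spans[0].trace_id.encode("utf-8")).hexdigest()
    if int(digest, 16) % 10_000 < sampling_percentage * 100:
        return "random_10pct"
    return None


failed = [FinishedSpan("t-err", 0, 120, is_error=True)]
slow = [FinishedSpan("t-slow", 0, 400), FinishedSpan("t-slow", 400, 1500)]
print(keep_trace(failed))  # errors
print(keep_trace(slow))    # slow
```

The cost is memory and delay: every span has to be held for decision_wait before anything is exported.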

Useful attributes

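Standard OTEL attributes (these are the older semantic-convention names; current releases rename several, e.g. http.method to http.request.method and db.statement to db.query.text, so use whatever your instrumentation version emits):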

  • http.method, http.status_code, http.url
  • db.system, db.statement, db.name
  • messaging.system, messaging.destination

App-specific:

  • user.id
  • tenant.id
  • feature (e.g., “checkout”, “search”)
  • experiment.variant

span.set_attribute("user.id", str(user_id))
span.set_attribute("feature", "checkout")

These attributes make traces queryable: “show me checkout traces for tenant X with errors.”

Errors

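from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

span = trace.get_current_span()
try:
    result = await operation()
except Exception as e:
    span.record_exception(e)  # attaches the exception as a span event
    span.set_status(Status(StatusCode.ERROR, str(e)))
    raise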

Auto-instrumentation does this for HTTP / DB. In business logic, do it explicitly wherever you catch an exception inside a span, especially if you handle it without re-raising.

Storage

Backend          Strengths
Tempo (Grafana)  OSS; cheap object-storage backend
Jaeger           OSS; mature; standalone
Datadog APM      SaaS; full APM features
Honeycomb        SaaS; high-cardinality friendly
New Relic        SaaS; APM

For self-host: Tempo + Grafana. For SaaS: Honeycomb is excellent for ad-hoc trace querying.

Cost

Traces are heavy: a typical request emits 5-20 spans, each 1-5 KB. At 10k req/sec that is 50 MB to 1 GB per second, or roughly 4 to 86 TB/day before sampling and compression. Sampling matters.
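The arithmetic is worth running yourself with your own numbers. A back-of-the-envelope sketch, assuming 1 KB = 1000 bytes, no compression, and the span counts and sizes quoted above (`daily_trace_bytes` is just a helper name for this post):

```python
def daily_trace_bytes(
    req_per_sec: int,
    spans_per_req: int,
    bytes_per_span: int,
    sample_ratio: float = 1.0,
) -> float:
    """Raw span bytes produced per day at a given sampling ratio."""
    seconds_per_day = 86_400
    return req_per_sec * seconds_per_day * spans_per_req * bytes_per_span * sample_ratio


TB = 1e12

low = daily_trace_bytes(10_000, spans_per_req=5, bytes_per_span=1_000)
high = daily_trace_bytes(10_000, spans_per_req=20, bytes_per_span=5_000)
print(f"unsampled: {low / TB:.2f} to {high / TB:.1f} TB/day")
# unsampled: 4.32 to 86.4 TB/day

low_10 = daily_trace_bytes(10_000, 5, 1_000, sample_ratio=0.1)
high_10 = daily_trace_bytes(10_000, 20, 5_000, sample_ratio=0.1)
print(f"10% sampled: {low_10 / TB:.2f} to {high_10 / TB:.2f} TB/day")
```

Even at 10% you are storing hundreds of gigabytes a day at this request rate, which is why retention and sampling ratio deserve real thought.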

Aim for:

  • 100% errors and slow traces.
  • 10% normal traces.
  • Adjust sampling based on volume.

Trace + log correlation

import logging
from opentelemetry import trace

class TraceLogFilter(logging.Filter):
    def filter(self, record):
        span = trace.get_current_span()
        ctx = span.get_span_context()
        record.trace_id = format(ctx.trace_id, "032x") if ctx.trace_id else "-"
        record.span_id = format(ctx.span_id, "016x") if ctx.span_id else "-"
        return True

With the filter attached and %(trace_id)s in the log format, every log line carries its trace_id; click from a log to its trace and back. Critical for debugging.
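The filter only sets fields on the record; you still have to attach it and reference those fields in the format string. A minimal wiring sketch follows. So that it runs without the OTEL SDK installed, it uses a stand-in filter (`StaticTraceFilter`, hypothetical) that stamps fixed IDs; in a real service you would attach TraceLogFilter() instead.

```python
import io
import logging


class StaticTraceFilter(logging.Filter):
    """Stand-in for TraceLogFilter: stamps fixed IDs so the sketch runs anywhere."""

    def filter(self, record):
        record.trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
        record.span_id = "00f067aa0ba902b7"
        return True


stream = io.StringIO()  # captures output here; use sys.stdout in a service
handler = logging.StreamHandler(stream)
# Filter on the handler, so records from every logger routed here get stamped.
handler.addFilter(StaticTraceFilter())
handler.setFormatter(logging.Formatter(
    "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
))

logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output in one place

logger.info("payment authorized")
print(stream.getvalue(), end="")
# INFO trace_id=4bf92f3577b34da6a3ce929d0e0e4736 span_id=00f067aa0ba902b7 payment authorized
```

Putting the filter on the handler rather than on one logger matters: a format string that references trace_id will fail on any record that skipped the filter.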

What to trace

  • HTTP requests (in/out).
  • DB queries.
  • Cache lookups (with hit/miss attribute).
  • External API calls.
  • Queue produce / consume.
  • Major business operations (checkout, signup, etc.).

Don’t trace:

  • Tight loops (each iteration becoming a span).
  • Trivial in-process calls.
  • Things you’d never query.

Common mistakes

1. Tracing without context propagation

Each service has its own disconnected trace. Forgot to wire up the headers.

2. Too many spans

A trace with 10,000 spans is unreadable. Span the major operations only.

3. No sampling

Tracing everything → trace storage bill bigger than infra bill.

4. PII in attributes

Names, emails, full SQL with values. Redact or hash.

5. No alerting on trace data

Beautiful traces; nobody looks. Alert on p99 latency, error rate per service.
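For mistake 4, one option is to scrub attributes before they are set on a span. The sketch below hashes values under sensitive keys and replaces SQL literals with placeholders. The key names in `SENSITIVE_KEYS`, the salt handling, and the regex are assumptions for illustration; a regex will not cover every SQL dialect, so treat it as a starting point.

```python
import hashlib
import re

# Hypothetical attribute keys considered sensitive in this service.
SENSITIVE_KEYS = {"user.email", "user.name"}

# Quoted string literals (with '' escapes) and standalone integers.
_SQL_LITERAL_RE = re.compile(r"'(?:[^']|'')*'|\b\d+\b")


def hash_value(value: str, salt: str) -> str:
    """Stable, non-reversible token so traces stay joinable without exposing the value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def scrub_attributes(attrs: dict, salt: str) -> dict:
    """Return a copy of attrs that is safe to attach to a span."""
    clean = {}
    for key, value in attrs.items():
        if key in SENSITIVE_KEYS:
            clean[key] = hash_value(str(value), salt)
        elif key == "db.statement":
            clean[key] = _SQL_LITERAL_RE.sub("?", str(value))
        else:
            clean[key] = value
    return clean


raw = {
    "user.email": "a@b.com",
    "db.statement": "SELECT * FROM users WHERE email = 'a@b.com' AND id = 42",
    "http.method": "GET",
}
safe = scrub_attributes(raw, salt="rotate-me")
print(safe["db.statement"])
# SELECT * FROM users WHERE email = ? AND id = ?
```

Hashing instead of dropping keeps the attribute useful: you can still group traces by the same user without storing who they are.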

What I’d ship today

For a new service:

  • OTEL SDK + auto-instrumentation.
  • Custom spans for major business operations.
  • OTLP export to OTEL Collector.
  • Tail sampling in collector (errors + slow + 10% random).
  • Tempo or Honeycomb as backend.
  • Trace IDs in logs.
  • Grafana dashboards linking metrics → traces → logs.

Read this next

If you want my OTEL setup for FastAPI / Go / Node, it’s at rajpoot.dev.

