Is event-driven architecture overhyped?

Often. EDA fits genuine async workflows (notifications, fanout, audit). When forced onto sync request/response paths, it adds latency and complexity for no real benefit. Use where natural; don't force.

Kafka or simpler bus?

Kafka for high throughput, replay, multi-consumer streams. NATS for low-latency pub/sub. Postgres outbox + LISTEN/NOTIFY for small scale. Most apps don't need Kafka; many use it anyway.

Event-Driven Architecture in 2026 — When It's Right and When It's Just Latency

Event-driven architecture (EDA) is the right answer for some problems and a costly distraction for others. In 2026 the patterns are mature; the tax of async-everywhere is well-documented. This post is the honest playbook.

When EDA fits

Decoupled fanout: order placed → email + analytics + inventory + recommendations.
Audit / change capture: every state change is an event; consumers subscribe.
Cross-team boundaries: team A emits events; team B subscribes.
Retries and dead-letters: async with bounded blast radius.
Time-decoupled processing: produce now; consume later.

When it doesn’t

Sync request/response: user expects immediate result; event hop adds latency.
Simple pipelines: where a function call would do.
Small teams: operational tax > benefit.
Low-volume systems: Postgres + cron is simpler.

If your “event” needs an immediate response, it’s not really async; it’s RPC with extra steps.

Outbox pattern

The most important EDA pattern:

BEGIN;
INSERT INTO orders ...;
INSERT INTO outbox (event_type, payload) VALUES ('order.created', '{...}');
COMMIT;

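A worker reads from the outbox and publishes to your event bus. The order and its event commit atomically, so neither exists without the other. Emit is at-least-once: a crash between publish and marking the row published re-sends the event, so consumers must dedup.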

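import asyncio
import logging

log = logging.getLogger(__name__)

async def publish_outbox(poll_interval=1.0):
    # Assumes asyncpg-style db/bus handles and an id column that increases with insert order.
    while True:
        events = await db.fetch("SELECT * FROM outbox WHERE published_at IS NULL ORDER BY id LIMIT 100")
        for e in events:
            try:
                await bus.publish(e["event_type"], e["payload"])
                await db.execute("UPDATE outbox SET published_at = now() WHERE id = $1", e["id"])
            except Exception:
                log.exception("publish failed for outbox id=%s", e["id"])
                break  # retry from this row on the next pass; keeps publish order
        await asyncio.sleep(poll_interval)  # avoid a busy loop on an empty outbox or a failing bus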
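
To see why the single transaction matters, here is a minimal runnable sketch that uses Python's built-in sqlite3 as a stand-in for Postgres. The table layouts and the place_order helper are assumptions for illustration, not a reference schema. If anything fails before commit, neither the order nor its event is stored.

```python
import json
import sqlite3

def place_order(conn, order_id, fail=False):
    """Write the order and its outbox event in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
        if fail:
            raise RuntimeError("simulated failure before the outbox insert")
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("order.created", json.dumps({"order_id": order_id})),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY, event_type TEXT, "
    "payload TEXT, published_at TEXT)"
)

place_order(conn, "ord_1")
try:
    place_order(conn, "ord_2", fail=True)  # rolled back as a unit
except RuntimeError:
    pass

orders = [row[0] for row in conn.execute("SELECT id FROM orders")]
events = [row[0] for row in conn.execute("SELECT event_type FROM outbox")]
```

Only ord_1 and its event survive; the failed ord_2 leaves no orphaned order and no orphaned event.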

See Saga and Distributed Transactions.

Inbox pattern

Mirror on the consumer side: dedup events you’ve already processed.

CREATE TABLE inbox (
    event_id text PRIMARY KEY,
    processed_at timestamptz DEFAULT now()
);

-- On consume:
INSERT INTO inbox (event_id) VALUES ($1) ON CONFLICT DO NOTHING RETURNING event_id;
-- if returned: new event; process. if not: duplicate; skip.

Idempotent consume. Combined with outbox: each event takes effect once even though delivery is at-least-once, provided the inbox insert and the handler's writes commit in the same transaction.
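
The dedup step can be tried end to end with a small sketch. It assumes SQLite in place of Postgres, so INSERT OR IGNORE plus rowcount stands in for ON CONFLICT DO NOTHING RETURNING; consume_once, create_invoice, and the invoices table are hypothetical names.

```python
import sqlite3

def consume_once(conn, event_id, handler):
    """Run handler the first time event_id is seen; return True if it ran."""
    with conn:  # inbox row and handler writes commit or roll back together
        cur = conn.execute(
            "INSERT OR IGNORE INTO inbox (event_id) VALUES (?)", (event_id,)
        )
        if cur.rowcount == 0:
            return False  # duplicate delivery: skip
        handler(conn)
        return True

def create_invoice(conn):
    # Hypothetical side effect of handling the event.
    conn.execute("INSERT INTO invoices (order_id) VALUES ('ord_xyz')")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inbox (event_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE invoices (order_id TEXT)")

first = consume_once(conn, "evt_abc123", create_invoice)
second = consume_once(conn, "evt_abc123", create_invoice)  # redelivery
invoice_count = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
```

The second delivery is a no-op, so one invoice exists no matter how many times the broker redelivers.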

Bus options

Bus	Strengths
Kafka	Throughput, replay, partitioning
NATS / NATS JetStream	Low latency, simple ops
RabbitMQ	Flexible routing, classic queue
Redis Streams	Already have Redis
Postgres LISTEN/NOTIFY	No extra infra; small scale
AWS SNS/SQS	Managed; serverless-friendly
Google Pub/Sub	Managed

For most teams: NATS or Kafka. Pick based on operational comfort.

See Kafka vs NATS vs RabbitMQ.

Event design

{
  "event_id": "evt_abc123",
  "event_type": "order.created",
  "version": 1,
  "occurred_at": "2026-05-05T07:30:00Z",
  "actor": {"type": "user", "id": "user_42"},
  "data": {
    "order_id": "ord_xyz",
    "customer_id": "cust_42",
    "items": [...]
  }
}

Stable schema (versioned).
event_id for dedup.
occurred_at for ordering hints (producer clocks drift; don't rely on it for strict order).
Past tense for the type (order.created, not create.order).
Domain language, not implementation.
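
A small builder keeps these fields consistent across producers. This is a sketch under assumptions: make_event is a hypothetical helper, and the evt_ prefix with a UUID is just one way to get a unique id.

```python
import json
import uuid
from datetime import datetime, timezone

def make_event(event_type, data, actor, version=1):
    """Build an event envelope with the fields listed above."""
    now = datetime.now(timezone.utc)
    return {
        "event_id": f"evt_{uuid.uuid4().hex}",  # unique per event, used for dedup
        "event_type": event_type,               # past tense, domain language
        "version": version,
        "occurred_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "actor": actor,
        "data": data,
    }

event = make_event(
    "order.created",
    {"order_id": "ord_xyz", "customer_id": "cust_42"},
    {"type": "user", "id": "user_42"},
)
payload = json.dumps(event)  # what goes into the outbox payload column
```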

Schema evolution

Just like API versioning:

Additive changes: safe.
Removing fields: breaks consumers; never.
Renaming: same as removing + adding.
Versioning: bump version field; consumers handle both.

Schema registry (Confluent, Apicurio) helps enforce.
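
A tolerant reader might look like the sketch below. The version 2 shape (a nested customer object replacing customer_id) and the optional coupon_code field are invented for illustration; the point is that the consumer handles both versions, ignores fields it does not know, and defaults fields that are missing.

```python
def read_order_created(event):
    """Normalize order.created v1 and v2 into one internal shape."""
    data = event["data"]
    version = event.get("version", 1)
    if version == 1:
        customer_id = data["customer_id"]
    elif version == 2:
        customer_id = data["customer"]["id"]  # hypothetical v2 layout
    else:
        raise ValueError(f"unsupported order.created version: {version}")
    return {
        "order_id": data["order_id"],
        "customer_id": customer_id,
        # Additive field: older producers omit it, so default instead of failing.
        "coupon_code": data.get("coupon_code"),
    }

v1 = {"version": 1, "data": {"order_id": "ord_1", "customer_id": "cust_42"}}
v2 = {
    "version": 2,
    "data": {
        "order_id": "ord_2",
        "customer": {"id": "cust_42", "tier": "gold"},  # unknown keys are ignored
        "coupon_code": "SPRING",
    },
}
order_v1 = read_order_created(v1)
order_v2 = read_order_created(v2)
```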

Replay / reprocessing

Kafka's killer feature: rewind a consumer to time T; re-process. New consumer? Start from the beginning (as far back as retention allows).

For new analytics dashboards / fixes / replays after outage: invaluable.

NATS JetStream supports replay too.

For Postgres outbox: keep events; consumers can re-read.
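
What "rewind to time T" means can be shown with a toy in-memory log. This is a model of the idea only, not Kafka's or JetStream's API; EventLog and its methods are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
import bisect

class EventLog:
    """Append-only log that can be re-read from any offset."""

    def __init__(self):
        self._timestamps = []
        self._events = []

    def append(self, timestamp, event):
        # Assumes timestamps never go backwards, as in a single partition.
        self._timestamps.append(timestamp)
        self._events.append(event)

    def offset_for_time(self, timestamp):
        """First offset whose timestamp is >= the given time."""
        return bisect.bisect_left(self._timestamps, timestamp)

    def read_from(self, offset):
        return self._events[offset:]

log = EventLog()
log.append(10, "order.created")
log.append(20, "order.paid")
log.append(30, "order.shipped")

replayed = log.read_from(log.offset_for_time(20))  # rewind to T=20
from_start = log.read_from(0)                      # new consumer
```

Reading never deletes anything, which is why a new consumer or a fixed one can process the same history again.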

Dead-letter handling

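MAX_RETRIES = 5

class RetryableError(Exception):
    """Transient failure (timeout, lock contention) worth another delivery."""

async def consume(event):
    try:
        await process(event)
    except RetryableError:
        # event.attempts: delivery count, assumed to come from the broker or a header
        if event.attempts > MAX_RETRIES:
            await dead_letter(event, "too many retries")
            return
        raise  # let message broker re-deliver
    except Exception as e:
        # Non-retryable (bad payload, bug): dead-letter immediately
        await dead_letter(event, str(e))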

Bounded retries; dead-letter the rest. Operator can inspect / replay later.
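
The retry bound can be exercised without a broker. In this sketch a toy deliver loop plays the broker's role; deliver, flaky, and poison are hypothetical helpers, and MAX_ATTEMPTS counts total deliveries.

```python
import asyncio

MAX_ATTEMPTS = 5

class RetryableError(Exception):
    """Transient failure worth another delivery."""

async def deliver(event, process, dead_letters):
    """Toy broker loop; returns the number of attempts made."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            await process(event)
            return attempt
        except RetryableError:
            continue  # simulate re-delivery
        except Exception as exc:
            dead_letters.append((event["event_id"], str(exc)))
            return attempt
    dead_letters.append((event["event_id"], "too many retries"))
    return MAX_ATTEMPTS

def flaky(fail_times):
    """Handler that raises RetryableError for the first fail_times calls."""
    calls = {"n": 0}

    async def process(event):
        calls["n"] += 1
        if calls["n"] <= fail_times:
            raise RetryableError("timeout")

    return process

async def poison(event):
    raise ValueError("bad payload")

async def main():
    dlq = []
    recovered = await deliver({"event_id": "evt_1"}, flaky(2), dlq)
    exhausted = await deliver({"event_id": "evt_2"}, flaky(99), dlq)
    poisoned = await deliver({"event_id": "evt_3"}, poison, dlq)
    return recovered, exhausted, poisoned, dlq

recovered, exhausted, poisoned, dlq = asyncio.run(main())
```

The transient failure recovers on its third attempt, the persistent one stops after five, and the poison message is parked on the first try.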

Ordering

Strong global ordering is expensive. Most events don’t need it.

For per-entity ordering: Kafka partition key by entity id. Events for same id go to same partition; consumed in order.

producer.produce(topic, key=str(order_id), value=json.dumps(event))

For cross-entity: usually fine to be unordered.
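
Why a partition key gives per-entity ordering can be shown with a stand-in hash. This sketch uses zlib.crc32 modulo the partition count, which is not necessarily what your Kafka client's partitioner does; partition_for and the sample events are hypothetical.

```python
import zlib
from collections import defaultdict

def partition_for(key, num_partitions):
    """Deterministic key -> partition mapping (illustrative hash only)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

events = [
    ("ord_1", "created"),
    ("ord_2", "created"),
    ("ord_1", "paid"),
    ("ord_2", "cancelled"),
    ("ord_1", "shipped"),
]

partitions = defaultdict(list)
for order_id, what in events:
    # Same key always lands in the same partition, appended in produce order.
    partitions[partition_for(order_id, 6)].append((order_id, what))

ord_1_partitions = {p for p, items in partitions.items()
                    if any(oid == "ord_1" for oid, _ in items)}
ord_1_history = [what for items in partitions.values()
                 for oid, what in items if oid == "ord_1"]
```

Events for ord_1 sit in exactly one partition in the order they were produced; nothing is guaranteed about how they interleave with ord_2.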

Idempotency at consumer

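async def handle_order_created(event):
    # Assumes db is one asyncpg-style connection and create_invoice writes through it,
    # so a failed invoice rolls back the inbox row and the event can be retried.
    async with db.transaction():
        claimed = await db.fetchval(
            "INSERT INTO inbox (event_id) VALUES ($1) ON CONFLICT DO NOTHING RETURNING event_id",
            event["event_id"],
        )
        if claimed is None:
            return  # duplicate delivery; already processed
        await create_invoice(event["data"])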
Dedup is non-negotiable. At-least-once delivery is the norm.

See Idempotency.

Tracing

Propagate trace context through events:

{
  "event_id": "...",
  "tracecontext": {
    "traceparent": "00-...-...-01"
  },
  "data": {...}
}

Consumer extracts; continues the span. Cross-service traces work even through async hops.
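
The extract-and-continue step can be sketched by hand. The field layout (version, 32-hex trace id, 16-hex parent id, 2-hex flags) follows the W3C Trace Context traceparent format; new_traceparent and continue_trace are hypothetical helpers, and in practice a tracing SDK such as OpenTelemetry would do this for you.

```python
import re
import secrets

TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def new_traceparent():
    """Start a new sampled trace with a random trace id and span id."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"

def continue_trace(traceparent):
    """Keep the trace id and flags; mint a new span id for the consumer."""
    match = TRACEPARENT_RE.match(traceparent or "")
    if match is None:
        return new_traceparent()  # missing or malformed: start a fresh trace
    trace_id, _parent_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"

# Producer side: attach the context to the event.
event = {"event_id": "evt_abc123", "tracecontext": {"traceparent": new_traceparent()}}

# Consumer side: extract it and continue the same trace.
incoming = event["tracecontext"]["traceparent"]
consumer_span = continue_trace(incoming)
```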

See Distributed Tracing.

Common mistakes

1. EDA for everything

Internal microservice → microservice via Kafka instead of HTTP. Adds latency, complexity. Use HTTP for sync paths.

2. No outbox

Direct publish from app code; transaction commits but event publish fails (or vice versa). Inconsistent state.

3. No idempotency at consumer

Duplicate event → duplicate side effect. Always dedup.

4. Tight schema coupling

Consumer parses event with current shape; producer changes; consumer breaks. Versioning + tolerant readers.

5. No DLQ monitoring

Dead-lettered events accumulate; nobody notices; bugs ship. Alert on DLQ depth.

Anti-patterns

Sync wrapped in async: emit event; wait for response event. That’s RPC with extra steps. Just do RPC.
Event sourcing the whole system: see Event Sourcing. Scope it to the bounded contexts that need it.
Every change is an event: most CRUD changes don’t need events. Emit domain events.

What I’d ship today

For genuinely async workflows:

Postgres outbox as the producer guarantee.
NATS JetStream or Kafka as bus.
Inbox pattern at consumers for idempotency.
Versioned events with tolerant readers.
DLQ + alerting.
Tracing across hops.
Sync paths stay sync; don’t EDA-ize them.

Read this next

If you want my outbox + inbox + bus reference (Postgres + NATS), it’s at rajpoot.dev.

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.
