Webhooks are the boring infrastructure that powers every integration. Get them wrong and every customer integration is a support ticket. Get them right and they disappear into the background. This post is the working guide.
What “right” looks like
A webhook system that doesn’t ruin anyone’s day:
- Signed so consumers verify origin.
- Retried with backoff over days.
- Idempotent so retries don’t double-process.
- Ordered within a stream (or explicitly unordered).
- Observable (consumers can see delivery history; producers can debug failures).
- Replayable from the API for backfills.
Stripe is the gold standard. Mimic them.
Producer side
Sign every payload
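import hashlib
import hmac
import time

def sign(secret: str, payload: bytes) -> str:
    # Bind the signature to a timestamp so consumers can reject stale replays.
    ts = int(time.time())
    msg = f"{ts}.".encode() + payload
    sig = hmac.new(secret.encode(), msg, hashlib.sha256).hexdigest()
    return f"t={ts},v1={sig}"

# The signature travels as a header on the outgoing webhook request.
request_headers["X-Webhook-Signature"] = sign(customer_secret, body)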
The consumer verifies the signature and checks that the timestamp is within ~5 minutes of now (replay protection).
Retry with backoff
If consumer returns non-2xx or times out:
- Retry at 1m, 5m, 30m, 1h, 6h, 24h, 48h, 72h.
- Stop at 72h or N attempts.
- Mark as failed; surface in customer dashboard.
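The schedule above can be sketched as a lookup from retry count to the next attempt time. This is a minimal illustration, not a prescribed implementation: the names RETRY_OFFSETS and next_retry_at are hypothetical, and the listed times are read as offsets from the first failed attempt, which is one interpretation of the list.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Offsets from the first failed attempt (an interpretation of the schedule).
RETRY_OFFSETS = [
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=1),
    timedelta(hours=6),
    timedelta(hours=24),
    timedelta(hours=48),
    timedelta(hours=72),
]

def next_retry_at(first_failure: datetime, retries_done: int) -> Optional[datetime]:
    """Return when the next retry should run, or None when it is time to give up."""
    if retries_done >= len(RETRY_OFFSETS):
        return None  # schedule exhausted: mark the delivery as failed
    return first_failure + RETRY_OFFSETS[retries_done]
```

A None result is the signal to mark the delivery as failed and surface it in the dashboard.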
import json

import dramatiq
import httpx

# should_retry and db are defined elsewhere in the application.
@dramatiq.actor(max_retries=8, retry_when=should_retry)
def deliver_webhook(endpoint_id, event_id):
    endpoint = db.get_endpoint(endpoint_id)
    event = db.get_event(event_id)
    body = json.dumps(event).encode()
    signature = sign(endpoint.secret, body)
    resp = httpx.post(endpoint.url,
                      headers={"X-Webhook-Signature": signature},
                      content=body, timeout=10)
    # An error status raises here, which hands the message back for a retry.
    resp.raise_for_status()
Idempotency-friendly events
Each event has a stable id. Consumers dedup on id because retries WILL deliver the same event multiple times. See Idempotency, Retries, and Exactly-Once Illusions.
Replay endpoint
GET /api/events?since=event_id_123&limit=100
If a consumer was offline for 4 days, they re-fetch from the API. Webhooks are best-effort delivery; the API is the source of truth.
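On the consumer side, a backfill against that endpoint might look like the loop below. It is a sketch under stated assumptions: fetch_page is a hypothetical callable wrapping GET /api/events?since=...&limit=..., the API is assumed to return events in order as a list of objects with an id, and since is assumed to be exclusive.

```python
from typing import Callable, Dict, Iterator, List, Optional

Event = Dict[str, object]
# Hypothetical: fetch_page(since, limit) wraps GET /api/events?since=...&limit=...
FetchPage = Callable[[Optional[str], int], List[Event]]

def backfill(fetch_page: FetchPage, last_seen_id: Optional[str],
             limit: int = 100) -> Iterator[Event]:
    """Yield every event after last_seen_id, one page at a time."""
    cursor = last_seen_id
    while True:
        page = fetch_page(cursor, limit)
        if not page:
            return  # caught up
        for event in page:
            yield event
        # Assumes `since` is exclusive and pages arrive in order.
        cursor = str(page[-1]["id"])
        if len(page) < limit:
            return  # a short page means there is nothing left
```

Feed the yielded events through the same idempotent processing path as live webhooks, so an event that arrives both ways is handled once.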
Per-endpoint config
Each customer endpoint configures:
- URL.
- Secret.
- Event types they want.
- Active/disabled.
Give them a UI to test the endpoint, view delivery history, and rotate secrets.
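One way to model that per-endpoint config is a small record with a filter method. The class name EndpointConfig, the wants method, and the rule that an empty event-type set means "all events" are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Set

@dataclass
class EndpointConfig:
    url: str
    secret: str
    # Assumption for this sketch: an empty set subscribes to every event type.
    event_types: Set[str] = field(default_factory=set)
    active: bool = True

    def wants(self, event_type: str) -> bool:
        """True if this endpoint should receive an event of the given type."""
        if not self.active:
            return False
        return not self.event_types or event_type in self.event_types
```

The producer would call wants() before enqueueing a delivery, so disabled endpoints and unsubscribed event types never enter the retry pipeline.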
Consumer side
Verify signature
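import hashlib
import hmac
import json
import time

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()  # assuming FastAPI; SECRET and process_event are defined elsewhere

def verify(secret: str, signature_header: str, body: bytes) -> bool:
    try:
        parts = dict(p.split("=", 1) for p in signature_header.split(","))
        ts, sig = int(parts["t"]), parts["v1"]
    except (KeyError, ValueError):
        return False  # missing or malformed header
    if abs(time.time() - ts) > 300:
        return False  # outside the 5-minute window; possible replay
    expected = hmac.new(secret.encode(),
                        f"{ts}.".encode() + body,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

@app.post("/webhook")
async def handle(request: Request):
    body = await request.body()  # verify the raw bytes, before any JSON parsing
    signature = request.headers.get("X-Webhook-Signature", "")
    if not verify(SECRET, signature, body):
        return JSONResponse({"error": "invalid"}, status_code=401)
    event = json.loads(body)
    await process_event(event)
    return {"ok": True}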
Use hmac.compare_digest, not ==. It compares in constant time, so response timing does not leak how much of a forged signature matched.
Idempotency
async def process_event(event):
    seen = await db.fetchval("SELECT 1 FROM processed_events WHERE id = $1", event["id"])
    if seen:
        return  # already done
    async with db.transaction():
        await actually_process(event)
        await db.execute(
            "INSERT INTO processed_events (id, ts) VALUES ($1, now())",
            event["id"],
        )
The processed-events table is your dedup. Never skip this for production webhooks.
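To see the dedup table in action without a Postgres instance, here is a self-contained variant using the standard-library sqlite3 module. It swaps in a slightly different technique from the snippet above: it claims the event id first with INSERT OR IGNORE and only then runs the handler, which also covers two copies of the same event arriving at the same moment. The function name process_once and the single-column table are assumptions for the sketch.

```python
import sqlite3

def process_once(conn: sqlite3.Connection, event: dict, handler) -> bool:
    """Run handler(event) at most once per event id. Returns True if it ran."""
    with conn:  # commits on success, rolls back if the handler raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (id) VALUES (?)",
            (event["id"],),
        )
        if cur.rowcount == 0:
            return False  # id already recorded: duplicate delivery
        handler(event)
        return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (id TEXT PRIMARY KEY)")

handled = []
process_once(conn, {"id": "evt_1"}, handled.append)  # runs the handler
process_once(conn, {"id": "evt_1"}, handled.append)  # skipped as a duplicate
```

If the handler raises, the claim is rolled back along with it, so the producer's next retry gets a clean attempt. Side effects outside the database (emails, calls to other APIs) are not rolled back and need their own idempotency.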
Return 2xx fast
A webhook handler should return 2xx in under 5 seconds. If processing takes longer, queue the work and ack:
@app.post("/webhook")
async def handle(request: Request):
    body = await request.body()
    # verify, parse
    await queue.enqueue(process, event)
    return {"ok": True}  # ack immediately
Long processing in the handler = timeouts = retries = mess.
For the queue side, see Background Jobs in Python.
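To make the ack-then-process split concrete, here is a toy version using an in-process asyncio.Queue. It is only a sketch of the shape: an in-memory queue loses work when the process dies, so a production setup would use a durable job queue instead. The worker and demo functions are hypothetical names.

```python
import asyncio

async def worker(queue: asyncio.Queue, handle) -> None:
    """Drain events in the background so the HTTP handler can ack right away."""
    while True:
        event = await queue.get()
        try:
            await handle(event)
        except Exception as exc:  # keep the worker alive; real code would log and retry
            print(f"processing failed for {event.get('id')}: {exc}")
        finally:
            queue.task_done()

async def demo() -> list:
    processed = []

    async def handle(event):
        processed.append(event["id"])

    queue: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(worker(queue, handle))
    # What the webhook handler would do just before returning 2xx:
    queue.put_nowait({"id": "evt_1"})
    queue.put_nowait({"id": "evt_2"})
    await queue.join()  # wait until both events are handled
    task.cancel()
    return processed
```

The handler's only jobs are to verify, enqueue, and return; everything slow happens in the worker.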
Be tolerant of duplicates and order
Webhooks will arrive out of order. Plan for it:
- Each event has a version or created_at. Don’t apply older versions on top of newer.
- Eventual consistency is OK; race conditions are not.
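A version guard can be as small as the function below. It is a minimal sketch that keeps state in a dict; the field names object_id and data are hypothetical, and a real consumer would run the same comparison against its own database row.

```python
from typing import Dict

# In-memory stand-in for the consumer's own store: object id -> last applied state.
state: Dict[str, dict] = {}

def apply_event(event: dict) -> bool:
    """Apply the event unless an equal or newer version was already applied."""
    current = state.get(event["object_id"])
    if current is not None and current["version"] >= event["version"]:
        return False  # stale or duplicate: ignore
    state[event["object_id"]] = {"version": event["version"], "data": event["data"]}
    return True
```

If version 2 arrives before version 1, the late version 1 is dropped and the stored state stays at version 2.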
Producer dashboards
Customers need to debug. Show:
- Event history with timestamps.
- Per-event delivery attempts + status codes.
- Replay button.
- Endpoint test (send a test event).
- Rotating secrets without breaking active integrations (overlap period).
Without these, every integration question becomes a support ticket.
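For the secret-rotation overlap, one possible approach on the consumer side is to accept a signature made with any currently valid secret, old or new, until the overlap window closes. The sketch below reuses the t=...,v1=... header format from earlier; verify_any and its now parameter (there to make the function testable) are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Iterable, Optional

def verify_any(secrets: Iterable[str], signature_header: str, body: bytes,
               tolerance: int = 300, now: Optional[float] = None) -> bool:
    """Accept the payload if any currently valid secret produces the signature."""
    try:
        parts = dict(p.split("=", 1) for p in signature_header.split(","))
        ts, sig = int(parts["t"]), parts["v1"]
    except (KeyError, ValueError):
        return False  # malformed header
    current = time.time() if now is None else now
    if abs(current - ts) > tolerance:
        return False  # outside the replay window
    msg = f"{ts}.".encode() + body
    ok = False
    for secret in secrets:
        expected = hmac.new(secret.encode(), msg, hashlib.sha256).hexdigest()
        # Check every secret instead of returning early; each comparison is constant-time.
        ok |= hmac.compare_digest(sig.encode(), expected.encode())
    return ok
```

During rotation the consumer passes both secrets; once the producer has switched to the new one, the old secret is removed from the list.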
Tools that handle this for you
- Svix — webhooks-as-a-service. Producer side. Signing, retries, dashboards. Pay per webhook.
- Hookdeck — sit between producers and consumers; smooth retries.
- Inngest — workflow + webhook-shaped triggers in one platform.
For a small SaaS, building it is fine. For scale, Svix saves months.
Read this next
- Idempotency, Retries, and Exactly-Once Illusions
- Design a Distributed Task Queue
- Designing REST APIs That Don’t Suck
- Authentication in 2026
If you want a Postgres-backed webhook delivery service template, it’s at rajpoot.dev.