SRE | Manvendra Rajpoot

SLOs and Error Budgets in 2026 — The Discipline That Replaces 'Nines'

Practical SLO design: pick SLIs that matter, set realistic targets, define error budgets, alert on burn rate, and make the budget drive engineering tradeoffs.
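As a rough illustration of the arithmetic behind "define error budgets" and "alert on burn rate", here is a minimal Python sketch. The function names and the 14.4 paging threshold are assumptions made for this example (14.4 is a commonly cited fast-burn value for a 30-day window, not something the post itself specifies).

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Number of bad events the SLO tolerates over the window."""
    return (1.0 - slo_target) * total_events


def burn_rate(bad_events: int, total_events: int, slo_target: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 means the budget runs out exactly at the end of the
    SLO window; higher values mean it runs out sooner.
    """
    if total_events == 0:
        return 0.0  # no traffic, nothing burned
    observed = bad_events / total_events
    allowed = 1.0 - slo_target
    return observed / allowed


def should_page(long_window_rate: float, short_window_rate: float,
                threshold: float = 14.4) -> bool:
    """Page only if both a long and a short window exceed the threshold.

    Requiring the short window too avoids paging for a burst that has
    already ended. The threshold is an assumed default for illustration.
    """
    return long_window_rate > threshold and short_window_rate > threshold


if __name__ == "__main__":
    # 99.9% target over one million requests
    print(error_budget(0.999, 1_000_000))
    # 50 failures in 10,000 requests against the same target
    print(burn_rate(50, 10_000, 0.999))
```

The two-window check is one common way to keep burn-rate alerts from firing on short blips; the window lengths themselves are left to the caller here.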

Incident Response in 2026 — Runbooks, Postmortems, and the Things That Actually Help

Production incident response: clear roles (IC, comms, ops), runbooks that are actually useful, blameless postmortems, status pages, and how to learn from outages.

On-Call and Runbooks That Save Your Friday Night in 2026

How to run on-call without burning out engineers. Rotation schedules, severity definitions, runbook templates, escalation, follow-the-sun, and the patterns from teams that ship reliable systems.

Chaos Engineering in 2026 — Game Days That Actually Find Bugs

Chaos engineering done right. Game days, failure injection (Chaos Mesh, Gremlin), what to test, the observability needed, and the cultural shifts that make it stick.

Observability 2.0 — SLOs, Wide Events, and the End of Three Pillars

What changed in observability since 2020: wide events vs. the three pillars, SLOs as the unit of conversation, OTel’s role, and how to actually find problems in production.

Incident Response and Blameless Postmortems in 2026

Practical incident response in 2026. Severity levels, IC role, comms cadence, runbooks, blameless postmortems, action item tracking, and the cultural shifts that produce real learning.

Circuit Breakers, Bulkheads, and Backpressure — Resilience Patterns for 2026

The resilience patterns every backend engineer should reach for: circuit breakers, bulkheads, backpressure, deadlines, jittered retries, and the production tradeoffs.
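To make "jittered retries" and "deadlines" concrete, the following is a minimal Python sketch of exponential backoff with full jitter under an overall time budget. The helper name `retry_with_jitter` and all of its parameters are hypothetical, chosen for this example; a production version would usually retry only on errors known to be transient.

```python
import random
import time


def retry_with_jitter(op, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      deadline=None, sleep=time.sleep, rng=random.random,
                      clock=time.monotonic):
    """Call op() until it succeeds, backing off with full jitter.

    deadline is an optional overall budget in seconds. If the next sleep
    would push past it, the last error is re-raised instead of waiting.
    sleep, rng and clock are injectable so the helper can be tested
    without real waiting.
    """
    start = clock()
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts
            # Exponential cap, then a uniform random delay in [0, cap)
            cap = min(max_delay, base_delay * 2 ** attempt)
            delay = rng() * cap
            if deadline is not None and (clock() - start) + delay > deadline:
                raise  # waiting would exceed the caller's deadline
            sleep(delay)
```

Randomizing the whole delay, rather than adding a small random offset, spreads out clients that failed at the same moment so they do not retry in lockstep.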

SLOs and Error Budgets for App Developers — SRE Without the Mystique

A short, practical guide to SLOs and error budgets for application developers. Choose the right SLI, pick targets you can actually defend, calculate the budget, and use it to drive feature-velocity vs. reliability tradeoffs.