System design interviews test how you reason about distributed systems under uncertainty. The framework is well known; what separates seniority levels is how you wield it. This post is the working playbook.
The framework
1. Clarify (5 min) — what are we building?
2. Estimate (5 min) — capacity numbers
3. API + data model — interfaces and storage
4. High-level architecture — boxes and arrows
5. Deep dive — bottlenecks; mitigation; tradeoffs
6. Scale up — what breaks at 10× / 100×
7. Wrap — recap, where you'd invest next
You have 45-60 minutes. Cover the surface in the first 25, then deep-dive into whatever the interviewer cares about.
Clarify
Q: "Design Twitter."
You: "Posting + feed + follow? Public posts only? 1M users or 100M?"
Pin down:
- Functional: what features?
- Non-functional: scale, latency, availability, consistency.
- Out of scope: state explicitly what you are deferring.
Under-clarification leads to designing for the wrong problem.
Capacity estimates
Assume 500M MAU, of which 50% are active daily: 250M DAU.
Each DAU posts 0.5 times/day and makes 50 reads/day.
Posts/sec: 250M * 0.5 / 86400 ≈ 1500/sec.
Reads/sec: 250M * 50 / 86400 ≈ 145k/sec.
Per post: 200 bytes avg.
Storage: 1500 * 200 * 86400 * 365 ≈ 10 TB/year.
Don’t memorize. Reason.
Round generously. Show you understand orders of magnitude. Exact numbers don’t matter; “thousands” vs “millions” vs “billions” matters.
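The arithmetic above can be reproduced in a few lines. This is only a sketch of the same back-of-envelope math; the helper name `per_second` is mine, and the storage figure covers raw post text only (no media, indexes, or replication).

```python
SECONDS_PER_DAY = 86_400


def per_second(daily_users: float, actions_per_user_per_day: float) -> float:
    """Average events per second given a per-user daily rate."""
    return daily_users * actions_per_user_per_day / SECONDS_PER_DAY


mau = 500_000_000
dau = mau * 0.5                       # 50% of MAU are active daily
posts_per_sec = per_second(dau, 0.5)  # each DAU posts 0.5 times/day
reads_per_sec = per_second(dau, 50)   # each DAU makes 50 reads/day
bytes_per_post = 200

# Raw text storage per year, in terabytes (1 TB = 1e12 bytes)
storage_tb_per_year = (
    posts_per_sec * bytes_per_post * SECONDS_PER_DAY * 365 / 1e12
)

print(f"posts/sec  ~ {posts_per_sec:,.0f}")
print(f"reads/sec  ~ {reads_per_sec:,.0f}")
print(f"read:write ~ {reads_per_sec / posts_per_sec:.0f}:1")
print(f"storage    ~ {storage_tb_per_year:.1f} TB/year")
```

Without rounding, storage comes out near 9 TB/year instead of 10. Same order of magnitude, which is the point. The 100:1 read-to-write ratio is the number that should steer the rest of the design.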
API design
Sketch the key endpoints:
POST /tweets { body }
GET /users/{id}/feed
POST /follow { user_id }
GET /tweets/{id}
Include:
- Auth model: JWT? Session?
- Pagination: cursor-based for feeds.
- Rate limits: who is limited (per user, per IP, per API key) and where the limit is enforced.
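To make "cursor-based" concrete, here is a minimal sketch of how a feed endpoint might page. Everything in it is an assumption for illustration: the in-memory `FEED`, the `(created_at, tweet_id)` cursor, and the function names. A real service would push the "older than cursor" filter into the database query.

```python
import base64
import json
from typing import Optional, Tuple

# Hypothetical feed, newest first: (created_at, tweet_id)
FEED = [(1000 - i, f"t{i}") for i in range(10)]


def encode_cursor(created_at: int, tweet_id: str) -> str:
    """Pack the last-seen position into an opaque, URL-safe token."""
    raw = json.dumps([created_at, tweet_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> Tuple[int, str]:
    created_at, tweet_id = json.loads(base64.urlsafe_b64decode(cursor))
    return created_at, tweet_id


def get_feed_page(cursor: Optional[str] = None, limit: int = 3) -> dict:
    if cursor is None:
        items = FEED
    else:
        last_seen = decode_cursor(cursor)
        # Keep only rows strictly older than the last row the client saw
        items = [row for row in FEED if row < last_seen]
    page = items[:limit]
    has_more = len(items) > limit
    next_cursor = encode_cursor(*page[-1]) if has_more else None
    return {"items": [tid for _, tid in page], "next_cursor": next_cursor}


first = get_feed_page()
second = get_feed_page(first["next_cursor"])
print(first["items"], second["items"])
```

The reason to prefer this over offset pagination on a feed: new posts arriving at the top do not shift the pages a client has already fetched, and the lookup stays an index seek rather than a scan over skipped rows.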
Data model
users (id, handle, name, ...)
tweets (id, author_id, body, created_at) -- index on (author_id, created_at)
follows (follower_id, followed_id)
inboxes (user_id, tweet_id, created_at) -- for fanout
Mention indexes. Mention partition keys. Mention which DB technology fits.
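If it helps to see the schema run, here is a sketch using SQLite from Python's standard library as a stand-in for whatever database you would actually name. The column types, primary keys, and the `feed_on_read` helper are my assumptions; the tables and the `(author_id, created_at)` index come from the outline above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, handle TEXT UNIQUE NOT NULL, name TEXT);
CREATE TABLE tweets  (id INTEGER PRIMARY KEY,
                      author_id INTEGER NOT NULL REFERENCES users(id),
                      body TEXT NOT NULL,
                      created_at INTEGER NOT NULL);
CREATE INDEX tweets_by_author ON tweets (author_id, created_at);
CREATE TABLE follows (follower_id INTEGER NOT NULL,
                      followed_id INTEGER NOT NULL,
                      PRIMARY KEY (follower_id, followed_id));
-- Precomputed per-user timeline, filled by fanout
CREATE TABLE inboxes (user_id INTEGER NOT NULL,
                      tweet_id INTEGER NOT NULL,
                      created_at INTEGER NOT NULL,
                      PRIMARY KEY (user_id, created_at, tweet_id));
""")

conn.executemany("INSERT INTO users (id, handle) VALUES (?, ?)",
                 [(1, "ana"), (2, "bo"), (3, "cy")])
conn.execute("INSERT INTO follows VALUES (1, 2)")  # ana follows bo
conn.executemany(
    "INSERT INTO tweets (id, author_id, body, created_at) VALUES (?, ?, ?, ?)",
    [(10, 2, "hello", 100), (11, 3, "unseen", 101), (12, 2, "again", 102)],
)


def feed_on_read(user_id: int, limit: int = 20) -> list:
    """Pull-model feed: join follows to tweets at read time, newest first."""
    rows = conn.execute(
        """SELECT t.id
           FROM follows f JOIN tweets t ON t.author_id = f.followed_id
           WHERE f.follower_id = ?
           ORDER BY t.created_at DESC
           LIMIT ?""",
        (user_id, limit),
    ).fetchall()
    return [r[0] for r in rows]


print(feed_on_read(1))
```

This pull-model query is the one that gets expensive as follow counts grow, which is what the `inboxes` table exists to avoid. Saying that out loud is a natural bridge into the fanout discussion.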
Architecture
[Client] → [LB] → [API tier] ──→ [Postgres (primary)]
                             ──→ [Cache (Redis)]
                             ──→ [Async fanout to inboxes]
                             ──→ [Search (Elastic)]
                             ──→ [Object storage (S3) for media]
Boxes and arrows. Make data-flow obvious. Identify hot paths.
Deep dive
Pick one or two areas the interviewer is interested in. Common deep-dive prompts:
- Hot keys (celebrity fanout).
- Read amplification.
- Cache invalidation.
- Database sharding strategy.
- Failure modes.
This is where seniors shine: tradeoffs and reasoning.
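Taking the celebrity-fanout prompt as an example, one commonly discussed mitigation is a hybrid: fan out on write for ordinary accounts, and merge celebrity posts in at read time. The sketch below is an in-memory toy under that assumption. `CELEBRITY_THRESHOLD` and all the function names are hypothetical, and a real threshold would come from measurement.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff on follower count

followers = defaultdict(set)          # author -> set of followers
following = defaultdict(set)          # user -> set of authors followed
inboxes = defaultdict(list)           # user -> [(created_at, tweet_id)]
tweets_by_author = defaultdict(list)  # author -> [(created_at, tweet_id)]


def follow(follower: str, author: str) -> None:
    followers[author].add(follower)
    following[follower].add(author)


def is_celebrity(author: str) -> bool:
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def post(author: str, tweet_id: str, created_at: int) -> None:
    tweets_by_author[author].append((created_at, tweet_id))
    if is_celebrity(author):
        return  # skip write fanout; readers merge these in later
    for follower in followers[author]:
        inboxes[follower].append((created_at, tweet_id))


def read_feed(user: str, limit: int = 10) -> list:
    # A set dedupes posts that were fanned out before an author
    # crossed the celebrity threshold.
    merged = set(inboxes[user])
    for author in following[user]:
        if is_celebrity(author):
            merged.update(tweets_by_author[author])
    return [tid for _, tid in sorted(merged, reverse=True)[:limit]]


for fan in ("a", "b", "c"):
    follow(fan, "celeb")
follow("a", "friend")
post("celeb", "c1", 1)
post("friend", "f1", 2)
print(read_feed("a"))
```

The tradeoff to narrate: writes from a celebrity become cheap, but every read for their followers now does extra work. Whether that is a win depends on the read-to-write ratio and how many celebrities a typical user follows.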
Scaling story
Walk through scale increases:
1k users: single Postgres + app.
100k users: + cache + read replicas.
1M users: + sharded cache + CDN.
100M users: + async fanout + partitioned tables + BFF (backend-for-frontend).
Show you understand how systems evolve. What breaks first; what mitigates it.
AI-era considerations
Modern interviews include AI components:
- LLM-powered search / recommendations.
- Agent workflows.
- RAG over user content.
Apply standard patterns:
- Async generation; cache results.
- Routing (cheap model first).
- Eval pipelines.
- Rate limits per LLM call cost.
See LLM Cost Optimization and LLM Routing.
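A rough sketch of two of those patterns together, caching results and routing to a cheap model first. The two "models" here are plain stub functions I made up, and the confidence score is a placeholder for whatever escalation signal you would really use; no real LLM API is being called.

```python
from functools import lru_cache
from typing import Tuple

calls = {"cheap": 0, "expensive": 0}  # counters to make cost visible


def cheap_model(prompt: str) -> Tuple[str, float]:
    """Hypothetical stub: returns (answer, confidence)."""
    confidence = 0.9 if len(prompt.split()) <= 5 else 0.4
    return f"cheap:{prompt}", confidence


def expensive_model(prompt: str) -> str:
    """Hypothetical stub for a larger, costlier model."""
    return f"expensive:{prompt}"


@lru_cache(maxsize=1024)  # cache results so repeat prompts cost nothing
def answer(prompt: str, min_confidence: float = 0.7) -> str:
    calls["cheap"] += 1
    text, confidence = cheap_model(prompt)
    if confidence >= min_confidence:
        return text
    # Escalate only when the cheap model is not confident enough
    calls["expensive"] += 1
    return expensive_model(prompt)


print(answer("what is sharding"))
print(answer("explain the tradeoffs between fanout on write and fanout on read"))
print(answer("what is sharding"))  # served from cache
print(calls)
```

In an interview, the interesting questions are the ones the stub hides: what the escalation signal is, how you evaluate that it is trustworthy, and how cache keys handle near-duplicate prompts.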
What senior interviewers want
Less:
- Memorized “Twitter architecture.”
- Buzzword density.
- One right answer.
More:
- Tradeoff reasoning (“X is faster but Y is cheaper; we’d pick X if…”).
- Honest capacity math.
- Specifics (“Postgres + pgvector for X reasons”).
- Awareness of what you don’t know (“I’d benchmark this; I’d start here and iterate”).
- Operational concerns (failure modes, rollback, observability).
Common mistakes
1. Skipping clarification
You spend 30 minutes designing the wrong thing.
2. Architecture astronaut
Microservices, Kafka, Kubernetes, service mesh — for a 1k user app. Match scale to need.
3. Premature optimization
“We’ll need 10 shards” before establishing whether 1 DB suffices.
4. No tradeoffs
“This is the best.” Senior interviewers wait for “but” — when does this break?
5. Cargo-cult
“They use Cassandra at Netflix” — applied to a 100k MAU app.
Practice patterns
Common prompts that share patterns:
- Twitter / Instagram — fanout, feed, ranking.
- WhatsApp / Slack — chat, presence, offline delivery.
- Uber / DoorDash — geospatial, matching.
- Stripe — payments, idempotency, webhooks.
- YouTube — video pipeline, CDN, transcoding.
- Dropbox — sync, conflict resolution.
- URL shortener — caching, write throughput.
- Rate limiter — algorithms, distributed state.
Each: which patterns apply? Which break? What scales differently?
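As one worked instance from that list, here is a minimal token-bucket sketch for the rate limiter prompt. It is single-process only; the class shape and the injectable `now` clock are my choices for illustration. The "distributed state" half of the prompt (sharing the bucket across API servers) is deliberately left out.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens/second."""

    def __init__(self, rate: float, capacity: float,
                 now: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._now = now
        self.tokens = float(capacity)  # start full
        self.updated = now()

    def allow(self, cost: float = 1.0) -> bool:
        current = self._now()
        # Refill based on elapsed time, capped at capacity
        elapsed = current - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = current
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage with a fake clock so the behavior is deterministic
clock = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2.0, now=lambda: clock[0])
print(bucket.allow(), bucket.allow(), bucket.allow())  # burst of 2, then denied
clock[0] = 1.0                                         # one second later
print(bucket.allow())                                  # one token refilled
```

The follow-up the interviewer will likely push on is where `tokens` and `updated` live once there are many API servers, and what happens to correctness when that shared store is slow or unavailable.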
See:
- Design a Chat System
- Design a Feed System
- Design a Payment System
- Design a Search System
- Design a Rate Limiter
Communication
- Talk while you draw. Silent thinking confuses interviewers.
- Number the steps. “First, let’s establish capacity; second, the API; third…”
- Park tangents. “I’ll come back to caching after we agree on data model.”
- Ask for direction. “Should I deep-dive sharding or failure modes?”
The interviewer is grading the conversation, not the diagram.
What I’d practice
For a senior+ interview:
- Two whiteboard sessions per week for the month before.
- Build something real in the weeks before — refresh muscle memory.
- Read postmortems: GitHub, Cloudflare, and AWS write great ones.
- Read up on a system before you cite it.
- Mock interview with a senior+ peer.
Read this next
- Design a Chat System 2026
- Design a Feed System 2026
- Distributed Systems Fundamentals
- Design a Rate Limiter 2026
If you want my system-design interview prep checklist, it’s at rajpoot.dev.