System design interviews test how you reason about distributed systems under uncertainty. The framework is well known; what separates seniority levels is how you apply it. This post is the working playbook.

The framework

1. Clarify (5 min)        — what are we building?
2. Estimate (5 min)       — capacity numbers
3. API + data model       — interfaces and storage
4. High-level architecture — boxes and arrows
5. Deep dive               — bottlenecks; mitigation; tradeoffs
6. Scale up                — what breaks at 10× / 100×
7. Wrap                    — recap, where you'd invest next

The whole interview runs 45-60 minutes. Cover the surface in the first 25, then deep-dive what the interviewer cares about.

Clarify

Q: "Design Twitter."
You: "Posting + feed + follow? Public posts only? 1M users or 100M?"

Pin down:

  • Functional: what features?
  • Non-functional: scale, latency, availability, consistency.
  • Out of scope: say explicitly what you are deferring.

Under-clarification leads to designing for the wrong problem.

Capacity estimates

500M MAU. 50% DAU = 250M.
Each posts 0.5/day, reads 50/day.
Posts/sec: 250M * 0.5 / 86400 ≈ 1500/sec.
Reads/sec: 250M * 50 / 86400 ≈ 145k/sec.
Per post: 200 bytes avg.
Storage: 1500 * 200 * 86400 * 365 ≈ 10 TB/year.

Don’t memorize. Reason.

Round generously. Show you understand orders of magnitude. Exact numbers don’t matter; “thousands” vs “millions” vs “billions” matters.
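The arithmetic above can be kept honest with a few lines of code. This is a minimal sketch that reuses the numbers from the estimate; the function name `estimate` and its parameter names are illustrative, not from any library.

```python
SECONDS_PER_DAY = 86_400


def estimate(mau, dau_ratio, posts_per_user_day, reads_per_user_day, bytes_per_post):
    """Back-of-envelope capacity numbers from a handful of assumptions."""
    dau = mau * dau_ratio
    posts_per_sec = dau * posts_per_user_day / SECONDS_PER_DAY
    reads_per_sec = dau * reads_per_user_day / SECONDS_PER_DAY
    # Raw post bodies only: no indexes, replication, or media.
    storage_bytes_per_year = posts_per_sec * bytes_per_post * SECONDS_PER_DAY * 365
    return {
        "dau": dau,
        "posts_per_sec": posts_per_sec,
        "reads_per_sec": reads_per_sec,
        "read_write_ratio": reads_per_sec / posts_per_sec,
        "storage_tb_per_year": storage_bytes_per_year / 1e12,
    }


est = estimate(
    mau=500e6,
    dau_ratio=0.5,
    posts_per_user_day=0.5,
    reads_per_user_day=50,
    bytes_per_post=200,
)
for name, value in est.items():
    print(f"{name}: {value:,.1f}")
```

The unrounded storage figure comes out a little under the 10 TB above because the post rate was rounded up to 1500/sec first; both land in the same order of magnitude, which is the point. The read/write ratio of roughly 100:1 is the number worth saying out loud, since it justifies a read-optimized design.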

API design

Sketch the key endpoints:

POST /tweets             { body }
GET  /users/{id}/feed
POST /follow             { user_id }
GET  /tweets/{id}

Include:

  • Auth model: JWT? Session?
  • Pagination: cursor-based for feeds.
  • Rate limits: who is limited (user, IP, API key) and at what rate.
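To make the pagination bullet concrete, here is a minimal sketch of cursor-based paging over an in-memory list. The cursor format (base64-encoded JSON of `created_at` and `id`) and the helper names are assumptions for illustration; a real service would push the same comparison into an indexed database query.

```python
import base64
import json


def encode_cursor(created_at, tweet_id):
    """Pack the sort key of the last returned item into an opaque token."""
    raw = json.dumps([created_at, tweet_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor):
    created_at, tweet_id = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return created_at, tweet_id


def feed_page(tweets, cursor=None, limit=2):
    """Return (page, next_cursor), newest first.

    tweets: list of dicts with integer 'created_at' and 'id' fields.
    The id breaks ties between tweets that share a timestamp.
    """
    ordered = sorted(tweets, key=lambda t: (t["created_at"], t["id"]), reverse=True)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        # Keep only items strictly older than the last item already returned.
        ordered = [t for t in ordered if (t["created_at"], t["id"]) < boundary]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = (
        encode_cursor(page[-1]["created_at"], page[-1]["id"]) if has_more else None
    )
    return page, next_cursor
```

Unlike offset paging, the cursor stays stable when new tweets arrive at the head of the feed, so a client scrolling down does not see duplicates or skip items.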

Data model

users          (id, handle, name, ...)
tweets         (id, author_id, body, created_at)  -- index on (author_id, created_at)
follows        (follower_id, followed_id)
inboxes        (user_id, tweet_id, created_at)  -- for fanout

Mention indexes. Mention partition keys. Mention which DB technology fits.
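As a rough illustration of the tables above, here is a sketch using Python's built-in `sqlite3` module. The column types, primary keys, and the index name `tweets_by_author` are assumptions; SQLite has no partitioning, so the likely partition key is only noted in a comment.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id     INTEGER PRIMARY KEY,
    handle TEXT UNIQUE NOT NULL,
    name   TEXT
);
CREATE TABLE tweets (
    id         INTEGER PRIMARY KEY,
    author_id  INTEGER NOT NULL REFERENCES users(id),
    body       TEXT NOT NULL,
    created_at INTEGER NOT NULL
);
-- Serves "latest tweets by this author" without scanning the table.
CREATE INDEX tweets_by_author ON tweets (author_id, created_at);
CREATE TABLE follows (
    follower_id INTEGER NOT NULL,
    followed_id INTEGER NOT NULL,
    PRIMARY KEY (follower_id, followed_id)
);
-- Fanout target. In a sharded store, user_id is the natural partition key.
CREATE TABLE inboxes (
    user_id    INTEGER NOT NULL,
    tweet_id   INTEGER NOT NULL,
    created_at INTEGER NOT NULL,
    PRIMARY KEY (user_id, created_at, tweet_id)
);
"""


def create_schema(conn):
    conn.executescript(SCHEMA)


def author_timeline(conn, author_id, limit=20):
    """Newest-first tweets for one author; matches the (author_id, created_at) index."""
    rows = conn.execute(
        "SELECT id, body FROM tweets WHERE author_id = ? "
        "ORDER BY created_at DESC LIMIT ?",
        (author_id, limit),
    )
    return rows.fetchall()
```

The useful habit is to pair each table with the query it has to serve, then pick the index or partition key that makes that query cheap.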

Architecture

[Client] → [LB] → [API tier] ──→ [Postgres (primary)]
                            ──→ [Cache (Redis)]
                            ──→ [Async fanout to inboxes]
                            ──→ [Search (Elastic)]
                            ──→ [Object storage (S3) for media]

Boxes and arrows. Make data-flow obvious. Identify hot paths.

Deep dive

Pick one or two areas the interviewer is interested in. Common deep-dive prompts:

  • Hot keys (celebrity fanout).
  • Read amplification.
  • Cache invalidation.
  • Database sharding strategy.
  • Failure modes.

This is where seniors shine: tradeoffs and reasoning.
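For the celebrity-fanout prompt, one commonly discussed tradeoff is a hybrid: push tweets into follower inboxes for ordinary authors, and pull from high-follower authors at read time. The sketch below is an in-memory toy under that assumption; the class name `FeedService` and the threshold value are hypothetical, and a real cutoff would come from measurement.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # assumed cutoff, chosen only for illustration


class FeedService:
    """Toy hybrid fanout: push for ordinary authors, pull for celebrities."""

    def __init__(self, celebrity_threshold=CELEBRITY_THRESHOLD):
        self.threshold = celebrity_threshold
        self.followers = defaultdict(set)          # author -> users following them
        self.following = defaultdict(set)          # user -> authors they follow
        self.inboxes = defaultdict(list)           # user -> [(created_at, tweet_id)]
        self.tweets_by_author = defaultdict(list)  # author -> [(created_at, tweet_id)]

    def follow(self, follower, followed):
        self.followers[followed].add(follower)
        self.following[follower].add(followed)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.threshold

    def post(self, author, tweet_id, created_at):
        entry = (created_at, tweet_id)
        self.tweets_by_author[author].append(entry)
        if self.is_celebrity(author):
            return  # skip the write storm; readers pull these tweets instead
        for follower in self.followers[author]:
            self.inboxes[follower].append(entry)

    def feed(self, user, limit=10):
        # A set removes duplicates if an author crossed the threshold
        # after some of their tweets had already been pushed.
        entries = set(self.inboxes[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.update(self.tweets_by_author[author])
        return [tweet_id for _, tweet_id in sorted(entries, reverse=True)[:limit]]
```

The tradeoff to narrate: push makes reads cheap but one celebrity post triggers millions of writes; pull keeps writes cheap but adds work to every read. This sketch ignores authors who drop back below the threshold, which is a fair thing to flag as a follow-up.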

Scaling story

Walk through scale increases:

1k users:    single Postgres + app.
100k users:  + cache + read replicas.
1M users:    + sharded cache + CDN.
100M users:  + async fanout + partitioned tables + BFF (backend-for-frontend).

Show you understand how systems evolve. What breaks first; what mitigates it.

AI-era considerations

Modern interviews often include AI components:

  • LLM-powered search / recommendations.
  • Agent workflows.
  • RAG over user content.

Apply standard patterns:

  • Async generation; cache results.
  • Routing (cheap model first).
  • Eval pipelines.
  • Rate limits weighted by the cost of each LLM call.

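Two of the patterns above, caching and cheap-model-first routing, fit in a short sketch. Everything here is hypothetical: the models are plain callables that return an answer plus a confidence score, and no real LLM API is assumed to expose such a score.

```python
from typing import Callable, Dict, Tuple

# Hypothetical model interface: prompt -> (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


class Router:
    """Try the cheap model first; escalate to the strong one on low confidence."""

    def __init__(self, cheap: Model, strong: Model, min_confidence: float = 0.7):
        self.cheap = cheap
        self.strong = strong
        self.min_confidence = min_confidence
        self.cache: Dict[str, str] = {}
        self.calls = {"cheap": 0, "strong": 0}

    def answer(self, prompt: str) -> str:
        if prompt in self.cache:
            return self.cache[prompt]  # cache hit: no model call at all
        text, confidence = self.cheap(prompt)
        self.calls["cheap"] += 1
        if confidence < self.min_confidence:
            text, _ = self.strong(prompt)
            self.calls["strong"] += 1
        self.cache[prompt] = text
        return text


# Stub models standing in for real endpoints.
def cheap_model(prompt: str) -> Tuple[str, float]:
    confidence = 0.9 if len(prompt) < 20 else 0.2
    return f"cheap:{prompt}", confidence


def strong_model(prompt: str) -> Tuple[str, float]:
    return f"strong:{prompt}", 0.95
```

In an interview, the interesting questions sit around this sketch: where the confidence signal comes from, how cache keys handle near-duplicate prompts, and how an eval pipeline checks that routing does not degrade quality.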
See LLM Cost Optimization and LLM Routing.

What senior interviewers want

Less:

  • Memorized “Twitter architecture.”
  • Buzzword density.
  • One right answer.

More:

  • Tradeoff reasoning (“X is faster but Y is cheaper; we’d pick X if…”).
  • Honest capacity math.
  • Specifics (“Postgres + pgvector for X reasons”).
  • Awareness of what you don’t know (“I’d benchmark this; I’d start here and iterate”).
  • Operational concerns (failure modes, rollback, observability).

Common mistakes

1. Skipping clarification

You spend 30 minutes designing the wrong thing.

2. Architecture astronaut

Microservices, Kafka, Kubernetes, service mesh — for a 1k user app. Match scale to need.

3. Premature optimization

“We’ll need 10 shards” before establishing whether 1 DB suffices.

4. No tradeoffs

“This is the best.” Senior interviewers wait for “but” — when does this break?

5. Cargo-cult

“They use Cassandra at Netflix” — applied to a 100k MAU app.

Practice patterns

Common prompts that share patterns:

  • Twitter / Instagram — fanout, feed, ranking.
  • WhatsApp / Slack — chat, presence, offline delivery.
  • Uber / DoorDash — geospatial, matching.
  • Stripe — payments, idempotency, webhooks.
  • YouTube — video pipeline, CDN, transcoding.
  • Dropbox — sync, conflict resolution.
  • URL shortener — caching, write throughput.
  • Rate limiter — algorithms, distributed state.

Each: which patterns apply? Which break? What scales differently?
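As one example of working a prompt down to its core pattern, the rate limiter usually starts from a token bucket. The sketch below is single-process only; the class name and the injectable `now` clock are assumptions made so the behavior is easy to test, and the "distributed state" half of the prompt (shared counters, for instance in Redis) is deliberately left out.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._now = now              # injectable clock, useful in tests
        self.tokens = float(capacity)
        self.updated = now()

    def allow(self, cost=1.0):
        current = self._now()
        elapsed = current - self.updated
        # Refill for the time that has passed, capped at the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = current
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The `cost` argument lets expensive operations consume more than one token, which is one way to weight limits by request cost.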

Communication

  • Talk while you draw. Silent thinking confuses interviewers.
  • Number the steps. “First, let’s establish capacity; second, the API; third…”
  • Park tangents. “I’ll come back to caching after we agree on data model.”
  • Ask for direction. “Should I deep-dive sharding or failure modes?”

The interviewer’s grading the conversation, not the diagram.

What I’d practice

For a senior+ interview:

  1. Two whiteboard sessions a week for the month before the interview.
  2. Build something real in the weeks before — refresh muscle memory.
  3. Read postmortems: GitHub, Cloudflare, and AWS write great ones.
  4. Read up on any system you plan to cite before you cite it.
  5. Mock interview with a senior+ peer.

Read this next

If you want my system-design interview prep checklist, it’s at rajpoot.dev.


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.