A “feed” is one of the deceptively complex pieces of consumer apps. Twitter / Instagram / TikTok have written extensively about theirs; the patterns generalize. This post is the working design.
Phase 1: pull (compute on read)
SELECT * FROM posts
WHERE author_id IN (SELECT followed_id FROM follows WHERE follower_id = $1)
ORDER BY created_at DESC
LIMIT 20;
Simple; fresh. Scales to ~10k followers per user before it gets slow.
Phase 2: push (fanout-on-write)
When a user posts, write to every follower’s inbox:
async def post(author_id, body):
post_id = await db.execute("INSERT INTO posts ...")
followers = await db.fetch("SELECT follower_id FROM follows WHERE followed_id = $1", author_id)
# Write into each follower's inbox
await db.executemany(
"INSERT INTO inbox (user_id, post_id, created_at) VALUES ($1, $2, $3)",
[(f["follower_id"], post_id, now()) for f in followers]
)
Reads are now O(1) — query my inbox.
SELECT p.* FROM inbox i JOIN posts p ON p.id = i.post_id
WHERE i.user_id = $1 ORDER BY i.created_at DESC LIMIT 20;
The celebrity problem
Push fails for high-follower accounts:
- Justin Bieber posts → write 100M inbox rows.
- Each post: 100M writes. Storage explodes. Latency spikes.
Solution: hybrid.
Hybrid fanout
For each user with < 10k followers: push (fanout-on-write).
For each user with >= 10k followers: pull (compute-on-read).
async def post(author_id, body):
post_id = await db.execute("INSERT INTO posts ...")
follower_count = await db.fetchval(
"SELECT count(*) FROM follows WHERE followed_id = $1", author_id
)
if follower_count < 10000:
# Push
await fanout_to_inboxes(author_id, post_id)
# else: pull on read
async def feed(user_id):
# Inbox part (small accounts you follow)
pushed = await db.fetch("SELECT ... FROM inbox WHERE user_id = $1 ORDER BY created_at DESC LIMIT 100", user_id)
# Pull part (celebrities you follow)
celebs = await db.fetch(
"SELECT followed_id FROM follows f JOIN users u ON u.id = f.followed_id "
"WHERE f.follower_id = $1 AND u.follower_count >= 10000", user_id
)
pulled = await db.fetch(
"SELECT * FROM posts WHERE author_id = ANY($1) ORDER BY created_at DESC LIMIT 100",
[c["followed_id"] for c in celebs]
)
# Merge + rank
return rank(pushed + pulled, user_id)[:20]
This is the architecture Twitter described publicly years ago.
Cache the feed
After computing, cache the user’s top-N:
key = f"feed:{user_id}"
if cached := await redis.zrange(key, 0, 20, desc=True):
return cached
feed = await compute()
await redis.zadd(key, {p.id: p.score for p in feed})
await redis.expire(key, 300)
return feed
Refresh on user-active or every N seconds. Most users aren’t watching live; cached feed is fine.
Ranking
Beyond chronological:
def score(post, user):
return (
post.engagement_rate * 0.4 +
recency_decay(post.age) * 0.3 +
affinity(user, post.author) * 0.2 +
diversity_penalty(user, post.author) * 0.1
)
Real ranking in 2026 uses neural recommenders, not hand-tuned weights — but the inputs are similar.
For recommenders you’ll need embedding-based candidate retrieval + a learned scorer.
Freshness
For “live” feel:
- Push for real-time delivery to active users.
- Cache TTL of 5–60s for inactive users.
- Background refresh on user activity.
Don’t recompute the feed on every page load — cache aggressively.
Storage
For huge inbox tables (push fanout): partition by user_id ranges. Old entries fall out via retention.
For pull-style: posts table + author index. Smaller; fewer hot rows.
Relevance feedback
async def on_user_action(user_id, post_id, action):
# action: 'view', 'like', 'comment', 'share', 'hide'
await db.execute("INSERT INTO interactions (user_id, post_id, action) VALUES ($1, $2, $3)",
user_id, post_id, action)
Trains the ranker. Required for any non-chronological feed.
Spam / quality
Bad actor’s posts shouldn’t dominate feeds:
- Reputation score per author.
- Demote / suppress flagged content.
- Diversity in top-N (don’t show 5 posts from same author back-to-back).
- Edge case: viral spam → human review queue.
Filters
User-controlled:
- Mute author / topic / keyword.
- Hide post.
- “Less of this” signal.
Apply filters during read or push (cheaper at write time for stable preferences).
Multi-region
Feeds want to be near the user. Cache per region; replicate posts globally.
For globally-trending content: cache invalidation gets harder. Eventually consistent is acceptable for feeds.
Capacity sketch
Twitter-scale numbers (publicly disclosed years ago, scaled for 2026):
- ~500M posts/day → ~6k posts/sec average; 20k peak.
- ~5B reads/day → ~60k reads/sec average; 500k peak.
- Inbox tables: 100s of TB.
For a 1M MAU app: tiny by comparison; Postgres + Redis handles it.
Common mistakes
1. Single approach (push or pull) at scale
Push: celebrities break it. Pull: hot users overwhelm DB.
2. No cache
Recompute the feed every load. CPU and DB hammered.
3. Synchronous fanout
Posting takes 30s while we write 1M inbox rows. Fanout async; respond immediately.
4. No ranking infrastructure
Chronological-only as the product grows; engagement plateaus. Build ranking infra early enough to evolve.
5. Ignoring spam
Spam takes over feeds; users leave. Spam quality is part of the system.
What I’d ship today
For a new social app:
- Pull initially — simplest.
- Push for low-follower users as you grow.
- Hybrid when celebrities exist.
- Redis-cached feeds with short TTLs.
- Ranking signals from day one (even if just weighted heuristics).
- Spam scoring as a layer.
- Per-user filters / mutes.
Read this next
- Caching Strategies 2026
- Distributed Counter Design
- Search System Design 2026
- Notification System Design 2026
If you want my hybrid feed reference (Postgres + Redis + ranker), it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .