Chat systems trick teams into thinking they’re simple. Then a million-member group, offline delivery, read receipts, end-to-end encryption, and global users meet you. This post is the working design at every scale.
Phase 1: simple chat
CREATE TABLE messages (
id bigserial PRIMARY KEY,
conversation_id bigint NOT NULL,
sender_id bigint NOT NULL,
body text NOT NULL,
created_at timestamptz DEFAULT now()
);
CREATE INDEX ON messages (conversation_id, created_at);
CREATE TABLE conversation_members (
conversation_id bigint,
user_id bigint,
last_read_at timestamptz,
PRIMARY KEY (conversation_id, user_id)
);
Send: insert. Fetch: query by conversation_id. WebSocket pushes new messages. Up to ~100k MAU on a single Postgres.
WebSocket fanout
# Each user connects via WS
connections: dict[int, set[WebSocket]] = defaultdict(set)
async def on_connect(user_id, ws):
connections[user_id].add(ws)
async def send_message(conv_id, msg):
# Persist
await db.execute(...)
# Fanout
members = await db.fetch("SELECT user_id FROM conversation_members WHERE conversation_id = $1", conv_id)
for m in members:
for ws in connections[m["user_id"]]:
await ws.send_json({"type": "msg", "msg": msg})
Single-process. For multi-process: pub/sub (Redis) between processes.
Multi-process fanout
[Client A] ──WS──▶ [App-1]
[Client B] ──WS──▶ [App-2]
│
Both subscribe to Redis pub/sub
# On send
await redis.publish(f"conv:{conv_id}", json.dumps(msg))
# On WS server
async for message in redis.subscribe(f"conv:{conv_id}"):
for ws in conv_connections[conv_id]:
await ws.send_text(message)
Now any app process delivers messages to its locally-connected clients. See Django Channels and FastAPI Streaming .
Phase 2: scale the message store
-- Partition by month
CREATE TABLE messages (...) PARTITION BY RANGE (created_at);
CREATE TABLE messages_2026_05 PARTITION OF messages FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');
See Postgres Partitioning . Drops old partitions; keeps hot partitions tiny.
For really high write rates: ScyllaDB / Cassandra. Discord migrated to ScyllaDB at billion-message scale; before that, MongoDB and Cassandra. Most apps don’t reach this.
Conversation list
SELECT
c.id,
c.last_message_at,
(SELECT body FROM messages WHERE conversation_id = c.id ORDER BY created_at DESC LIMIT 1) AS preview,
cm.last_read_at
FROM conversations c
JOIN conversation_members cm ON cm.conversation_id = c.id AND cm.user_id = $1
ORDER BY c.last_message_at DESC
LIMIT 50;
Common N+1 trap. Either denormalize last_message_id and last_message_at on the conversation row, or use a separate “conversation_summary” view that’s kept up to date.
Read receipts / unread counts
-- Simple
SELECT COUNT(*) FROM messages
WHERE conversation_id = $1 AND created_at > $2;
For high-scale: maintain a counter per (user, conversation) updated on each new message, decremented on mark-read.
Presence
async def on_connect(user_id):
await redis.set(f"presence:{user_id}", "online", ex=60)
asyncio.create_task(heartbeat(user_id))
async def heartbeat(user_id):
while connected:
await redis.expire(f"presence:{user_id}", 60)
await asyncio.sleep(30)
async def on_disconnect(user_id):
await redis.delete(f"presence:{user_id}")
Auto-expires if user crashes. For status to friends: use Redis pub/sub on presence:user:* events.
Offline delivery
CREATE TABLE pending_deliveries (
user_id bigint,
message_id bigint,
created_at timestamptz DEFAULT now(),
PRIMARY KEY (user_id, message_id)
);
If user is offline when message sent: insert to pending_deliveries. On connect: deliver pending. On ack: delete.
Or more simply: client tracks last-seen-message-id; on connect, fetch all messages > that id.
Push notifications
When a user is offline (no WS, no presence) and gets a message: push via APNs / FCM.
async def deliver(user_id, msg):
if user_id in connections:
for ws in connections[user_id]:
await ws.send_json(msg)
else:
await push_notification(user_id, msg)
See Notification System .
Group messages
For groups of N members, fanout is N writes to deliver-states. For a 100k-member group, this is significant.
Options:
- Pull model: clients fetch new messages on connect. Cheap to write; requires polling/connect.
- Push model with capped fanout: push to active users; offline users see messages on connect.
- Topic-based (Discord): clients subscribe to channels; servers fanout per channel.
End-to-end encryption (E2EE)
For Signal-class privacy:
- Client-side keys (X3DH, double ratchet).
- Server stores ciphertext only.
- Search and previews become impossible (or local-only).
Heavy lift. For most chat apps: TLS in transit + at-rest encryption is enough. E2EE only when privacy is the product.
Search
For “find that message about X”:
- Postgres FTS for moderate scale.
- Elasticsearch / Meilisearch for big.
- Per-user search (each user’s index) for E2EE.
See Search System Design .
Common mistakes
1. WebSocket alone, no offline delivery
User goes offline; misses messages; they never appear when they reconnect. Always have a persistent message log + sync on connect.
2. Single-process WS
Three replicas; messages from app-1’s WS users don’t reach app-2’s WS users. Use pub/sub.
3. No pagination on history
SELECT * FROM messages WHERE conversation_id = X returns 1M messages. Use cursor pagination.
4. Polling everything
CPU + DB load explodes. WebSocket for active; HTTP for sync.
5. Sync push notifications
Send-API → push API in foreground. Push API timeout → message slow. Always async.
What I’d ship today
For a new chat app (≤1M MAU):
- Postgres for messages with monthly partitions.
- WebSocket + Redis pub/sub for fanout.
- Presence in Redis.
- Push notifications for offline.
- Cursor pagination for history.
- Postgres FTS for search initially.
- TLS in transit + at-rest encryption; not E2EE unless required.
- Last-message denormalization on conversation row.
Read this next
- Django Channels & WebSockets 2026
- FastAPI Streaming 2026
- Notification System Design 2026
- Postgres Partitioning 2026
If you want my chat reference architecture (FastAPI + Postgres + Redis), it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .