Collaborative editing combines real-time networking, conflict resolution, and persistence — three hard problems at once. This post is the working design.
The core problem
Two users edit the same document simultaneously. The system must:
- Show each user’s changes to the other in <100ms.
- Resolve conflicts automatically (no “your version vs theirs” prompts).
- Converge every client to the same state eventually.
- Persist durably.
CRDTs (Conflict-free Replicated Data Types)
The modern default. CRDTs are data structures that converge to the same state given the same set of operations, regardless of order.
For text: Yjs (production-ready) or Automerge (newer, also strong).
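import * as Y from "yjs";
// Local replica with one shared text type named "body"
const doc = new Y.Doc();
const text = doc.getText("body");
text.insert(0, "Hello world");
// A second replica stands in for another client (peer or server)
const remote = new Y.Doc();
remote.getText("body").insert(0, "Hi! ");
const updateBytes = Y.encodeStateAsUpdate(remote);
// Apply changes from the other client; both inserts are kept
Y.applyUpdate(doc, updateBytes);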
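Sequence CRDTs give every inserted character (or operation) a unique identifier, and new content is positioned relative to those identifiers rather than to numeric offsets. Insertions don’t conflict because each one has a unique, deterministic place in the sequence. The data type itself does the conflict resolution.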
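To make the "same operations, any order" property concrete, here is a toy sketch. It is not how Yjs or Automerge work internally; the `ToyText` class, the `pos` sort key, and the `"client:counter"` ID format are all assumptions made up for illustration. Each character carries a unique ID, deletes are recorded as tombstones, and rendering sorts by position with the ID as a tie-breaker.

```typescript
// Toy sequence CRDT: an illustration of unique IDs, NOT Yjs's algorithm.
type Id = string; // hypothetical format "<client>:<counter>", unique per character

type Op =
  | { kind: "insert"; id: Id; pos: number; ch: string }
  | { kind: "delete"; id: Id };

class ToyText {
  private chars = new Map<Id, { pos: number; ch: string }>();
  private tombstones = new Set<Id>();

  // Both sets only ever grow, so the order of application cannot matter.
  apply(op: Op): void {
    if (op.kind === "insert") {
      this.chars.set(op.id, { pos: op.pos, ch: op.ch });
    } else {
      this.tombstones.add(op.id);
    }
  }

  // Render live characters ordered by position, then by ID to break ties.
  toString(): string {
    return Array.from(this.chars.entries())
      .filter(([id]) => !this.tombstones.has(id))
      .sort(([idA, a], [idB, b]) => a.pos - b.pos || (idA < idB ? -1 : idA > idB ? 1 : 0))
      .map(([, c]) => c.ch)
      .join("");
  }
}

const ops: Op[] = [
  { kind: "insert", id: "a:1", pos: 1, ch: "H" },
  { kind: "insert", id: "a:2", pos: 2, ch: "i" },
  { kind: "insert", id: "b:1", pos: 2, ch: "!" }, // concurrent insert at the same position
  { kind: "delete", id: "a:2" },
];

const replica1 = new ToyText();
const replica2 = new ToyText();
ops.forEach((op) => replica1.apply(op));
ops.slice().reverse().forEach((op) => replica2.apply(op)); // reversed delivery order

console.log(replica1.toString() === replica2.toString()); // true
```

The second replica even sees the delete before the insert it refers to, and still ends up with the same text, because the tombstone is remembered independently of the character.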
OT (Operational Transformation)
The classic approach, used by Google Docs. A central server transforms each incoming operation against the operations it has already applied.
Client A inserts "X" at position 5.
Client B inserts "Y" at position 5.
Server applies A first; then transforms B's op to "insert Y at position 6".
Both clients converge.
OT is powerful but complex, and it requires a central authority to order operations. Many newer apps choose CRDTs instead.
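The insert-versus-insert case above can be sketched in a few lines. This is a minimal illustration, not a full OT implementation: it only handles inserts, and the `InsertOp` shape, the `transformInsert` name, and the rule of breaking same-position ties by `clientId` are assumptions chosen for the example.

```typescript
// Hypothetical op shape for this sketch; real OT systems also handle deletes.
type InsertOp = { pos: number; text: string; clientId: string };

// Rewrite `op` so it can be applied after `applied` has already been applied.
function transformInsert(op: InsertOp, applied: InsertOp): InsertOp {
  const appliedGoesFirst =
    applied.pos < op.pos ||
    (applied.pos === op.pos && applied.clientId < op.clientId); // tie-break
  return appliedGoesFirst ? { ...op, pos: op.pos + applied.text.length } : op;
}

function applyInsert(doc: string, op: InsertOp): string {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

const base = "0123456789";
const opA: InsertOp = { pos: 5, text: "X", clientId: "A" };
const opB: InsertOp = { pos: 5, text: "Y", clientId: "B" };

// Server order: A first, then B transformed against A (B moves to position 6).
const serverDoc = applyInsert(applyInsert(base, opA), transformInsert(opB, opA));

// Client B already applied its own op, then receives A transformed against B.
const clientBDoc = applyInsert(applyInsert(base, opB), transformInsert(opA, opB));

console.log(serverDoc, clientBDoc);
```

Both paths produce the same string because the tie-break is applied consistently on every side, which is the property the central server exists to guarantee.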
Persistence
Operations → append to log → snapshot every 1000 ops → archive
CREATE TABLE doc_ops (
    doc_id     UUID,
    op_id      BIGINT,
    op_data    BYTEA,        -- serialized CRDT update
    user_id    BIGINT,
    created_at TIMESTAMPTZ,
    PRIMARY KEY (doc_id, op_id)
);
CREATE TABLE doc_snapshots (
    doc_id     UUID,
    op_id      BIGINT,       -- snapshot up to this op
    state      BYTEA,        -- serialized CRDT state
    created_at TIMESTAMPTZ,
    PRIMARY KEY (doc_id, op_id)
);
Loading a document: read the latest snapshot, then replay every operation whose op_id is greater than the snapshot’s op_id.
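The load rule can be sketched as a pure function over rows from the two tables. This is a minimal sketch with in-memory arrays standing in for query results; `selectForLoad`, `snapshotDue`, and the row interfaces are hypothetical names, and the 1000-op interval is taken from the pipeline above.

```typescript
// Row shapes mirroring doc_snapshots and doc_ops (names are assumptions).
interface SnapshotRow { docId: string; opId: number; state: Uint8Array; }
interface OpRow { docId: string; opId: number; opData: Uint8Array; }

const SNAPSHOT_EVERY = 1000; // snapshot interval, in ops

// Pick the newest snapshot for the doc plus the ops that came after it.
function selectForLoad(docId: string, snapshots: SnapshotRow[], ops: OpRow[]) {
  const latest = snapshots
    .filter((s) => s.docId === docId)
    .reduce<SnapshotRow | null>(
      (best, s) => (best === null || s.opId > best.opId ? s : best),
      null,
    );
  // With no snapshot yet, replay the whole log.
  const after = latest === null ? Number.NEGATIVE_INFINITY : latest.opId;
  const replay = ops
    .filter((o) => o.docId === docId && o.opId > after)
    .sort((x, y) => x.opId - y.opId);
  return { snapshot: latest, replay };
}

// True when writing this op should also trigger a new snapshot.
function snapshotDue(opId: number): boolean {
  return opId % SNAPSHOT_EVERY === 0;
}

const empty = new Uint8Array(0);
const snapshotRows: SnapshotRow[] = [
  { docId: "d1", opId: 1000, state: empty },
  { docId: "d1", opId: 2000, state: empty },
];
const opRows: OpRow[] = [1999, 2003, 2001, 2000, 2002].map((opId) => ({
  docId: "d1",
  opId,
  opData: empty,
}));

const plan = selectForLoad("d1", snapshotRows, opRows);
console.log(plan.snapshot?.opId, plan.replay.map((o) => o.opId));
```

In SQL, the same selection would be one query for the row with the highest op_id in doc_snapshots and one range scan over doc_ops, both served by the (doc_id, op_id) primary keys.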
Real-time delivery
Each document has a server-side coordinator that:
- Holds active connections (WebSockets).
- Receives ops from clients.
- Applies to in-memory state.
- Broadcasts to other connected clients.
- Persists to log.
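The coordinator’s steps can be sketched without any real networking. In this minimal sketch, `Connection`, `DocCoordinator`, and the array-backed `state` and `log` are hypothetical stand-ins: a real version would hold WebSocket objects, a CRDT document, and a database writer.

```typescript
// Hypothetical connection interface; a real one would wrap a WebSocket.
interface Connection {
  id: string;
  send(update: Uint8Array): void;
}

class DocCoordinator {
  private connections = new Map<string, Connection>();
  private state: Uint8Array[] = []; // stand-in for the in-memory CRDT state
  private log: Uint8Array[] = [];   // stand-in for the durable op log

  join(conn: Connection): void {
    this.connections.set(conn.id, conn);
  }

  leave(id: string): void {
    this.connections.delete(id);
  }

  // Receive an op, apply it, broadcast it to everyone else, then persist it.
  receive(fromId: string, update: Uint8Array): void {
    this.state.push(update);
    this.connections.forEach((conn, id) => {
      if (id !== fromId) conn.send(update);
    });
    this.log.push(update);
  }

  logLength(): number {
    return this.log.length;
  }
}

const received: Record<string, number> = { a: 0, b: 0 };
const coordinator = new DocCoordinator();
coordinator.join({ id: "a", send: () => { received.a += 1; } });
coordinator.join({ id: "b", send: () => { received.b += 1; } });
coordinator.receive("a", new Uint8Array([1, 2, 3]));

console.log(received, coordinator.logLength());
```

The sketch follows the order of the list above. Whether to persist before or after broadcasting is a design choice: persisting first trades a little latency for never showing peers an edit that could be lost.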
On Cloudflare, one Durable Object per doc maps cleanly onto this model. For self-hosted deployments, run a WebSocket fleet sharded by doc_id.
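For the self-hosted case, sharding by doc_id means every connection for a given document must land on the same server. A minimal sketch, assuming a fixed shard count and a simple FNV-1a hash (`fnv1a` and `shardFor` are hypothetical helper names):

```typescript
// 32-bit FNV-1a over the string's character codes.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Map a document ID to one of `shardCount` WebSocket servers.
function shardFor(docId: string, shardCount: number): number {
  return fnv1a(docId) % shardCount;
}

const exampleDocId = "3f2b8c1e-9d4a-4c7b-8e2f-1a6b5c4d3e2f"; // made-up UUID
const shard = shardFor(exampleDocId, 16);
console.log(shard);
```

Plain modulo hashing remaps most documents when the shard count changes. If the fleet is resized often, consistent hashing or a lookup table from doc_id to server avoids that reshuffle.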
Presence and cursors
Beyond the document state, share ephemeral state:
- Who’s currently online in this doc.
- Each user’s cursor position.
- Each user’s selection range.
Presence is not persisted; it is only broadcast in real time. Awareness protocols, such as Yjs’s awareness API, handle this.
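Because presence is ephemeral, the main logic is "remember the last state each client sent, and forget clients that go quiet". The sketch below is a hand-rolled illustration, not the Yjs awareness API; `PresenceTracker`, the `Presence` shape, and the 30-second timeout are assumptions.

```typescript
interface Presence {
  name: string;
  cursor: number | null;
  selection: { anchor: number; head: number } | null;
}

class PresenceTracker {
  private entries = new Map<string, { presence: Presence; lastSeen: number }>();
  private timeoutMs: number;

  constructor(timeoutMs: number = 30000) {
    this.timeoutMs = timeoutMs;
  }

  // Record the latest presence state and when it arrived (ms timestamp).
  update(clientId: string, presence: Presence, now: number): void {
    this.entries.set(clientId, { presence, lastSeen: now });
  }

  // Drop clients that have been silent for longer than the timeout.
  prune(now: number): string[] {
    const removed: string[] = [];
    this.entries.forEach((entry, clientId) => {
      if (now - entry.lastSeen > this.timeoutMs) removed.push(clientId);
    });
    removed.forEach((clientId) => this.entries.delete(clientId));
    return removed;
  }

  online(): string[] {
    return Array.from(this.entries.keys()).sort();
  }
}

const tracker = new PresenceTracker(30000);
tracker.update("alice", { name: "Alice", cursor: 12, selection: null }, 0);
tracker.update("bob", { name: "Bob", cursor: 3, selection: { anchor: 3, head: 9 } }, 20000);
const dropped = tracker.prune(40000); // alice has been silent for 40 s

console.log(dropped, tracker.online());
```

Nothing here touches the op log or snapshots: if the coordinator restarts, presence is simply rebuilt from the next round of client updates.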
Operational considerations
- Offline edits: client logs ops locally; syncs when online. CRDTs handle late-arriving ops correctly.
- Garbage collection: tombstones (markers left behind by deleted content) accumulate. GC them periodically so new clients receive a compact state.
- Document forks: a fork is a copy of the CRDT state that evolves independently, which is the basis for branching workflows.
- Permissions: enforced server-side; the server validates each op against the ACL before applying or broadcasting it.
- Rate limiting: cap ops per second per user to prevent abuse.
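For the rate-limiting item, a token bucket is one common way to cap ops per second while still allowing short bursts. This is a minimal sketch of that technique, not something the design above prescribes; the `TokenBucket` class and the chosen capacity and refill values are assumptions.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private capacity: number;
  private refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, now: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full so a new user can burst
    this.lastRefill = now;
  }

  // Returns true if the op is allowed; `now` is a millisecond timestamp.
  tryTake(now: number): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per user: bursts of 2 ops, refilled at 1 op per second.
const bucket = new TokenBucket(2, 1, 0);
const results = [
  bucket.tryTake(0),    // allowed
  bucket.tryTake(0),    // allowed
  bucket.tryTake(0),    // rejected: bucket is empty
  bucket.tryTake(1000), // allowed: one token refilled after 1 s
];

console.log(results);
```

The coordinator would keep one bucket per connected user and drop or delay ops when `tryTake` returns false.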
Capacity
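For 100M docs at 10 ops/sec per active doc:
- Active docs at any time: ~500k (about 0.5% of all docs).
- Ops/sec: ~5M peak (500k active docs × 10 ops/sec).
- Per-doc state: a few KB to a few MB.
- Storage: op log growth (ops/sec × op size × retention time) plus snapshots.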
WebSocket fleet for real-time; Postgres + S3 for persistence; CDN for static assets.
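The arithmetic behind these numbers is worth writing down so the inputs can be changed. The document counts and op rate come from the estimate above; the 50-byte average op size is purely an assumption for illustration, and the daily figure is an upper bound that assumes peak load all day.

```typescript
const totalDocs = 100000000;        // 100M docs
const activeFraction = 0.005;       // ~0.5% active at once, implied by ~500k active docs
const opsPerSecPerActiveDoc = 10;
const assumedBytesPerOp = 50;       // ASSUMPTION: average serialized op size
const secondsPerDay = 86400;

const activeDocs = Math.round(totalDocs * activeFraction);
const peakOpsPerSec = activeDocs * opsPerSecPerActiveDoc;

// Upper bound: op log growth if peak load were sustained for a full day.
const logBytesPerDayAtPeak = peakOpsPerSec * assumedBytesPerOp * secondsPerDay;
const logTerabytesPerDayAtPeak = logBytesPerDayAtPeak / 1e12;

console.log({ activeDocs, peakOpsPerSec, logTerabytesPerDayAtPeak });
```

Under these assumptions the raw op log grows by up to roughly 21.6 TB per day, which is why snapshotting and archiving old ops matter at this scale.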
What ships today
For a small product:
- Yjs as the CRDT.
- WebSocket server (or Durable Objects) for the coordinator.
- Postgres for op log + snapshots.
- S3 for old snapshots / exports.
Several startups have shipped Yjs-based collaborative apps to thousands of users on this stack alone.
Read this next
- Design WhatsApp / Chat — the WebSocket fleet shape carries over.
- Cloudflare Workers + D1 + Durable Objects — Durable Objects per doc.
- SSE vs WebSockets in 2026
- Distributed Systems Fundamentals
If you want a small Yjs + Durable Objects collaborative editor reference, it’s at rajpoot.dev.