Collaborative editing combines real-time networking, conflict resolution, and persistence — three hard problems at once. This post is the working design.

The core problem

Two users edit the same document simultaneously. The system must:

  1. Show each user’s changes to the other in under 100 ms.
  2. Resolve conflicts automatically (no “your version vs theirs” prompts).
  3. Maintain a consistent state across all clients eventually.
  4. Persist durably.

CRDTs (Conflict-free Replicated Data Types)

The modern default. CRDTs are data structures that converge to the same state given the same set of operations, regardless of order.

For text: Yjs (production-ready) or Automerge (newer, also strong).

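import * as Y from "yjs";

const doc = new Y.Doc();
const text = doc.getText("body");

text.insert(0, "Hello world");

// Apply changes from another client (peer or server).
// Here a second in-memory doc stands in for the remote peer.
const remote = new Y.Doc();
remote.getText("body").insert(0, "Hi! ");
const updateBytes = Y.encodeStateAsUpdate(remote); // Uint8Array

Y.applyUpdate(doc, updateBytes);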

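Sequence CRDTs give every inserted item a unique identifier (in Yjs, a client ID plus a per-client clock). Insertions don’t conflict because each item is placed relative to those identifiers rather than to a numeric offset that other edits can shift. The data type itself does the conflict resolution.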
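
To make the "uniquely placed" idea concrete, here is a toy sketch. It is not how Yjs works internally: the fractional `pos` key and the `client` tie-breaker are assumptions made up for illustration. The point is only that when every character carries a unique sortable key, applying the same inserts in any order gives the same text.

```typescript
// Toy sequence CRDT: every character carries a unique, sortable key.
// Hypothetical scheme for illustration only; Yjs uses a different algorithm.
interface Char {
  pos: number;    // fractional position between neighbours (assumed scheme)
  client: number; // tie-breaker when two clients pick the same pos
  ch: string;
}

// Total order over characters: position first, then client ID.
function compareChars(a: Char, b: Char): number {
  return a.pos - b.pos || a.client - b.client;
}

// Applying ops is "add to the set, then sort", so arrival order is irrelevant.
function render(ops: Char[]): string {
  return [...ops].sort(compareChars).map((c) => c.ch).join("");
}

// Shared starting state: "ab".
const base: Char[] = [
  { pos: 1, client: 0, ch: "a" },
  { pos: 2, client: 0, ch: "b" },
];

// Two clients concurrently insert between "a" and "b".
const fromA: Char = { pos: 1.5, client: 1, ch: "X" };
const fromB: Char = { pos: 1.5, client: 2, ch: "Y" };

const replica1 = render([...base, fromA, fromB]); // A's op arrives first
const replica2 = render([...base, fromB, fromA]); // B's op arrives first
```

Both replicas render the same string. A real implementation also has to handle deletes, key exhaustion, and a single client reusing a position, which this sketch ignores.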

OT (Operational Transformation)

Google Docs’s classic approach. A central server transforms incoming operations relative to operations it has seen.

Client A inserts "X" at position 5.
Client B inserts "Y" at position 5.
Server applies A first; then transforms B's op to "insert Y at position 6".
Both clients converge.
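
The transform step above can be sketched for the insert-versus-insert case. This is a minimal illustration, not Google Docs’ implementation: the `InsertOp` shape and the rule of breaking position ties by `clientId` are assumptions chosen so that A wins the tie, as in the example.

```typescript
interface InsertOp {
  pos: number;
  text: string;
  clientId: string; // assumed tie-breaker for equal positions
}

// Rewrite `op` so it can be applied after `applied` has already been applied.
function transformInsert(op: InsertOp, applied: InsertOp): InsertOp {
  const appliedGoesFirst =
    applied.pos < op.pos ||
    (applied.pos === op.pos && applied.clientId < op.clientId);
  // If the other insert lands at or before our position, shift right.
  return appliedGoesFirst ? { ...op, pos: op.pos + applied.text.length } : op;
}

function applyInsert(doc: string, op: InsertOp): string {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

const start = "0123456789";
const opA: InsertOp = { pos: 5, text: "X", clientId: "A" };
const opB: InsertOp = { pos: 5, text: "Y", clientId: "B" };

// Server: applies A, then B transformed against A (B moves to position 6).
const bPrime = transformInsert(opB, opA);
const serverDoc = applyInsert(applyInsert(start, opA), bPrime);

// Client B: already applied its own op, then receives A transformed against B.
const aPrime = transformInsert(opA, opB);
const clientBDoc = applyInsert(applyInsert(start, opB), aPrime);
```

Both paths end at "01234XY56789". Deletes and overlapping ranges need their own transform cases, which is where most of OT’s complexity lives.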

OT is powerful but complex, and in practice it needs a central authority to decide the order of operations. Most new apps reach for a CRDT instead.

Persistence

Operations → append to log → snapshot every 1000 ops → archive

CREATE TABLE doc_ops (
    doc_id UUID,
    op_id BIGINT,
    op_data BYTEA,                -- serialized CRDT update
    user_id BIGINT,
    created_at TIMESTAMPTZ,
    PRIMARY KEY (doc_id, op_id)
);

CREATE TABLE doc_snapshots (
    doc_id UUID,
    op_id BIGINT,                 -- snapshot up to this op
    state BYTEA,                  -- serialized CRDT state
    created_at TIMESTAMPTZ,
    PRIMARY KEY (doc_id, op_id)
);

Loading a document: fetch the latest snapshot, then replay every operation with a higher op_id on top of it.
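
A rough sketch of that load path, kept independent of any particular CRDT. The `Snapshot` and `Op` shapes, the `apply` callback, and the assumption that op IDs start at 1 are all mine; in a Yjs setup `apply` would wrap something like `Y.applyUpdate`.

```typescript
interface Snapshot<S> { opId: number; state: S; }
interface Op<U> { opId: number; update: U; }

const SNAPSHOT_EVERY = 1000; // matches "snapshot every 1000 ops"

// True when the op just written should trigger a new snapshot.
function shouldSnapshot(opId: number): boolean {
  return opId % SNAPSHOT_EVERY === 0;
}

// Latest snapshot + operations since it.
function loadDocument<S, U>(
  snapshots: Snapshot<S>[],
  ops: Op<U>[],
  empty: S,
  apply: (state: S, update: U) => S,
): S {
  // Fall back to an empty document "at op 0" when no snapshot exists yet.
  const initial: Snapshot<S> = { opId: 0, state: empty };
  const latest = snapshots.reduce<Snapshot<S>>(
    (best, s) => (s.opId > best.opId ? s : best),
    initial,
  );
  return ops
    .filter((op) => op.opId > latest.opId)
    .sort((x, y) => x.opId - y.opId)
    .reduce<S>((state, op) => apply(state, op.update), latest.state);
}

// Usage with a stand-in "CRDT": state is a string, updates append to it.
const loaded = loadDocument<string, string>(
  [{ opId: 2, state: "ab" }],
  [
    { opId: 1, update: "a" },
    { opId: 2, update: "b" },
    { opId: 3, update: "c" },
    { opId: 4, update: "d" },
  ],
  "",
  (state, update) => state + update,
);
```

In SQL terms this is one query for the newest row in doc_snapshots and one range scan on doc_ops where op_id is greater than the snapshot’s op_id.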

Real-time delivery

Each document has a server-side coordinator that:

  • Holds active connections (WebSockets).
  • Receives ops from clients.
  • Applies to in-memory state.
  • Broadcasts to other connected clients.
  • Persists to log.
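
Those five responsibilities fit in a small class. This is a transport-agnostic sketch: `Connection`, `DocCoordinator`, and the in-memory `log` array are hypothetical stand-ins for a real WebSocket and the doc_ops table.

```typescript
// Minimal view of a client connection; a real one would wrap a WebSocket.
interface Connection {
  id: string;
  send(update: Uint8Array): void;
}

class DocCoordinator {
  private connections = new Map<string, Connection>();
  private log: Uint8Array[] = []; // stand-in for the durable op log
  private applyToState: (update: Uint8Array) => void;

  constructor(applyToState: (update: Uint8Array) => void) {
    this.applyToState = applyToState;
  }

  // Holds active connections.
  join(conn: Connection): void {
    this.connections.set(conn.id, conn);
  }

  leave(connId: string): void {
    this.connections.delete(connId);
  }

  // Receives an op from one client and fans it out.
  receive(fromId: string, update: Uint8Array): void {
    this.applyToState(update); // apply to in-memory state
    this.connections.forEach((conn, id) => {
      if (id !== fromId) conn.send(update); // broadcast to the others
    });
    this.log.push(update); // persist to log
  }

  get logLength(): number {
    return this.log.length;
  }
}
```

The order inside `receive` follows the list above. A production version would likely make the log write asynchronous and decide explicitly whether to broadcast before or after the write is acknowledged.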

On Cloudflare Durable Objects, one Durable Object per doc maps naturally onto this coordinator. For self-hosted deployments, run a WebSocket fleet sharded by doc_id.
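
For the self-hosted case, the simplest way to shard by doc_id is to hash the ID and take it modulo the number of servers. The sketch below uses a 32-bit FNV-1a hash; `shardFor` and the fixed `shardCount` are assumptions for illustration.

```typescript
// 32-bit FNV-1a over the string's UTF-16 code units.
// For ASCII IDs such as UUIDs this matches hashing the bytes.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit int
  }
  return hash >>> 0;
}

// Every connection for a given doc lands on the same shard.
function shardFor(docId: string, shardCount: number): number {
  return fnv1a(docId) % shardCount;
}

const shard = shardFor("3f2b8c1e-0d4a-4c57-9a51-6b1f0e7d2c90", 16);
```

Plain modulo reshuffles most documents whenever the shard count changes, so a fleet that resizes often may prefer consistent hashing or an explicit doc-to-server lookup.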

Presence and cursors

Beyond the document state, share ephemeral state:

  • Who’s currently online in this doc.
  • Each user’s cursor position.
  • Each user’s selection range.

Presence is not persisted; it is only broadcast in real time and dropped when a user disconnects. Awareness protocols that ship alongside CRDT libraries (Yjs’s awareness API, for example) handle this.
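
If you are not using an off-the-shelf awareness protocol, the server-side bookkeeping is small. The following is a hand-rolled sketch, not the Yjs awareness API: `PresenceTracker`, its field names, and the timeout-based expiry are assumptions.

```typescript
interface Presence {
  userId: string;
  cursor: number;                      // cursor position
  selection: [number, number] | null;  // selection range, if any
  lastSeen: number;                    // ms timestamp of the last update
}

// In-memory only: nothing here is written to the op log or snapshots.
class PresenceTracker {
  private states = new Map<string, Presence>();
  private timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  update(
    userId: string,
    cursor: number,
    selection: [number, number] | null,
    now: number,
  ): void {
    this.states.set(userId, { userId, cursor, selection, lastSeen: now });
  }

  // Returns who is online, dropping anyone not heard from within the timeout.
  online(now: number): Presence[] {
    const live: Presence[] = [];
    this.states.forEach((p, id) => {
      if (now - p.lastSeen > this.timeoutMs) this.states.delete(id);
      else live.push(p);
    });
    return live;
  }
}

const presence = new PresenceTracker(30_000);
presence.update("alice", 12, null, 0);
presence.update("bob", 40, [40, 52], 20_000);
const onlineAt25s = presence.online(25_000).map((p) => p.userId);
const onlineAt45s = presence.online(45_000).map((p) => p.userId);
```

Timestamps are passed in rather than read from the clock so the expiry logic is easy to test. Alice drops out of the second result because her last update is older than the timeout.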

Operational considerations

  • Offline edits: client logs ops locally; syncs when online. CRDTs handle late-arriving ops correctly.
  • Garbage collection: tombstones (markers left behind for deleted content) accumulate. GC them periodically so new clients receive a compact state.
  • Document forks: a fork is a copy of the CRDT state, so branching workflows can be layered on top and merged back later.
  • Permissions: enforced server-side; server validates each op against ACL before broadcasting.
  • Rate limiting: cap ops per second per user to prevent abuse.
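
The last two items can be combined into a single admission check that runs before an op is applied or broadcast. This is a sketch under assumptions: the two-role ACL, the `TokenBucket` class, and the `admit` helper are hypothetical, and a token bucket is just one common way to cap ops per second.

```typescript
type Role = "viewer" | "editor";

// Classic token bucket: `capacity` burst size, refilled at `refillPerSec`.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private capacity: number;
  private refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, now: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity;
    this.lastRefill = now;
  }

  tryTake(now: number): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec,
    );
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Reject ops from users without edit rights or over their rate limit.
function admit(
  acl: Map<string, Role>,
  buckets: Map<string, TokenBucket>,
  userId: string,
  now: number,
): boolean {
  if (acl.get(userId) !== "editor") return false; // permission check first
  let bucket = buckets.get(userId);
  if (!bucket) {
    bucket = new TokenBucket(2, 1, now); // burst of 2, then 1 op/sec (assumed)
    buckets.set(userId, bucket);
  }
  return bucket.tryTake(now);
}

const acl = new Map<string, Role>([["alice", "editor"], ["bob", "viewer"]]);
const buckets = new Map<string, TokenBucket>();
const decisions = [
  admit(acl, buckets, "bob", 0),      // false: viewer
  admit(acl, buckets, "alice", 0),    // true
  admit(acl, buckets, "alice", 0),    // true: still within burst
  admit(acl, buckets, "alice", 0),    // false: bucket empty
  admit(acl, buckets, "alice", 1000), // true: one token refilled
];
```

Checking the ACL before touching the bucket means unauthorized users never consume rate-limit state.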

Capacity

For 100M docs × 10 ops/sec/active doc:

  • Active docs at any time: ~500k.
  • Ops/sec: ~5M peak.
  • Per-doc state: a few KB to a few MB.
  • Storage: op rate × average op size × retention period, plus snapshots.
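
The arithmetic behind those numbers, as a back-of-envelope script. The doc counts and op rate come from the list above; the 50-byte average op size is purely an assumption to make the storage line concrete, so swap in your own measurement.

```typescript
const totalDocs = 100_000_000;
const activeDocs = 500_000;          // ~500k active at any time
const opsPerSecPerActiveDoc = 10;
const avgOpBytes = 50;               // ASSUMPTION: not from any measurement

const activeFraction = activeDocs / totalDocs;             // 0.005, i.e. 0.5%
const peakOpsPerSec = activeDocs * opsPerSecPerActiveDoc;  // 5,000,000
const logBytesPerSec = peakOpsPerSec * avgOpBytes;         // 250,000,000

// Upper bound: what the op log would grow by if peak load lasted all day.
const secondsPerDay = 86_400;
const logTBPerDayAtPeak = (logBytesPerSec * secondsPerDay) / 1e12; // 21.6
```

Real traffic is rarely at peak around the clock, so the daily figure is a ceiling rather than a forecast. It does show why snapshotting and archiving old ops matters at this scale.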

WebSocket fleet for real-time; Postgres + S3 for persistence; CDN for static assets.

What ships today

For a small product:

  • Yjs as the CRDT.
  • WebSocket server (or Durable Objects) for the coordinator.
  • Postgres for op log + snapshots.
  • S3 for old snapshots / exports.

Several startups have shipped Yjs-based collaborative apps to thousands of users on this stack alone.

Read this next

If you want a small Yjs + Durable Objects collaborative editor reference, it’s at rajpoot.dev.

