ClickHouse cluster.

Architecture

  • Shards: horizontal partition.
  • Replicas: copies per shard (via ZK / Keeper).
shard1: [r1, r2]
shard2: [r1, r2]
shard3: [r1, r2]

Cluster config

<remote_servers>
  <my_cluster>
    <shard>
      <replica><host>n1</host></replica>
      <replica><host>n2</host></replica>
    </shard>
    <shard>
      <replica><host>n3</host></replica>
      <replica><host>n4</host></replica>
    </shard>
  </my_cluster>
</remote_servers>

Replicated table

CREATE TABLE events ON CLUSTER my_cluster (
    ...
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY (...)
PARTITION BY ...;

Uses Keeper to coordinate.

Distributed table

CREATE TABLE events_dist ON CLUSTER my_cluster AS events
ENGINE = Distributed(my_cluster, db, events, rand());

Queries fanout to shards.

Insert

INSERT INTO events_dist VALUES (...);   -- distributes
INSERT INTO events VALUES (...);         -- local only

For high-volume: insert directly to local shards (via app-side routing).

Sharding key

ENGINE = Distributed(cluster, db, table, hash(user_id))
  • rand(): even distribution.
  • cityHash64(user_id): deterministic by user.

Better deterministic for join-locality.

ClickHouse Keeper

Replacement for ZooKeeper. Lighter, built-in.

<keeper_server>
  ...
</keeper_server>

SELECT execution

Distributed: each shard executes locally + merges on initiator.

Use _shard_num for diagnostics:

SELECT _shard_num, count() FROM events_dist GROUP BY _shard_num;

Skew

If shard key uneven → hot shard. Re-shard.

Common mistakes

  • Distributed table without replicated underlying.
  • Shard key on monotonic value → one shard.
  • Inserting into Distributed at high QPS (forwarding overhead).
  • Missing ON CLUSTER → only one node has DDL.

Read this next

If you want my CH cluster setup, it’s at rajpoot.dev .


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .