Production ClickHouse.

Config

/etc/clickhouse-server/config.xml:

<listen_host>0.0.0.0</listen_host>
<http_port>8123</http_port>
<tcp_port>9000</tcp_port>
<interserver_http_port>9009</interserver_http_port>

<max_concurrent_queries>500</max_concurrent_queries>
<max_server_memory_usage_to_ram_ratio>0.9</max_server_memory_usage_to_ram_ratio>

<merge_tree>
  <parts_to_delay_insert>500</parts_to_delay_insert>
  <parts_to_throw_insert>3000</parts_to_throw_insert>
</merge_tree>

<query_log>
  <database>system</database>
  <table>query_log</table>
  <partition_by>toYYYYMM(event_date)</partition_by>
  <ttl>event_date + INTERVAL 7 DAY DELETE</ttl>
</query_log>

Users / passwords

/etc/clickhouse-server/users.xml:

<default>
  <password_sha256_hex>...</password_sha256_hex>
  <networks><ip>::/0</ip></networks>
  <profile>default</profile>
  <quota>default</quota>
</default>

Or via SQL:

CREATE USER alice IDENTIFIED BY 'password' SETTINGS PROFILE 'default';
GRANT SELECT, INSERT ON db.* TO alice;

TLS

<https_port>8443</https_port>
<openSSL>
  <server>
    <certificateFile>/etc/clickhouse-server/server.crt</certificateFile>
    <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile>
  </server>
</openSSL>

ClickHouse Keeper

Replace ZK:

<keeper_server>
  <tcp_port>9181</tcp_port>
  <server_id>1</server_id>
  <raft_configuration>
    <server><id>1</id><hostname>kn1</hostname><port>9234</port></server>
    <server><id>2</id><hostname>kn2</hostname><port>9234</port></server>
    <server><id>3</id><hostname>kn3</hostname><port>9234</port></server>
  </raft_configuration>
</keeper_server>

Cluster

<remote_servers>
  <my_cluster>
    <shard>
      <internal_replication>true</internal_replication>
      <replica><host>n1</host></replica>
      <replica><host>n2</host></replica>
    </shard>
    <shard>
      <internal_replication>true</internal_replication>
      <replica><host>n3</host></replica>
      <replica><host>n4</host></replica>
    </shard>
  </my_cluster>
</remote_servers>

ReplicatedMergeTree on all tables

CREATE TABLE events ON CLUSTER my_cluster (...)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY ...;

Backups

BACKUP DATABASE myapp TO Disk('s3_backup', 'myapp-{date}');

Or clickhouse-backup tool. Schedule daily.

Monitoring

  • Prometheus exporter built-in (<prometheus> block).
  • Grafana dashboards: 14192, 11906.
  • system.query_log analysis.
  • Alerts: replica lag, disk, errors.

Disk policy

<storage_configuration>
  <disks>
    <hot><path>/var/lib/clickhouse/</path></hot>
    <cold><type>s3</type><endpoint>https://...</endpoint></cold>
  </disks>
  <policies>
    <tiered>
      <volumes>
        <hot><disk>hot</disk></hot>
        <cold><disk>cold</disk></cold>
      </volumes>
    </tiered>
  </policies>
</storage_configuration>
CREATE TABLE ... SETTINGS storage_policy = 'tiered'
TTL ts + INTERVAL 30 DAY TO VOLUME 'cold';

Kernel

vm.swappiness = 1
vm.max_map_count = 262144
fs.file-max = 2097152

Capacity

  • RAM: working set + cache + headroom.
  • Disk: 4-5x cold data size for merges.
  • Network: 10Gbps preferred between shards.

Pre-launch

  • Replicated tables.
  • Keeper / ZK cluster (3+ nodes).
  • Backups + tested.
  • Monitoring + alerts.
  • TLS.
  • Auth + RBAC.
  • Query logs enabled with TTL.
  • Storage policy / TTL retention.
  • Capacity planned.

Read this next

That’s 20 ClickHouse cheatsheets. Next category: SQLite.

If you want my full CH prod setup, it’s at rajpoot.dev .


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .