ClickHouse best practices.

DO

  • Batch inserts (1k-100k rows).
  • ORDER BY matches common query filters.
  • LowCardinality for low-cardinality strings.
  • TTL for retention.
  • Materialized views for repeated aggregations.
  • Dictionaries instead of joins for small lookup tables.
  • PARTITION BY month or week, not day.
  • Compression: ZSTD or Codec(Delta, ZSTD) for time series.
  • Skip indexes for non-primary-key filters.

DON’T

  • Don’t insert per-row.
  • Don’t use Nullable everywhere.
  • Don’t update / delete frequently (mutations are async, slow).
  • Don’t join two huge tables (use dict or denorm).
  • Don’t SELECT *.
  • Don’t run OPTIMIZE FINAL frequently.
  • Don’t use ClickHouse for OLTP (it’s OLAP).

Schema design

  • Wide tables: many columns, denormalized.
  • Avoid super-deep nested structures.
  • Use materialized columns for derived fields.

Ingest

  • Insert into local shard, not Distributed (at high rate).
  • Use Buffer engine or async_insert.
  • Format: Native > Parquet > JSONEachRow > Values (for speed).

Queries

  • SELECT specific columns.
  • Use PREWHERE for big filters.
  • Use sample for approximate.
  • Profile with EXPLAIN.

Storage

  • SSD/NVMe.
  • Storage policies (hot/cold tiering).
  • Compress old data more aggressively.

Replication

  • ReplicatedMergeTree always in prod.
  • ClickHouse Keeper > ZooKeeper.
  • Monitor replication lag.

When NOT to use CH

  • Per-row updates / deletes.
  • Strong ACID (banking ledgers).
  • High-concurrency OLTP.
  • Small dataset (< 1M rows; SQLite/Postgres fine).

Alternatives

  • StarRocks (similar).
  • Apache Druid.
  • Apache Pinot.
  • TimescaleDB.

Read this next

If you want my CH best-practices checklist, it’s at rajpoot.dev .


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .