ClickHouse best practices.
DO
- Batch inserts (1k-100k rows).
- ORDER BY matches common query filters.
- LowCardinality for low-cardinality strings.
- TTL for retention.
- Materialized views for repeated aggregations.
- Dictionaries instead of joins for small lookup tables.
- PARTITION BY month or week, not day.
- Compression: ZSTD or Codec(Delta, ZSTD) for time series.
- Skip indexes for non-primary-key filters.
DON’T
- Don’t insert per-row.
- Don’t use Nullable everywhere.
- Don’t update / delete frequently (mutations are async, slow).
- Don’t join two huge tables (use dict or denorm).
- Don’t SELECT *.
- Don’t run OPTIMIZE FINAL frequently.
- Don’t use ClickHouse for OLTP (it’s OLAP).
Schema design
- Wide tables: many columns, denormalized.
- Avoid super-deep nested structures.
- Use materialized columns for derived fields.
Ingest
- Insert into local shard, not Distributed (at high rate).
- Use Buffer engine or async_insert.
- Format: Native > Parquet > JSONEachRow > Values (for speed).
Queries
- SELECT specific columns.
- Use PREWHERE for big filters.
- Use sample for approximate.
- Profile with EXPLAIN.
Storage
- SSD/NVMe.
- Storage policies (hot/cold tiering).
- Compress old data more aggressively.
Replication
- ReplicatedMergeTree always in prod.
- ClickHouse Keeper > ZooKeeper.
- Monitor replication lag.
When NOT to use CH
- Per-row updates / deletes.
- Strong ACID (banking ledgers).
- High-concurrency OLTP.
- Small dataset (< 1M rows; SQLite/Postgres fine).
Alternatives
- StarRocks (similar).
- Apache Druid.
- Apache Pinot.
- TimescaleDB.
Read this next
If you want my CH best-practices checklist, it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .