ClickHouse basics.
Why ClickHouse
Columnar OLAP DB. Aggregations on billions of rows in milliseconds. Great for analytics, logs, metrics.
Install / connect
docker run -d --name ch -p 8123:8123 -p 9000:9000 clickhouse/clickhouse-server
clickhouse-client # native TCP
clickhouse-client --query "SELECT 1"
curl 'http://localhost:8123/?query=SELECT+1' # HTTP
CLI
clickhouse-client --host=localhost --user=default
clickhouse-client --multiquery --query "..."
In client: \q quit, multiline queries end with ;.
CREATE TABLE
CREATE TABLE events (
ts DateTime,
user_id UInt32,
event String,
properties String
) ENGINE = MergeTree()
ORDER BY (user_id, ts)
PARTITION BY toYYYYMM(ts);
MergeTree = main engine. ORDER BY = primary key (sort).
Insert
INSERT INTO events VALUES (now(), 1, 'click', '{}');
INSERT INTO events SELECT ... FROM src;
INSERT INTO events FORMAT CSV ...
Bulk inserts much faster than per-row.
Query
SELECT * FROM events LIMIT 10;
SELECT count() FROM events;
SELECT event, count() FROM events GROUP BY event ORDER BY 2 DESC;
SELECT toDate(ts) AS day, count() FROM events GROUP BY day ORDER BY day;
Common types
UInt8/16/32/64 Int8/16/32/64
Float32/64 Decimal(P,S)
String FixedString(N)
Date Date32 DateTime DateTime64(3) IPv4 IPv6 UUID
LowCardinality(String) -- for low-cardinality strings (fast)
Array(T)
Nullable(T)
Tuple(T1, T2)
Map(K, V)
Enum8('a'=1, 'b'=2)
LowCardinality
country LowCardinality(String)
Dictionary-encoded. Use for columns with < millions of unique values.
DESCRIBE / SHOW
SHOW DATABASES;
SHOW TABLES;
DESCRIBE TABLE events;
SHOW CREATE TABLE events;
DROP / ALTER
DROP TABLE events;
ALTER TABLE events ADD COLUMN region String DEFAULT '';
ALTER TABLE events MODIFY COLUMN x UInt64;
ALTER TABLE events RENAME COLUMN old TO new;
INSERT performance
- Batch inserts (100k+ rows per insert).
- Don’t insert one row at a time.
- Use Buffer engine or insert via Kafka/etc to batch.
Sample queries
-- Top events
SELECT event, count() AS c
FROM events
WHERE ts >= now() - INTERVAL 1 HOUR
GROUP BY event ORDER BY c DESC LIMIT 10;
-- Time-series
SELECT toStartOfMinute(ts) AS m, count()
FROM events GROUP BY m ORDER BY m;
Common mistakes
- Per-row INSERT → slow.
- Wrong ORDER BY → poor compression.
- Using String for low-cardinality (use LowCardinality).
- Nullable when not needed (overhead).
- Tiny PARTITION BY (one per day on big table) → too many parts.
Read this next
If you want my CH starter, it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .