ClickHouse ops.
system tables
SELECT * FROM system.tables LIMIT 5;
SELECT * FROM system.parts WHERE table = 'events';
SELECT * FROM system.metrics;
SELECT * FROM system.events;
SELECT * FROM system.processes;
SELECT * FROM system.query_log WHERE event_time > now() - INTERVAL 1 HOUR ORDER BY query_duration_ms DESC LIMIT 10;
SELECT * FROM system.merges;
SELECT * FROM system.mutations;
Top slow queries
SELECT
query, query_duration_ms, read_rows, memory_usage
FROM system.query_log
WHERE event_time > now() - INTERVAL 1 DAY
ORDER BY query_duration_ms DESC LIMIT 10;
Active queries
SELECT query_id, query, elapsed, memory_usage
FROM system.processes
ORDER BY elapsed DESC;
KILL QUERY WHERE query_id = '...';
Disk usage
SELECT
database, table,
formatReadableSize(sum(bytes_on_disk)) AS size,
sum(rows) AS rows
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC;
Replication health
SELECT table, queue_size, absolute_delay
FROM system.replicas;
Prometheus exporter
<prometheus>
<endpoint>/metrics</endpoint>
<port>9363</port>
</prometheus>
Metrics at :9363/metrics.
Grafana dashboards
ClickHouse official: ID 14192, 11906.
Alerts
- replica delay > 60s
- disk > 80%
- merge queue > 100
- query failure rate
Optimize
OPTIMIZE TABLE events FINAL;
Force merge. Heavy.
Detach / attach
ALTER TABLE events DETACH PARTITION 202601;
ALTER TABLE events ATTACH PARTITION 202601;
Common mistakes
- No query_log → can’t analyze post-mortem.
- OPTIMIZE FINAL on huge table (long lock).
- Mutations queue growing → slow.
- Disk full → CH stops accepting writes.
Read this next
If you want my CH ops playbook, it’s at rajpoot.dev .
Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .