ClickHouse integrations.
Python
pip install clickhouse-connect
import clickhouse_connect
client = clickhouse_connect.get_client(host='localhost', user='default')
# Query
r = client.query("SELECT * FROM events LIMIT 10")
for row in r.result_rows:
    print(row)
# DataFrame
df = client.query_df("SELECT toDate(ts) AS day, count() FROM events GROUP BY day")
# Insert
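# Assuming ts is a DateTime column: pass datetime objects, not strings
from datetime import datetime
client.insert("events", [
    [datetime(2026, 1, 15, 12, 0, 0), 1, 'click'],
    [datetime(2026, 1, 15, 12, 1, 0), 2, 'view'],
], column_names=['ts', 'user_id', 'event'])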
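When rows arrive one at a time (from a queue or a request handler), it can help to buffer them and send one insert per batch. Below is a minimal sketch of that idea. BatchInserter and its batch_size default are hypothetical names for illustration, not part of clickhouse-connect; the only call it assumes is client.insert(table, rows, column_names=...) as used above.

```python
from typing import Any, List, Sequence


class BatchInserter:
    """Buffers rows in memory and sends them in bulk inserts.

    Hypothetical helper: `client` is anything exposing
    insert(table, rows, column_names=...), e.g. a clickhouse-connect client.
    """

    def __init__(self, client: Any, table: str,
                 column_names: Sequence[str], batch_size: int = 10_000):
        self.client = client
        self.table = table
        self.column_names = list(column_names)
        self.batch_size = batch_size
        self._buffer: List[List[Any]] = []

    def add(self, row: Sequence[Any]) -> None:
        """Queue one row; flush automatically once the buffer is full."""
        self._buffer.append(list(row))
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> int:
        """Send all buffered rows in a single insert; return the row count."""
        if not self._buffer:
            return 0
        rows, self._buffer = self._buffer, []
        self.client.insert(self.table, rows, column_names=self.column_names)
        return len(rows)

    def __enter__(self) -> "BatchInserter":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Flush the remainder only when the block exited without an error.
        if exc_type is None:
            self.flush()
```

Usage would look like `with BatchInserter(client, "events", ['ts', 'user_id', 'event']) as b: b.add(row)`. The batch size is a tuning knob; the value here is a placeholder, not a recommendation.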
Async
import aiohttp
from aiochclient import ChClient

async def main():
    async with aiohttp.ClientSession() as session:
        client = ChClient(session, url="http://localhost:8123")
        async for row in client.iterate("SELECT * FROM events"):
            ...
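Rows from an async iterator are often easier to process in groups than one by one. Here is a small sketch of grouping any async row stream into fixed-size batches. in_batches and fake_rows are hypothetical names; fake_rows only stands in for something like client.iterate(...) so the example runs without a server.

```python
import asyncio
from typing import AsyncIterator, List, TypeVar

T = TypeVar("T")


async def in_batches(rows: AsyncIterator[T], size: int) -> AsyncIterator[List[T]]:
    """Group items from an async iterator into lists of at most `size`."""
    batch: List[T] = []
    async for row in rows:
        batch.append(row)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


async def fake_rows(n: int) -> AsyncIterator[dict]:
    """Stand-in for an async row stream (assumption, not a real driver call)."""
    for i in range(n):
        await asyncio.sleep(0)  # yield control, as a network read would
        yield {"user_id": i}


async def main() -> List[int]:
    sizes = []
    async for batch in in_batches(fake_rows(5), size=2):
        sizes.append(len(batch))
    return sizes
```

Run it with `asyncio.run(main())`; in real code, replace fake_rows(5) with the driver's async iterator.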
Node
npm i @clickhouse/client
import { createClient } from "@clickhouse/client";
// Recent client versions take `url`; `host` is the older, deprecated option name
const client = createClient({ url: "http://localhost:8123" });
const rows = await client.query({
  query: "SELECT * FROM events LIMIT 10",
  format: "JSONEachRow",
});
// stream() yields chunks (arrays of rows), so iterate each chunk
for await (const chunk of rows.stream()) {
  for (const row of chunk) {
    console.log(row.text);
  }
}
await client.insert({
  table: "events",
  values: [{ ts: '...', user_id: 1, event: 'click' }],
  format: "JSONEachRow",
});
Grafana
A ClickHouse data source plugin is available for Grafana. Use it for time-series panels and ad-hoc dashboards.
Tableau / Power BI / Metabase
ClickHouse connectors are available for each. Use ClickHouse as the warehouse behind them.
dbt
pip install dbt-clickhouse
# profiles.yml
my_project:
  target: dev
  outputs:
    dev:
      type: clickhouse
      host: localhost
      port: 8123
      user: default
      schema: myapp
-- models/daily_users.sql
{{ config(materialized='table', engine='MergeTree()', order_by='day') }}
SELECT toDate(ts) AS day, count(DISTINCT user_id) AS dau
FROM events GROUP BY day
Kafka → ClickHouse
Native via Kafka engine (see ingest cheatsheet). Or use ClickHouse Kafka connector / Debezium / Kafka Connect.
Postgres → ClickHouse
- PeerDB / Airbyte / Materialize.
- Or read directly: PostgreSQL table function.
SELECT * FROM postgresql('host:5432', 'db', 'table', 'user', 'pass');
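To copy a Postgres table into ClickHouse from Python, one option is to build an INSERT ... SELECT around the table function and send it with the driver. Below is a rough sketch of the string-building part. sql_quote and postgres_copy_sql are hypothetical helpers, and the target table name is assumed to be a trusted identifier that already exists in ClickHouse.

```python
def sql_quote(value: str) -> str:
    """Quote a Python string as a single-quoted SQL literal.

    Escapes backslashes and single quotes with a backslash.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def postgres_copy_sql(target: str, host: str, port: int, database: str,
                      table: str, user: str, password: str) -> str:
    """Build an INSERT ... SELECT that reads from the postgresql() table function.

    `target` is inserted as-is (assumed trusted); every other argument
    is passed through sql_quote().
    """
    args = ", ".join(
        sql_quote(a)
        for a in (f"{host}:{port}", database, table, user, password)
    )
    return f"INSERT INTO {target} SELECT * FROM postgresql({args})"
```

With clickhouse-connect, the resulting string could then be run via `client.command(sql)`. Keep real credentials out of source code; the literal values in the SQL line above are placeholders.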
Spark
spark.read
.format("jdbc")
.option("url", "jdbc:clickhouse://localhost:8123/")
.option("dbtable", "events")
.load()
Pandas
import pandas as pd
df = client.query_df("SELECT ...")
Common mistakes
- Per-row INSERT from driver (use bulk).
- Forgetting compression in HTTP (?compress=1).
- Polling for changes instead of streaming.
- Direct connection from public web (use API layer).
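For the HTTP compression point, here is a minimal sketch using only the standard library. Note that it swaps in standard gzip compression (the enable_http_compression=1 setting plus an Accept-Encoding header) rather than ?compress=1, which uses ClickHouse's own compressed format and needs a ClickHouse-aware decoder. build_compressed_query and decode_body are hypothetical names, and the base URL is an assumption.

```python
import gzip
import urllib.parse
import urllib.request
from typing import Optional


def build_compressed_query(base_url: str, sql: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that asks for a gzip response."""
    params = urllib.parse.urlencode({
        "query": sql,
        "enable_http_compression": 1,  # server-side setting for HTTP compression
    })
    return urllib.request.Request(
        f"{base_url}/?{params}",
        headers={"Accept-Encoding": "gzip"},
    )


def decode_body(body: bytes, content_encoding: Optional[str]) -> str:
    """Decompress a response body if the server marked it as gzip."""
    if content_encoding == "gzip":
        body = gzip.decompress(body)
    return body.decode("utf-8")
```

To send it, pass the request to `urllib.request.urlopen(req)`, read the body, and call `decode_body(body, resp.headers.get("Content-Encoding"))`. Higher-level drivers usually handle compression through their own options.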
Read this next
If you want my CH client wrappers, they’re at rajpoot.dev .