ClickHouse integrations.

Python

pip install clickhouse-connect

import clickhouse_connect

client = clickhouse_connect.get_client(host='localhost', user='default')

# Query
r = client.query("SELECT * FROM events LIMIT 10")
for row in r.result_rows:
    print(row)

# DataFrame
df = client.query_df("SELECT toDate(ts) AS day, count() FROM events GROUP BY day")

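# Insert (if ts is a DateTime column, pass datetime objects rather than strings)
from datetime import datetime

client.insert("events", [
    [datetime(2026, 1, 15, 12, 0, 0), 1, 'click'],
    [datetime(2026, 1, 15, 12, 1, 0), 2, 'view'],
], column_names=['ts', 'user_id', 'event'])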
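Application data often arrives as dicts rather than positional lists. A minimal sketch of reshaping such records into the rows-plus-column-names form used by the insert call; records_to_rows is a hypothetical helper, not part of clickhouse-connect, and the events schema is assumed to match the example.

```python
from datetime import datetime


def records_to_rows(records):
    """Turn a list of dicts into (rows, column_names) for one bulk insert.

    Column order comes from the first record. Every record must have
    exactly the same keys, otherwise ValueError is raised.
    """
    if not records:
        return [], []
    column_names = list(records[0])
    rows = []
    for i, rec in enumerate(records):
        if set(rec) != set(column_names):
            raise ValueError(
                f"record {i} has keys {sorted(rec)}, expected {sorted(column_names)}"
            )
        rows.append([rec[name] for name in column_names])
    return rows, column_names


records = [
    {"ts": datetime(2026, 1, 15, 12, 0, 0), "user_id": 1, "event": "click"},
    {"ts": datetime(2026, 1, 15, 12, 1, 0), "user_id": 2, "event": "view"},
]
rows, column_names = records_to_rows(records)

# With a live client this would then be a single bulk insert:
# client.insert("events", rows, column_names=column_names)
```

Validating the keys up front means a malformed record fails before anything is sent, rather than partway through a batch.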

Async

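import asyncio
import aiohttp
import aiochclient

async def main():
    async with aiohttp.ClientSession() as session:
        client = aiochclient.ChClient(session, url="http://localhost:8123")
        async for row in client.iterate("SELECT * FROM events"):
            ...

asyncio.run(main())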
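When iterating a large result asynchronously, it is often convenient to process rows in chunks rather than one at a time. A minimal sketch of that pattern; in_chunks is a hypothetical helper, and fake_iterate is a stand-in for client.iterate(...) so the example runs without a server.

```python
import asyncio


async def in_chunks(rows, size):
    """Group items from an async iterator into lists of at most `size`."""
    chunk = []
    async for row in rows:
        chunk.append(row)
        if len(chunk) >= size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


async def fake_iterate(n):
    """Hypothetical stand-in for client.iterate(...); yields n fake rows."""
    for i in range(n):
        await asyncio.sleep(0)  # give control back to the event loop
        yield {"user_id": i}


async def main():
    sizes = []
    async for chunk in in_chunks(fake_iterate(7), size=3):
        sizes.append(len(chunk))
    return sizes


chunk_sizes = asyncio.run(main())
print(chunk_sizes)  # [3, 3, 1]
```

With a real client, the call would presumably look like in_chunks(client.iterate("SELECT * FROM events"), size=1000), with the chunk size tuned to the workload.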

Node

npm i @clickhouse/client

import { createClient } from "@clickhouse/client";

const client = createClient({ host: "http://localhost:8123" });

const rows = await client.query({
    query: "SELECT * FROM events LIMIT 10",
    format: "JSONEachRow",
});

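// stream() yields batches (arrays) of Row objects, not single rows
for await (const batch of rows.stream()) {
    for (const row of batch) {
        console.log(row.json());
    }
}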

await client.insert({
    table: "events",
    values: [{ ts: '...', user_id: 1, event: 'click' }],
    format: "JSONEachRow",
});
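JSONEachRow, the format used in both calls above, is newline-delimited JSON: one object per line, with no enclosing array. A small language-neutral illustration in Python (the language used for the other sketches in this cheatsheet); both helper names are hypothetical.

```python
import json


def to_json_each_row(records):
    """Encode dicts as JSONEachRow: one compact JSON object per line."""
    return "".join(
        json.dumps(rec, separators=(",", ":")) + "\n" for rec in records
    )


def from_json_each_row(payload):
    """Decode a JSONEachRow payload back into a list of dicts."""
    return [json.loads(line) for line in payload.split("\n") if line.strip()]


payload = to_json_each_row([
    {"ts": "2026-01-15 12:00:00", "user_id": 1, "event": "click"},
    {"ts": "2026-01-15 12:01:00", "user_id": 2, "event": "view"},
])
print(payload, end="")
```

Because each line is a complete object, a consumer can parse rows as they arrive instead of waiting for the full response.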

Grafana

A ClickHouse data source plugin is available for Grafana. It works for both time-series panels and ad-hoc dashboards.

Tableau / Power BI / Metabase

Each has a ClickHouse connector. Point the tool at ClickHouse and use it as you would any SQL warehouse.

dbt

pip install dbt-clickhouse

# profiles.yml
my_project:
  target: dev
  outputs:
    dev:
      type: clickhouse
      host: localhost
      port: 8123
      user: default
      schema: myapp

-- models/daily_users.sql
-- dbt-clickhouse takes the sort key as a separate order_by config
{{ config(materialized='table', engine='MergeTree()', order_by='day') }}

SELECT toDate(ts) AS day, count(DISTINCT user_id) AS dau
FROM events GROUP BY day

Kafka → ClickHouse

Natively, via the Kafka table engine (see the ingest cheatsheet). Alternatively, use the ClickHouse Kafka connector, Debezium, or Kafka Connect.

Postgres → ClickHouse

  • PeerDB / Airbyte / Materialize.
  • Or read directly: PostgreSQL table function.
SELECT * FROM postgresql('host:5432', 'db', 'table', 'user', 'pass');

Spark

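# PySpark; needs the ClickHouse JDBC driver jar on the Spark classpath
df = (
    spark.read
    .format("jdbc")
    .option("url", "jdbc:clickhouse://localhost:8123/")
    .option("driver", "com.clickhouse.jdbc.ClickHouseDriver")
    .option("dbtable", "events")
    .load()
)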

Pandas

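import pandas as pd
# query_df returns a pandas DataFrame; `client` is the clickhouse-connect
# client created in the Python section.
df = client.query_df("SELECT ...")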

Common mistakes

  • Per-row INSERTs from the driver (batch rows into bulk inserts instead).
  • Forgetting to enable compression over HTTP (?compress=1).
  • Polling for changes instead of streaming them.
  • Connecting to ClickHouse directly from the public web (put an API layer in front).
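The first mistake, per-row inserts, is usually fixed by buffering rows in the application and sending them in batches. A minimal sketch, assuming a size-based flush only; BatchBuffer and its parameters are hypothetical names, and flush_fn stands in for whatever bulk-insert call your driver provides.

```python
class BatchBuffer:
    """Collect rows and hand them to `flush_fn` in batches of `max_rows`."""

    def __init__(self, flush_fn, max_rows=10_000):
        if max_rows < 1:
            raise ValueError("max_rows must be at least 1")
        self._flush_fn = flush_fn
        self._max_rows = max_rows
        self._rows = []

    def add(self, row):
        """Buffer one row; flush automatically once the batch is full."""
        self._rows.append(row)
        if len(self._rows) >= self._max_rows:
            self.flush()

    def flush(self):
        """Send any buffered rows as one batch; no-op when empty."""
        if not self._rows:
            return
        batch, self._rows = self._rows, []
        self._flush_fn(batch)


# Demo with a fake sink that just records each batch it receives.
sent = []
buf = BatchBuffer(sent.append, max_rows=2)
for i in range(5):
    buf.add([i, "click"])
buf.flush()  # push the final partial batch
print([len(b) for b in sent])  # [2, 2, 1]
```

With clickhouse-connect, flush_fn could be something like lambda rows: client.insert("events", rows, column_names=[...]). A production version would probably also flush on a timer so a slow trickle of rows does not sit in memory indefinitely.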

Read this next

If you want my CH client wrappers, they’re at rajpoot.dev.