ClickHouse string functions.

Basics

length(s)
lower(s) / upper(s) / lowerUTF8(s)
substring(s, 1, 5)
substringIndex('a.b.c', '.', 2)   -- 'a.b'
concat(a, b, c)
trim(s) / trimLeft(s) / trimRight(s)
reverse(s)
position(s, 'sub')
startsWith(s, 'pre') / endsWith(s, 'suf')
like(s, 'a%')
ilike(s, 'a%')

Replace

replaceAll(s, 'old', 'new')
replaceOne(s, 'old', 'new')
replaceRegexpAll(s, 'pattern', 'replacement')
replaceRegexpOne(s, 'pattern', 'replacement')

Regex

match(s, 'pattern')
multiMatchAny(s, ['p1', 'p2'])
extract(s, '(\\d+)')
extractAll(s, '(\\d+)')
extractAllGroupsHorizontal(s, '...')

Split / join

splitByChar('.', 'a.b.c')          -- ['a','b','c']
splitByString(',', 'a,b,c')
splitByRegexp('\\s+', 'a b c')
arrayStringConcat(['a','b','c'], '.')

Format

format('Hello {}', ['World'])
printf('%d-%d', [1, 2])

Date parsing

parseDateTime('2026-01-15 12:00:00', '%Y-%m-%d %H:%M:%S')
parseDateTimeBestEffort('2026-01-15T12:00:00Z')

URL

domain(url)
path(url)
queryString(url)
extractURLParameter(url, 'utm_source')

Hash

xxHash64(s)
sipHash64(s)
cityHash64(s)
murmurHash3_64(s)
SHA1(s) / SHA256(s) / MD5(s)

Base64

base64Encode(s)
base64Decode(s)

Encoding

toString(x)
toInt64(s)
toFloat64(s)
CAST(s AS UInt64)

Common mistakes

  • LIKE with leading % (slow).
  • Regex on huge dataset without bloom index.
  • Multi-byte char string ops without UTF8 functions.

Read this next

If you want my CH function references, they’re at rajpoot.dev .


Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work — see my projects and ways to reach me at rajpoot.dev .