String replacement functions provide powerful tools for manipulating text data. These functions allow you to:

  • Search for patterns or substrings
  • Replace content
  • Standardize formats
  • Clean up inconsistencies
  • Transform text for analysis

ClickHouse function reference

overlay

Replaces part of a string with another string, starting at a specified position.

Syntax:

overlay(s, replace, offset[, length])

Arguments:

  • s (String): The input string.
  • replace (String): The replacement string.
  • offset (Int): The 1-based starting position for the replacement. If negative, it’s counted from the end of s.
  • length (Int, optional): The number of characters to replace. If omitted, it defaults to the length of replace.

Returns:

A String with the specified portion replaced.

Example:

SELECT
	overlay('Crunchy taco shell', 'soft', 9) AS modified_taco,
	overlay('Spicy beef taco', 'chicken', 7, 4) AS protein_swap;

Result:

| modified_taco      | protein_swap      |
|--------------------|-------------------|
| Crunchy soft shell | Spicy chicken taco|

In this example:

  • modified_taco replaces ‘taco’ with ‘soft’ starting at position 9.
  • protein_swap replaces ‘beef’ (4 characters) with ‘chicken’ starting at position 7.

The overlay function is case-sensitive and works with byte positions, not character positions, for multi-byte encodings.

overlayUTF8

Replaces part of a UTF-8 encoded string with another string, starting at a specified position.

Syntax:

overlayUTF8(s, replace, offset[, length])

Arguments:

  • s (String): The input UTF-8 encoded string.
  • replace (String): The replacement string.
  • offset (Int): The 1-based position where the replacement starts. If negative, it’s counted from the end of the string.
  • length (Int, optional): The number of characters to replace. If omitted, it defaults to the length of the replace string.

Returns:

A String with the specified portion replaced.

Example:

SELECT
	overlayUTF8('Spicy jalapeño taco', 'habanero', 7) AS result;

Result:

| result                |
|-----------------------|
| Spicy habanero taco   |

In this example, ‘jalapeño’ is replaced with ‘habanero’ in the original string.

This function assumes the input string contains valid UTF-8 encoded text. If this assumption is violated, the result is undefined and no exception is thrown.

replaceOne

Replaces the first occurrence of a substring in a string.

Syntax:

replaceOne(haystack, pattern, replacement)

Arguments:

  • haystack (String): The original string to perform replacement on.
  • pattern (String): The substring to search for.
  • replacement (String): The string to replace the found substring with.

Returns:

A string with the first occurrence of pattern replaced by replacement.

Example:

SELECT
	replaceOne('Spicy beef taco', 'beef', 'chicken') AS result;

Result:

| result              |
|---------------------|
| Spicy chicken taco  |

In this example, replaceOne replaces the first occurrence of ‘beef’ with ‘chicken’ in the original string.

This function is case-sensitive. If you need case-insensitive replacement, consider using replaceRegexpOne with the (?i) flag.

replaceAll

Replaces all occurrences of a substring in a string with another substring.

Syntax:

replaceAll(haystack, pattern, replacement)

Alias:

  • replace

Arguments:

  • haystack (String): The original string in which replacements will be made.
  • pattern (String): The substring to be replaced.
  • replacement (String): The string to replace the pattern with.

Returns:

A new string with all occurrences of pattern replaced by replacement. [String]

Example:

SELECT
	replaceAll('Taco Tuesday is the best day for tacos!', 'Taco', 'Burrito') AS result;

Result:

| result                                         |
|------------------------------------------------|
| Burrito Tuesday is the best day for tacos!     |

In this example, replaceAll replaces all occurrences of ‘Taco’ with ‘Burrito’ in the given string. Note that it only replaces the exact match ‘Taco’ and not ‘tacos’.

This function is case-sensitive. If you need case-insensitive replacement, you might want to use replaceRegexpAll with an appropriate regular expression.

replaceRegexpOne

Replaces the first occurrence of a substring matching the regular expression pattern with a replacement string.

Syntax:

replaceRegexpOne(haystack, pattern, replacement)

Arguments:

  • haystack (String): The input string to perform the replacement on.
  • pattern (String): The regular expression pattern to match (in re2 syntax).
  • replacement (String): The string to replace the matched substring with.

Returns:

A string with the first occurrence of the pattern replaced.

  • The replacement string can contain substitutions \0-\9:
    • \1-\9 correspond to capturing groups (submatches) in the pattern.
    • \0 corresponds to the entire match.
  • To use a literal \ character in the pattern or replacement, escape it with \\.
  • String literals may require additional escaping.

Example:

SELECT
	replaceRegexpOne(
		'Our signature taco contains beef and cheese',
		'(beef) and (cheese)',
		'juicy \\1 with melted \\2'
	) AS new_description;

Result:

| new_description                                              |
|--------------------------------------------------------------|
| Our signature taco contains juicy beef with melted cheese    |

In this example, we enhance the taco description by adding adjectives to the ingredients using capturing groups and backreferences.

replaceRegexpAll

Replaces all occurrences of a substring matching the regular expression pattern with the replacement string.

Syntax:

replaceRegexpAll(haystack, pattern, replacement)

Alias:

  • REGEXP_REPLACE

Arguments:

  • haystack (String): The input string to perform replacements on.
  • pattern (String): The regular expression pattern to match (in re2 syntax).
  • replacement (String): The string to replace matched substrings with.

Returns:

A String with all occurrences of the pattern replaced.

  • The replacement string can contain substitutions \0-\9:
    • \0 corresponds to the entire match
    • \1-\9 correspond to captured groups (submatches)
  • To use a literal backslash in the pattern or replacement, escape it with another backslash: \\
  • String literals may require additional escaping

Example:

SELECT
	replaceRegexpAll('Crunchy taco, soft taco, supreme taco',
					 '([a-z]+) taco',
					 'delicious \1 taco') AS taco_upgrade;

Result:

| taco_upgrade                                                         |
|----------------------------------------------------------------------|
| Crunchy delicious taco, soft delicious taco, supreme delicious taco  |

In this example, we upgrade all our tacos to be delicious by adding the word before each occurrence of “taco”.

If a regular expression matches an empty substring, the replacement is not made more than once. For example:

SELECT
	replaceRegexpAll('Taco Tuesday!', '^', 'Terrific ') AS res;

Result:

| res                    |
|------------------------|
| Terrific Taco Tuesday! |

The prefix is added only once at the beginning, not before every character.

regexpQuoteMeta

Escapes special characters in a string for use in regular expressions.

Syntax:

regexpQuoteMeta(s)

Arguments:

  • s (String): The input string to be escaped.

Returns:

A string with special regex characters escaped. [String]

Description:

This function adds a backslash before the following characters that have special meaning in regular expressions: \0, \, |, (, ), ^, $, ., [, ], ?, *, +, {, :, -

It differs slightly from re2::RE2::QuoteMeta:

  • It escapes the zero byte as \0 instead of \x00
  • It only escapes the required characters

Example:

SELECT
	regexpQuoteMeta('Spicy jalapeño + habanero (50% off)!') AS escaped_taco_special;

Result:

| escaped_taco_special                         |
|----------------------------------------------|
| Spicy jalapeño \+ habanero \(50\% off\)\!    |

In this example, regexpQuoteMeta escapes the special regex characters +, (, ), %, and ! in the taco special description, making it safe to use in a regular expression pattern.

format

Formats a string using specified pattern and arguments, similar to Python’s string formatting.

Syntax:

format(pattern, arg1, arg2, ...)

Arguments:

  • pattern (String): A string containing replacement fields surrounded by curly braces {}.
  • arg1, arg2, ... (Any data type): Values to be inserted into the pattern.

Returns:

A formatted string. [String]

Details:

  • Replacement fields in the pattern can be:
    • Numbered (starting from zero): {0}, {1}, etc.
    • Empty (implicitly numbered): {}, {}, etc.
  • Literal braces can be escaped using double braces: {{ and }}.
  • Anything not in braces is treated as literal text.

Example:

SELECT
	format('{} ordered {} {} tacos', 'Juan', 5, 'spicy') AS order_summary;

Result:

| order_summary               |
|-----------------------------|
| Juan ordered 5 spicy tacos  |

In this example, the format function creates a sentence describing a taco order by inserting the provided arguments into the pattern string.

This function is useful for creating dynamic strings or formatting output in a flexible way.

translate

Replaces characters in a string using a one-to-one character mapping defined by from and to strings.

Syntax:

translate(s, from, to)

Arguments:

  • s (String): The input string to be transformed.
  • from (String): A constant ASCII string defining the characters to be replaced.
  • to (String): A constant ASCII string defining the replacement characters.

Returns:

A transformed string with characters replaced according to the mapping. [String]

  • from and to must be constant ASCII strings of the same length.
  • Non-ASCII characters in the original string are not modified.

Example:

SELECT
	translate('Spicy Taco Tuesday!', 'aic', 'AIK') AS result;

Result:

| result                |
|-----------------------|
| SpIKy TAKo TuesdAy!   |

In this example:

  • ‘a’ is replaced with ‘A'
  • 'i’ is replaced with ‘I'
  • 'c’ is replaced with ‘K’

This function is useful for simple character-based transformations, such as changing letter case or replacing specific characters in strings.

translateUTF8

Replaces characters in a UTF-8 encoded string using a one-to-one character mapping defined by from and to strings.

Syntax:

translateUTF8(s, from, to)

Arguments:

  • s (String): The input UTF-8 encoded string.
  • from (String): A UTF-8 encoded string specifying characters to be replaced.
  • to (String): A UTF-8 encoded string specifying replacement characters.

Returns:

A String with characters replaced according to the mapping.

Example:

SELECT
  translateUTF8('Jalapeño Taco', 'ñ', 'n') AS spicy_taco;

Result:

| spicy_taco    |
|---------------|
| Jalapeno Taco |

In this example, the Spanish letter ‘ñ’ is replaced with ‘n’ in the taco name.

  • from and to must be UTF-8 encoded strings of the same length.
  • Characters in the original string that don’t appear in from are not modified.
  • If a character appears multiple times in from, only the first occurrence is used.

This function is particularly useful for normalizing text, removing diacritics, or performing simple character substitutions in multilingual contexts, such as menu translations or ingredient lists for international taco recipes.

printf

Formats a string with specified values, similar to the C++ printf function.

Syntax:

printf(format, arg1, arg2, ...)

Arguments:

  • format (String): A format string containing format specifiers.
  • arg1, arg2, ...: Values to be formatted (strings, integers, floating-point numbers, etc.).

Returns:

A formatted string.

Format Specifiers:

  • Format specifiers start with % and are followed by a type character (e.g., %s for string, %d for integer).
  • %% represents a literal % character.

Example:

SELECT
	printf(
    '%s ordered %d %s tacos with extra %s',
    'Juan', 3, 'al pastor', 'pineapple'
  );

Result:

| printf('%s ordered %d %s tacos with extra %s', 'Juan', 3, 'al pastor', 'pineapple') |
|-------------------------------------------------------------------------------------|
| Juan ordered 3 al pastor tacos with extra pineapple                                 |

In this example, we format a string describing a taco order:

  • %s is replaced with string arguments (‘Juan’, ‘al pastor’, ‘pineapple’)
  • %d is replaced with the integer argument (3)

Anything not contained in a format specifier is treated as literal text and copied verbatim to the output.