- Search for patterns or substrings
- Replace content
- Standardize formats
- Clean up inconsistencies
- Transform text for analysis
ClickHouse function reference
overlay
Replaces part of a string with another string, starting at a specified position. Syntax:s(String): The input string.replace(String): The replacement string.offset(Int): The 1-based starting position for the replacement. If negative, it’s counted from the end ofs.length(Int, optional): The number of characters to replace. If omitted, it defaults to the length ofreplace.
String with the specified portion replaced.
Example:
modified_tacoreplaces ‘taco’ with ‘soft’ starting at position 9.protein_swapreplaces ‘beef’ (4 characters) with ‘chicken’ starting at position 7.
The
overlay function is case-sensitive and works with byte positions, not character positions, for multi-byte encodings.overlayUTF8
Replaces part of a UTF-8 encoded string with another string, starting at a specified position. Syntax:s(String): The input UTF-8 encoded string.replace(String): The replacement string.offset(Int): The 1-based position where the replacement starts. If negative, it’s counted from the end of the string.length(Int, optional): The number of characters to replace. If omitted, it defaults to the length of thereplacestring.
String with the specified portion replaced.
Example:
This function assumes the input string contains valid UTF-8 encoded text. If this assumption is violated, the result is undefined and no exception is thrown.
replaceOne
Replaces the first occurrence of a substring in a string. Syntax:haystack(String): The original string to perform replacement on.pattern(String): The substring to search for.replacement(String): The string to replace the found substring with.
pattern replaced by replacement.
Example:
replaceOne replaces the first occurrence of ‘beef’ with ‘chicken’ in the original string.
This function is case-sensitive. If you need case-insensitive replacement, consider using
replaceRegexpOne with the (?i) flag.replaceAll
Replaces all occurrences of a substring in a string with another substring. Syntax:- replace
haystack(String): The original string in which replacements will be made.pattern(String): The substring to be replaced.replacement(String): The string to replace the pattern with.
pattern replaced by replacement. [String]
Example:
replaceAll replaces all occurrences of ‘Taco’ with ‘Burrito’ in the given string. Note that it only replaces the exact match ‘Taco’ and not ‘tacos’.
This function is case-sensitive. If you need case-insensitive replacement, you might want to use
replaceRegexpAll with an appropriate regular expression.replaceRegexpOne
Replaces the first occurrence of a substring matching the regular expression pattern with a replacement string. Syntax:haystack(String): The input string to perform the replacement on.pattern(String): The regular expression pattern to match (in re2 syntax).replacement(String): The string to replace the matched substring with.
- The
replacementstring can contain substitutions\0-\9:\1-\9correspond to capturing groups (submatches) in the pattern.\0corresponds to the entire match.
- To use a literal
\character in the pattern or replacement, escape it with\\. - String literals may require additional escaping.
replaceRegexpAll
Replaces all occurrences of a substring matching the regular expression pattern with the replacement string. Syntax:- REGEXP_REPLACE
haystack(String): The input string to perform replacements on.pattern(String): The regular expression pattern to match (in re2 syntax).replacement(String): The string to replace matched substrings with.
String with all occurrences of the pattern replaced.
- The
replacementstring can contain substitutions\0-\9:\0corresponds to the entire match\1-\9correspond to captured groups (submatches)
- To use a literal backslash in the pattern or replacement, escape it with another backslash:
\\ - String literals may require additional escaping
If a regular expression matches an empty substring, the replacement is not made more than once. For example:Result:The prefix is added only once at the beginning, not before every character.
regexpQuoteMeta
Escapes special characters in a string for use in regular expressions. Syntax:s(String): The input string to be escaped.
String]
Description:
This function adds a backslash before the following characters that have special meaning in regular expressions:
\0, \, |, (, ), ^, $, ., [, ], ?, *, +, {, :, -
It differs slightly from
re2::RE2::QuoteMeta:- It escapes the zero byte as
\0instead of\x00 - It only escapes the required characters
regexpQuoteMeta escapes the special regex characters +, (, ), %, and ! in the taco special description, making it safe to use in a regular expression pattern.
format
Formats a string using specified pattern and arguments, similar to Python’s string formatting. Syntax:pattern(String): A string containing replacement fields surrounded by curly braces{}.arg1, arg2, ...(Any data type): Values to be inserted into the pattern.
String]
Details:
- Replacement fields in the pattern can be:
- Numbered (starting from zero):
{0},{1}, etc. - Empty (implicitly numbered):
{},{}, etc.
- Numbered (starting from zero):
- Literal braces can be escaped using double braces:
{{and}}. - Anything not in braces is treated as literal text.
format function creates a sentence describing a taco order by inserting the provided arguments into the pattern string.
This function is useful for creating dynamic strings or formatting output in a flexible way.
translate
Replaces characters in a string using a one-to-one character mapping defined byfrom and to strings.
Syntax:
s(String): The input string to be transformed.from(String): A constant ASCII string defining the characters to be replaced.to(String): A constant ASCII string defining the replacement characters.
String]
fromandtomust be constant ASCII strings of the same length.- Non-ASCII characters in the original string are not modified.
- ‘a’ is replaced with ‘A’
- ‘i’ is replaced with ‘I’
- ‘c’ is replaced with ‘K’
translateUTF8
Replaces characters in a UTF-8 encoded string using a one-to-one character mapping defined byfrom and to strings.
Syntax:
s(String): The input UTF-8 encoded string.from(String): A UTF-8 encoded string specifying characters to be replaced.to(String): A UTF-8 encoded string specifying replacement characters.
String with characters replaced according to the mapping.
Example:
fromandtomust be UTF-8 encoded strings of the same length.- Characters in the original string that don’t appear in
fromare not modified. - If a character appears multiple times in
from, only the first occurrence is used.
printf
Formats a string with specified values, similar to the C++ printf function. Syntax:format(String): A format string containing format specifiers.arg1, arg2, ...: Values to be formatted (strings, integers, floating-point numbers, etc.).
- Format specifiers start with
%and are followed by a type character (e.g.,%sfor string,%dfor integer). %%represents a literal%character.
%sis replaced with string arguments (‘Juan’, ‘al pastor’, ‘pineapple’)%dis replaced with the integer argument (3)
Anything not contained in a format specifier is treated as literal text and copied verbatim to the output.