- Search for patterns or substrings
- Replace content
- Standardize formats
- Clean up inconsistencies
- Transform text for analysis
ClickHouse function reference
overlay
Replaces part of a string with another string, starting at a specified position. Syntax:s
(String
): The input string.replace
(String
): The replacement string.offset
(Int
): The 1-based starting position for the replacement. If negative, it’s counted from the end ofs
.length
(Int
, optional): The number of characters to replace. If omitted, it defaults to the length ofreplace
.
String
with the specified portion replaced.
Example:
modified_taco
replaces ‘taco’ with ‘soft’ starting at position 9.protein_swap
replaces ‘beef’ (4 characters) with ‘chicken’ starting at position 7.
The
overlay
function is case-sensitive and works with byte positions, not character positions, for multi-byte encodings.overlayUTF8
Replaces part of a UTF-8 encoded string with another string, starting at a specified position. Syntax:s
(String
): The input UTF-8 encoded string.replace
(String
): The replacement string.offset
(Int
): The 1-based position where the replacement starts. If negative, it’s counted from the end of the string.length
(Int
, optional): The number of characters to replace. If omitted, it defaults to the length of thereplace
string.
String
with the specified portion replaced.
Example:
This function assumes the input string contains valid UTF-8 encoded text. If this assumption is violated, the result is undefined and no exception is thrown.
replaceOne
Replaces the first occurrence of a substring in a string. Syntax:haystack
(String
): The original string to perform replacement on.pattern
(String
): The substring to search for.replacement
(String
): The string to replace the found substring with.
pattern
replaced by replacement
.
Example:
replaceOne
replaces the first occurrence of ‘beef’ with ‘chicken’ in the original string.
This function is case-sensitive. If you need case-insensitive replacement, consider using
replaceRegexpOne
with the (?i)
flag.replaceAll
Replaces all occurrences of a substring in a string with another substring. Syntax:- replace
haystack
(String
): The original string in which replacements will be made.pattern
(String
): The substring to be replaced.replacement
(String
): The string to replace the pattern with.
pattern
replaced by replacement
. [String
]
Example:
replaceAll
replaces all occurrences of ‘Taco’ with ‘Burrito’ in the given string. Note that it only replaces the exact match ‘Taco’ and not ‘tacos’.
This function is case-sensitive. If you need case-insensitive replacement, you might want to use
replaceRegexpAll
with an appropriate regular expression.replaceRegexpOne
Replaces the first occurrence of a substring matching the regular expression pattern with a replacement string. Syntax:haystack
(String
): The input string to perform the replacement on.pattern
(String
): The regular expression pattern to match (in re2 syntax).replacement
(String
): The string to replace the matched substring with.
- The
replacement
string can contain substitutions\0
-\9
:\1
-\9
correspond to capturing groups (submatches) in the pattern.\0
corresponds to the entire match.
- To use a literal
\
character in the pattern or replacement, escape it with\\
. - String literals may require additional escaping.
replaceRegexpAll
Replaces all occurrences of a substring matching the regular expression pattern with the replacement string. Syntax:- REGEXP_REPLACE
haystack
(String
): The input string to perform replacements on.pattern
(String
): The regular expression pattern to match (in re2 syntax).replacement
(String
): The string to replace matched substrings with.
String
with all occurrences of the pattern replaced.
- The
replacement
string can contain substitutions\0
-\9
:\0
corresponds to the entire match\1
-\9
correspond to captured groups (submatches)
- To use a literal backslash in the pattern or replacement, escape it with another backslash:
\\
- String literals may require additional escaping
If a regular expression matches an empty substring, the replacement is not made more than once. For example:Result:The prefix is added only once at the beginning, not before every character.
regexpQuoteMeta
Escapes special characters in a string for use in regular expressions. Syntax:s
(String
): The input string to be escaped.
String
]
Description:
This function adds a backslash before the following characters that have special meaning in regular expressions:
\0
, \
, |
, (
, )
, ^
, $
, .
, [
, ]
, ?
, *
, +
, {
, :
, -
It differs slightly from
re2::RE2::QuoteMeta
:- It escapes the zero byte as
\0
instead of\x00
- It only escapes the required characters
regexpQuoteMeta
escapes the special regex characters +
, (
, )
, %
, and !
in the taco special description, making it safe to use in a regular expression pattern.
format
Formats a string using specified pattern and arguments, similar to Python’s string formatting. Syntax:pattern
(String
): A string containing replacement fields surrounded by curly braces{}
.arg1, arg2, ...
(Any data type): Values to be inserted into the pattern.
String
]
Details:
- Replacement fields in the pattern can be:
- Numbered (starting from zero):
{0}
,{1}
, etc. - Empty (implicitly numbered):
{}
,{}
, etc.
- Numbered (starting from zero):
- Literal braces can be escaped using double braces:
{{
and}}
. - Anything not in braces is treated as literal text.
format
function creates a sentence describing a taco order by inserting the provided arguments into the pattern string.
This function is useful for creating dynamic strings or formatting output in a flexible way.
translate
Replaces characters in a string using a one-to-one character mapping defined byfrom
and to
strings.
Syntax:
s
(String
): The input string to be transformed.from
(String
): A constant ASCII string defining the characters to be replaced.to
(String
): A constant ASCII string defining the replacement characters.
String
]
from
andto
must be constant ASCII strings of the same length.- Non-ASCII characters in the original string are not modified.
- ‘a’ is replaced with ‘A’
- ‘i’ is replaced with ‘I’
- ‘c’ is replaced with ‘K’
translateUTF8
Replaces characters in a UTF-8 encoded string using a one-to-one character mapping defined byfrom
and to
strings.
Syntax:
s
(String
): The input UTF-8 encoded string.from
(String
): A UTF-8 encoded string specifying characters to be replaced.to
(String
): A UTF-8 encoded string specifying replacement characters.
String
with characters replaced according to the mapping.
Example:
from
andto
must be UTF-8 encoded strings of the same length.- Characters in the original string that don’t appear in
from
are not modified. - If a character appears multiple times in
from
, only the first occurrence is used.
printf
Formats a string with specified values, similar to the C++ printf function. Syntax:format
(String
): A format string containing format specifiers.arg1, arg2, ...
: Values to be formatted (strings, integers, floating-point numbers, etc.).
- Format specifiers start with
%
and are followed by a type character (e.g.,%s
for string,%d
for integer). %%
represents a literal%
character.
%s
is replaced with string arguments (‘Juan’, ‘al pastor’, ‘pineapple’)%d
is replaced with the integer argument (3)
Anything not contained in a format specifier is treated as literal text and copied verbatim to the output.