String Replacement functions
Find and replace substrings within text.
String replacement functions provide powerful tools for manipulating text data. These functions allow you to:
- Search for patterns or substrings
- Replace content
- Standardize formats
- Clean up inconsistencies
- Transform text for analysis
ClickHouse function reference
overlay
Replaces part of a string with another string, starting at a specified position.
Syntax:
Arguments:
s
(String
): The input string.replace
(String
): The replacement string.offset
(Int
): The 1-based starting position for the replacement. If negative, it’s counted from the end ofs
.length
(Int
, optional): The number of characters to replace. If omitted, it defaults to the length ofreplace
.
Returns:
A String
with the specified portion replaced.
Example:
Result:
In this example:
modified_taco
replaces ‘taco’ with ‘soft’ starting at position 9.protein_swap
replaces ‘beef’ (4 characters) with ‘chicken’ starting at position 7.
The overlay
function is case-sensitive and works with byte positions, not character positions, for multi-byte encodings.
overlayUTF8
Replaces part of a UTF-8 encoded string with another string, starting at a specified position.
Syntax:
Arguments:
s
(String
): The input UTF-8 encoded string.replace
(String
): The replacement string.offset
(Int
): The 1-based position where the replacement starts. If negative, it’s counted from the end of the string.length
(Int
, optional): The number of characters to replace. If omitted, it defaults to the length of thereplace
string.
Returns:
A String
with the specified portion replaced.
Example:
Result:
In this example, ‘jalapeño’ is replaced with ‘habanero’ in the original string.
This function assumes the input string contains valid UTF-8 encoded text. If this assumption is violated, the result is undefined and no exception is thrown.
replaceOne
Replaces the first occurrence of a substring in a string.
Syntax:
Arguments:
haystack
(String
): The original string to perform replacement on.pattern
(String
): The substring to search for.replacement
(String
): The string to replace the found substring with.
Returns:
A string with the first occurrence of pattern
replaced by replacement
.
Example:
Result:
In this example, replaceOne
replaces the first occurrence of ‘beef’ with ‘chicken’ in the original string.
This function is case-sensitive. If you need case-insensitive replacement, consider using replaceRegexpOne
with the (?i)
flag.
replaceAll
Replaces all occurrences of a substring in a string with another substring.
Syntax:
Alias:
- replace
Arguments:
haystack
(String
): The original string in which replacements will be made.pattern
(String
): The substring to be replaced.replacement
(String
): The string to replace the pattern with.
Returns:
A new string with all occurrences of pattern
replaced by replacement
. [String
]
Example:
Result:
In this example, replaceAll
replaces all occurrences of ‘Taco’ with ‘Burrito’ in the given string. Note that it only replaces the exact match ‘Taco’ and not ‘tacos’.
This function is case-sensitive. If you need case-insensitive replacement, you might want to use replaceRegexpAll
with an appropriate regular expression.
replaceRegexpOne
Replaces the first occurrence of a substring matching the regular expression pattern with a replacement string.
Syntax:
Arguments:
haystack
(String
): The input string to perform the replacement on.pattern
(String
): The regular expression pattern to match (in re2 syntax).replacement
(String
): The string to replace the matched substring with.
Returns:
A string with the first occurrence of the pattern replaced.
- The
replacement
string can contain substitutions\0
-\9
:\1
-\9
correspond to capturing groups (submatches) in the pattern.\0
corresponds to the entire match.
- To use a literal
\
character in the pattern or replacement, escape it with\\
. - String literals may require additional escaping.
Example:
Result:
In this example, we enhance the taco description by adding adjectives to the ingredients using capturing groups and backreferences.
replaceRegexpAll
Replaces all occurrences of a substring matching the regular expression pattern with the replacement string.
Syntax:
Alias:
- REGEXP_REPLACE
Arguments:
haystack
(String
): The input string to perform replacements on.pattern
(String
): The regular expression pattern to match (in re2 syntax).replacement
(String
): The string to replace matched substrings with.
Returns:
A String
with all occurrences of the pattern replaced.
- The
replacement
string can contain substitutions\0
-\9
:\0
corresponds to the entire match\1
-\9
correspond to captured groups (submatches)
- To use a literal backslash in the pattern or replacement, escape it with another backslash:
\\
- String literals may require additional escaping
Example:
Result:
In this example, we upgrade all our tacos to be delicious by adding the word before each occurrence of “taco”.
If a regular expression matches an empty substring, the replacement is not made more than once. For example:
Result:
The prefix is added only once at the beginning, not before every character.
regexpQuoteMeta
Escapes special characters in a string for use in regular expressions.
Syntax:
Arguments:
s
(String
): The input string to be escaped.
Returns:
A string with special regex characters escaped. [String
]
Description:
This function adds a backslash before the following characters that have special meaning in regular expressions:
\0
, \
, |
, (
, )
, ^
, $
, .
, [
, ]
, ?
, *
, +
, {
, :
, -
It differs slightly from re2::RE2::QuoteMeta
:
- It escapes the zero byte as
\0
instead of\x00
- It only escapes the required characters
Example:
Result:
In this example, regexpQuoteMeta
escapes the special regex characters +
, (
, )
, %
, and !
in the taco special description, making it safe to use in a regular expression pattern.
format
Formats a string using specified pattern and arguments, similar to Python’s string formatting.
Syntax:
Arguments:
pattern
(String
): A string containing replacement fields surrounded by curly braces{}
.arg1, arg2, ...
(Any data type): Values to be inserted into the pattern.
Returns:
A formatted string. [String
]
Details:
- Replacement fields in the pattern can be:
- Numbered (starting from zero):
{0}
,{1}
, etc. - Empty (implicitly numbered):
{}
,{}
, etc.
- Numbered (starting from zero):
- Literal braces can be escaped using double braces:
{{
and}}
. - Anything not in braces is treated as literal text.
Example:
Result:
In this example, the format
function creates a sentence describing a taco order by inserting the provided arguments into the pattern string.
This function is useful for creating dynamic strings or formatting output in a flexible way.
translate
Replaces characters in a string using a one-to-one character mapping defined by from
and to
strings.
Syntax:
Arguments:
s
(String
): The input string to be transformed.from
(String
): A constant ASCII string defining the characters to be replaced.to
(String
): A constant ASCII string defining the replacement characters.
Returns:
A transformed string with characters replaced according to the mapping. [String
]
from
andto
must be constant ASCII strings of the same length.- Non-ASCII characters in the original string are not modified.
Example:
Result:
In this example:
- ‘a’ is replaced with ‘A'
- 'i’ is replaced with ‘I'
- 'c’ is replaced with ‘K’
This function is useful for simple character-based transformations, such as changing letter case or replacing specific characters in strings.
translateUTF8
Replaces characters in a UTF-8 encoded string using a one-to-one character mapping defined by from
and to
strings.
Syntax:
Arguments:
s
(String
): The input UTF-8 encoded string.from
(String
): A UTF-8 encoded string specifying characters to be replaced.to
(String
): A UTF-8 encoded string specifying replacement characters.
Returns:
A String
with characters replaced according to the mapping.
Example:
Result:
In this example, the Spanish letter ‘ñ’ is replaced with ‘n’ in the taco name.
from
andto
must be UTF-8 encoded strings of the same length.- Characters in the original string that don’t appear in
from
are not modified. - If a character appears multiple times in
from
, only the first occurrence is used.
This function is particularly useful for normalizing text, removing diacritics, or performing simple character substitutions in multilingual contexts, such as menu translations or ingredient lists for international taco recipes.
printf
Formats a string with specified values, similar to the C++ printf function.
Syntax:
Arguments:
format
(String
): A format string containing format specifiers.arg1, arg2, ...
: Values to be formatted (strings, integers, floating-point numbers, etc.).
Returns:
A formatted string.
Format Specifiers:
- Format specifiers start with
%
and are followed by a type character (e.g.,%s
for string,%d
for integer). %%
represents a literal%
character.
Example:
Result:
In this example, we format a string describing a taco order:
%s
is replaced with string arguments (‘Juan’, ‘al pastor’, ‘pineapple’)%d
is replaced with the integer argument (3)
Anything not contained in a format specifier is treated as literal text and copied verbatim to the output.
Was this page helpful?