Array functions in ClickHouse provide capabilities for manipulating and analyzing array data types.

Some key categories of array functions include:

  • Creation and basic operations: Functions like array(), range(), and arrayConcat() for creating and combining arrays.
  • Element access and search: Functions like arrayElement(), indexOf(), and has() for working with individual array elements.
  • Transformation: Functions like arrayMap(), arrayFilter(), and arrayReverse() for modifying array contents.
  • Aggregation: Functions like arrayReduce(), arraySum(), and arrayAvg() for computing summary statistics on arrays.
  • Set operations: Functions like arrayUniq(), arrayIntersect(), and arrayDistinct() for working with arrays as sets.
  • Higher-order functions: Functions like arrayMap(), arrayFilter(), and arrayExists() that take lambda functions as arguments for custom array processing.

ClickHouse function reference

empty

Checks whether the input array is empty.

Syntax:

empty(arr)

Arguments:

  • arr - The array to check. [Array]

Returns:

1 if the array is empty, 0 otherwise. [UInt8]

Example:

SELECT
  empty(['cheese', 'lettuce', 'salsa']) AS is_plain_taco,
  empty([]) AS is_really_plain_taco;

Result:

| is_plain_taco | is_really_plain_taco |
|---------------|----------------------|
| 0             | 1                    |

In this example, empty checks if each taco_toppings array is empty. The result 1 indicates a plain taco with no toppings, while 0 indicates a taco with toppings.

notEmpty

Checks whether the input array is non-empty.

Syntax:

notEmpty(arr)

Arguments:

  • arr - Array to check.

Returns:

1 if the array is empty, 0 otherwise. [UInt8]

Example:

SELECT
  notEmpty(['cheese', 'lettuce', 'salsa']) AS has_toppings,
  notEmpty([]) AS plain_taco_has_toppings;

Result:

| has_toppings | plain_taco_has_toppings  |
|--------------|--------------------------|
| 1            | 0                        |

In this example, notEmpty checks if each taco order has any toppings. It returns 1 for orders with toppings and 0 for plain tacos with no toppings.

length

Returns the number of elements in the array. The result type is UInt64. The function also works for strings.

Syntax:

length(arr)

Arguments:

  • arr - The array to get the length of. [Array]

Returns

  • The number of elements in the array. [UInt64]

Example:

SELECT
  length(['beef', 'chicken', 'fish']) AS taco_fillings_count;

Result:

| taco_fillings_count |
|---------------------|
| 3                   |

In this example, length returns 3, which is the number of elements in the array of taco fillings.

emptyArrayUInt8

Returns an empty array of UInt8 type.

Syntax:

emptyArrayUInt8()

Arguments:

None.

Returns:

An empty array of UInt8 type.

Example:

SELECT
  emptyArrayUInt8() AS empty_taco_toppings;

Result:

| empty_taco_toppings |
|---------------------|
| []                  |

In this example, emptyArrayUInt8() returns an empty array that could represent a taco with no toppings.

emptyArrayUInt16

Returns an empty UInt16 array.

Syntax:

emptyArrayUInt16()

Arguments:

None.

Returns

An empty array of UInt16 type.

Example:

SELECT
  emptyArrayUInt16() AS empty_taco_toppings;

Result:

| empty_taco_toppings |
|---------------------|
| []                  |

In this example, emptyArrayUInt16() returns an empty array that could represent a taco with no toppings.

emptyArrayUInt32

Returns an empty UInt32 array.

Syntax:

emptyArrayUInt32()

Arguments:

None.

Returns

An empty array of type Array(UInt32).

Example:

SELECT
  emptyArrayUInt32() AS empty_taco_toppings;

Result:

| empty_taco_toppings |
|---------------------|
| []                  |

In this example, emptyArrayUInt32() returns an empty array that could represent a taco with no toppings.

emptyArrayUInt64

Returns an empty UInt64 array.

Syntax:

emptyArrayUInt64()

Arguments:

None.

Returns

An empty array of type Array(UInt64).

Example:

SELECT
  emptyArrayUInt64() AS empty_taco_orders;

Result:

| empty_taco_orders  |
|--------------------|
| []                 |

In this example, emptyArrayUInt64() returns an empty array that could be used to represent an empty list of taco order IDs, where each order ID is a UInt64 value.

emptyArrayInt8

Returns an empty Int8 array.

Syntax:

emptyArrayInt8()

Arguments:

None.

Returns

An empty array of type Array(Int8).

Example:

SELECT
  emptyArrayInt8() AS empty_taco_toppings;

Result:

| empty_taco_toppings |
|---------------------|
| []                  |

In this example, emptyArrayInt8() returns an empty array that could represent a taco with no toppings.

emptyArrayInt16

Returns an empty Int16 array.

Syntax:

emptyArrayInt16()

Arguments:

None.

Returns

An empty array of type Array(Int16).

Example:

SELECT
  emptyArrayInt16() AS empty_taco_orders;

Result:

| empty_taco_orders  |
|--------------------|
| []                 |

In this example, emptyArrayInt16() returns an empty array that could represent an empty list of taco order IDs, where each order ID is stored as an Int16 value.

emptyArrayInt32

Returns an empty Int32 array.

Syntax:

emptyArrayInt32()

Arguments:

None.

Returns

An empty array of type Array(Int32).

Example:

SELECT
  emptyArrayInt32() AS empty_taco_orders;

Result:

| empty_taco_orders  |
|--------------------|
| []                 |

In this example, emptyArrayInt32() returns an empty array that could represent an empty list of taco order IDs, where each order ID would typically be stored as an Int32 value.

emptyArrayInt64

Returns an empty Int64 array.

Syntax:

emptyArrayInt64()

Arguments:

None.

Returns

An empty array of type Array(Int64).

Example:

SELECT
  emptyArrayInt64() AS empty_taco_orders;

Result:

| empty_taco_orders  |
|--------------------|
| []                 |

In this example, emptyArrayInt64() returns an empty array that could represent an empty list of taco order IDs, where each order ID would typically be stored as an Int64 value.

emptyArrayFloat32

Returns an empty Float32 array.

Syntax:

emptyArrayFloat32()

Arguments:

None.

Returns

An empty array of type Array(Float32).

Example:

SELECT
  emptyArrayFloat32() AS empty_taco_prices;

Result:

| empty_taco_prices  |
|--------------------|
| []                 |

In this example, emptyArrayFloat32() returns an empty array that could be used to store Float32 values representing taco prices. The result is an empty array [] of type Array(Float32).

emptyArrayFloat64

Returns an empty Float64 array.

Syntax:

emptyArrayFloat64()

Arguments:

None.

Returns

An empty array of type Array(Float64).

Example:

SELECT
  emptyArrayFloat64() AS empty_taco_prices;

Result:

| empty_taco_prices  |
|--------------------|
| []                 |

In this example, emptyArrayFloat64() returns an empty array that could be used to store Float64 values representing taco prices. The result is an empty array [], indicating there are currently no taco prices stored.

emptyArrayDate

Returns an empty Date array.

Syntax:

emptyArrayDate()

Arguments:

None.

Returns

An empty array of Date type.

Example:

SELECT
  emptyArrayDate() AS empty_taco_dates;

Result:

| empty_taco_dates  |
|-------------------|
| []                |

In this example, emptyArrayDate() returns an empty array that could be used to store dates related to taco orders, but currently contains no elements.

emptyArrayDateTime

Returns an empty DateTime array.

Syntax:

emptyArrayDateTime()

Arguments:

None.

Returns

An empty DateTime array.

Example:

SELECT
  emptyArrayDateTime() AS empty_taco_order_times;

Result:

| empty_taco_order_times |
|------------------------|
| []                     |

In this example, emptyArrayDateTime() returns an empty array that could be used to store DateTime values representing taco order times, but it currently contains no elements.

emptyArrayString

Returns an empty String array.

Syntax:

emptyArrayString()

Arguments:

This function does not take any arguments.

Returns:

An empty String array.

Example:

SELECT
  emptyArrayString() AS empty_taco_toppings;

Result:

| empty_taco_toppings |
|---------------------|
| []                  |

In this example, emptyArrayString() returns an empty array that could represent a taco with no toppings.

emptyArrayToSingle

Accepts an empty array and returns a one-element array containing the default value for the array’s data type.

Syntax:

emptyArrayToSingle(arr)

Arguments:

  • arr - An empty array of any type.

Returns

  • A one-element array containing the default value for the array’s data type.

Example:

SELECT
  emptyArrayToSingle(emptyArrayInt32()) AS int_array,
  emptyArrayToSingle(emptyArrayString()) AS string_array,
  emptyArrayToSingle(emptyArrayToSingle(emptyArrayInt32())) AS nested_array

Result:

| int_array | string_array | nested_array |
|-----------|--------------|--------------|
| [0]       | ['']         | [0]          |

In this example:

  • int_array contains a single default integer value (0)
  • string_array contains a single default string value (empty string)
  • nested_array demonstrates nested usage, resulting in an array containing a single default integer value (0)

This function is useful when you need to ensure an array always has at least one element, even if the original array is empty.

range

Returns an array of numbers from start to end - 1 by step.

Syntax:

range([start, ] end [, step])

Arguments:

  • start - The first element of the array. Optional. Default value: 0.
  • end - The number before which the array is constructed. Required.
  • step - The incremental step between each element in the array. Optional. Default value: 1.

Returns

  • Array of numbers from start to end - 1 by step.

Example:

SELECT
  range(3, 9, 2) AS taco_counts;

Result:

| taco_counts |
|-------------|
| [3,5,7]     |

In this example, range generates an array representing possible taco order quantities, starting from 3 tacos, up to but not including 9 tacos, incrementing by 2 each time.

  • All arguments must be integers.
  • The function throws an exception if the result would be an array with more elements than specified by the function_range_max_elements_in_block setting.
  • Returns Null if any argument has Nullable(Nothing) type. An exception is thrown if any argument has Null value (Nullable(T) type).

array

Creates an array from the function arguments.

Syntax:

array(x1, x2, ..., xN)

Alternatively, you can use the square bracket syntax:

[x1, x2, ..., xN]

Arguments:

  • x1, x2, …, xN - Values to be included in the array. All arguments must be constants and have types that have the smallest common type. At least one argument must be passed.

Returns:

An Array(T) type result, where T is the smallest common type of the passed arguments.

Example:

SELECT
  array(1, 2, 3) AS numbers,
  ['beef', 'chicken', 'fish'] AS taco_proteins;

Result:

| numbers    | taco_proteins             |
|------------|---------------------------|
| [1, 2, 3]  | ['beef','chicken','fish'] |

In this example:

  • numbers is created using the array function syntax.
  • taco_proteins is created using the square bracket syntax.

You can’t use this function to create an empty array. To create an empty array of a specific type, use one of the emptyArray* functions (e.g., emptyArrayInt32()).

arrayWithConstant

Creates an array of specified length filled with a constant value.

Syntax:

arrayWithConstant(length, value)

Arguments:

  • length - The length of the array to create. [UInt*]
  • value - The constant value to fill the array with. Can be any data type.

Returns

  • An array of the specified length filled with the given constant value. [Array]

Example:

SELECT
  arrayWithConstant(5, 'taco') AS taco_array;

Result:

| taco_array                           |
|--------------------------------------|
| ['taco','taco','taco','taco','taco'] |

In this example, the function creates an array of 5 elements, each containing the string ‘taco’.

The type of the resulting array elements matches the type of the value argument. If you need an array of a specific type, you may need to cast the value argument to that type.

arrayConcat

Combines multiple arrays into a single array.

Syntax:

arrayConcat(arr1, arr2, ..., arrN)

Arguments:

  • arr1, arr2, …, arrN - Arrays of any type to concatenate.

Returns:

An array containing all elements from the input arrays in the order they were provided.

Example:

SELECT
  arrayConcat(['beef', 'chicken'], ['lettuce', 'tomato'], ['cheese']) AS taco_ingredients,
  arrayConcat([1, 2], [3, 4], [5, 6]) AS numbers;

Result:

| taco_ingredients                                   | numbers            |
|----------------------------------------------------|--------------------|
| ['beef', 'chicken', 'lettuce', 'tomato', 'cheese'] | [1, 2, 3, 4, 5, 6] |

In this example, arrayConcat combines three arrays of taco ingredients into a single array containing all the ingredients.

arrayElement

Returns the element at the specified index in the array.

Syntax:

arrayElement(arr, n)

Alternatively, you can use the [] operator:

arr[n]

Arguments:

  • arr (Array): The input array.
  • n (integer): The index of the element to return (1-based).

Returns:

The element at the specified index in the array.

Example:

SELECT
  taco_order_id,
  taco_toppings,
  arrayElement(taco_toppings, 2) AS second_topping
FROM
  taco_orders;

Result:

| taco_order_id | taco_toppings         | second_topping |
|---------------|-----------------------|----------------|
| 1             | ['cheese','lettuce']  | lettuce        |
| 2             | []                    | NULL           |
| 3             | ['salsa','guacamole'] | guacamole      |

In this example:

  • second_topping returns the second element of the taco_toppings array.
  • For order 2, since the array is empty, NULL is returned.

If the index is out of bounds (less than 1 or greater than the array length), the function returns NULL.

has

Checks whether an array contains a specific element.

Syntax:

has(arr, elem)

Arguments:

  • arr (Array): The array to search in.
  • elem (any): The element to search for.

Returns:

A boolean value:

  • 1 if the element is found in the array
  • 0 if the element is not found

Example:

SELECT
  taco_order_id,
  taco_toppings,
  has(taco_toppings, 'salsa') AS has_salsa
FROM
  taco_orders;

Result:

| taco_order_id | taco_toppings         | has_salsa |
|---------------|-----------------------|-----------|
| 1             | ['cheese','lettuce']  | 0         |
| 2             | []                    | 0         |
| 3             | ['salsa','guacamole'] | 1         |

In this example, the has function checks if ‘salsa’ is present in the taco_toppings array for each order. It returns 1 for the third order which includes ‘salsa’, and 0 for the others.

hasAll

Checks whether one array is a subset of another.

Syntax:

hasAll(set, subset)

Arguments:

  • set: Array of any type with a set of elements.
  • subset: Array of any type that shares a common supertype with set containing elements that should be tested to be a subset of set.

Returns

  • 1, if set contains all of the elements from subset.
  • 0, otherwise.

Example:

SELECT hasAll(
    ['beef', 'cheese', 'lettuce', 'tomato'],
    ['cheese', 'lettuce']
) AS has_toppings;

Result:

| has_toppings |
|--------------|
| 1            |

In this example, the function checks if the taco ingredients array [‘beef’, ‘cheese’, ‘lettuce’, ‘tomato’] contains all the elements from the subset [‘cheese’, ‘lettuce’]. Since it does, the function returns 1.

  • An empty array is a subset of any array.
  • NULL is processed as a value.
  • Order of values in both arrays does not matter.
  • Raises an exception NO_COMMON_TYPE if the set and subset elements do not share a common supertype.

hasAny

Checks whether two arrays have any common elements.

Syntax:

hasAny(array1, array2)

Arguments:

  • array1 (Array): The first array to check.
  • array2 (Array): The second array to check.

Returns:

1 if the arrays have at least one common element, 0 otherwise. [UInt8]

Example:

SELECT hasAny(
  ['beef', 'chicken', 'pork'],
  ['lettuce', 'tomato', 'beef']
) AS has_common_ingredient;

Result:

| has_common_ingredient |
|-----------------------|
| 1                     |

In this example, hasAny checks if there are any common elements between the meat options array and the vegetable options array. It returns 1 because ‘beef’ is present in both arrays, indicating that at least one taco order has both a meat and vegetable ingredient in common.

hasSubstr

Checks whether all the elements of array2 appear in array1 in the same exact order.

Syntax:

hasSubstr(array1, array2)

Arguments:

  • array1 (Array): The array to search in.
  • array2 (Array): The array to search for.

Returns:

  • 1 if array2 is found as a contiguous subsequence within array1.
  • 0 otherwise.

Example:

SELECT hasSubstr(
  ['beef', 'cheese', 'lettuce', 'tomato'],
  ['cheese', 'lettuce']
) AS has_toppings;

Result:

| has_toppings |
|--------------|
| 1            |

In this example, hasSubstr checks if the taco toppings ‘cheese’ and ‘lettuce’ appear together in that order within the full list of taco ingredients. The result is 1, indicating that these toppings are present as a contiguous subsequence.

  • The function returns 1 if array2 is empty.
  • Order of elements matters, and NULL values are processed as regular values.

indexOf

Returns the index of the first occurrence of the specified element in the array, or 0 if the element is not found. Array indices start at 1.

Syntax:

indexOf(arr, x)

Arguments:

  • arr - The array to search in.
  • x - The element to search for.

Returns:

The index of the first occurrence of x in arr, or 0 if not found.

Example:

SELECT
  indexOf(['beef', 'chicken', 'fish', 'beef'], 'beef') AS beef_index,
  indexOf(['salsa', 'guacamole', 'sour cream'], 'cheese') AS cheese_index;

Result:

| beef_index | cheese_index |
|------------|--------------|
| 1          | 0            |

In this example:

  • ‘beef’ is first found at index 1 in the taco fillings array.
  • ‘cheese’ is not found in the toppings array, so 0 is returned.

arrayCount

Counts the number of elements in an array that meet a specified condition.

Syntax:

arrayCount([func,] arr)

Arguments:

  • func (optional): A lambda function that defines the condition to check for each element.
  • arr (Array): The array to count elements from.

Returns:

The number of elements that meet the condition (or non-zero elements if no function is specified).

Example:

SELECT
  taco_order_id,
  taco_toppings,
  arrayCount(x -> x = 'cheese', taco_toppings) AS cheese_count
FROM
  taco_orders;

Result:

| taco_order_id | taco_toppings                 | cheese_count |
|---------------|-------------------------------|--------------|
| 1             | ['cheese', 'lettuce', 'beef'] | 1            |
| 2             | ['salsa', 'guacamole']        | 0            |
| 3             | ['cheese', 'cheese', 'onion'] | 2            |

In this example, arrayCount is used to count how many times ‘cheese’ appears in the taco_toppings array for each taco order. The function x -> x = 'cheese' checks if each topping is ‘cheese’.

arrayDotProduct

Calculates the dot product of two arrays.

Syntax:

arrayDotProduct(vector1, vector2)

Alternatively, you can use the aliases:

scalarProduct(vector1, vector2)
dotProduct(vector1, vector2)

Arguments:

  • vector1 (Array or Tuple of numeric values): The first vector.
  • vector2 (Array or Tuple of numeric values): The second vector.

Returns:

A numeric value representing the dot product of the two input vectors.

Example:

SELECT
  arrayDotProduct([1, 2, 3], [4, 5, 6]) AS dot_product

Result:

| dot_product |
|-------------|
| 32          |

In this example:

  • The arrayDotProduct calculates the dot product of two numeric arrays: 14 + 25 + 3*6 = 32.
  • The sizes of the two vectors must be equal.
  • Arrays and Tuples may contain mixed element types.
  • The return type is determined by the type of the arguments. If Arrays or Tuples contain mixed element types, the result type is the supertype.

countEqual

Counts the number of elements in the array equal to the specified value.

Syntax:

countEqual(arr, x)

Arguments:

  • arr (Array): The array to search in.
  • x (any): The value to count occurrences of.

Returns:

An integer representing the count of elements equal to x.

Example:

SELECT
  taco_order_id,
  taco_toppings,
  countEqual(taco_toppings, 'cheese') AS cheese_count
FROM
  taco_orders;

Result:

| taco_order_id | taco_toppings                 | cheese_count |
|---------------|-------------------------------|--------------|
| 1             | ['cheese', 'lettuce']         | 1            |
| 2             | ['cheese', 'salsa', 'cheese'] | 2            |
| 3             | ['guacamole', 'salsa']        | 0            |

In this example:

  • For the first order, ‘cheese’ appears once in the toppings.
  • For the second order, ‘cheese’ appears twice.
  • For the third order, ‘cheese’ does not appear at all.

The countEqual function allows you to easily count specific toppings across different taco orders.

arrayEnumerate

Returns an array of the same size as the input array, containing the numbers from 1 to the length of the input array.

Syntax:

arrayEnumerate(arr)

Arguments:

  • arr - The input array of any type.

Returns

An array of UInt32 values from 1 to the length of the input array.

Example:

SELECT
  arrayEnumerate(['salsa', 'guacamole', 'sour cream']) AS taco_topping_numbers;

Result:

| taco_topping_numbers |
|----------------------|
| [1,2,3]              |

This function is often used with ARRAY JOIN. It allows counting something just once for each array after applying ARRAY JOIN. For example:

SELECT
  count() AS total_orders,
  countIf(topping_num = 1) AS orders_with_first_topping
FROM
  taco_orders
ARRAY JOIN
  taco_toppings AS topping,
  arrayEnumerate(taco_toppings) AS topping_num
WHERE
  order_id = 12345;

Result:

| total_orders | orders_with_first_topping |
|--------------|---------------------------|
| 3            | 1                         |

In this example, total_orders is the number of toppings (the rows received after applying ARRAY JOIN), and orders_with_first_topping is the number of orders that have at least one topping.

arrayEnumerateUniq

Returns an array the same size as the source array, indicating for each element what its position is among elements with the same value.

Syntax:

arrayEnumerateUniq(arr, ...)

Arguments:

  • arr - Array to enumerate.
  • ... - Additional arrays to consider when determining uniqueness (optional).

Returns

An array of the same size as the input, containing the position of each element among elements with the same value.

Example:

SELECT
  arrayEnumerateUniq(['salsa', 'guacamole', 'salsa', 'sour cream']) AS taco_toppings_enum;

Result:

| taco_toppings_enum |
|--------------------|
| [1,1,2,1]          |

In this example:

  • The first ‘salsa’ gets position 1
  • ’guacamole’ gets position 1 (first unique value)
  • The second ‘salsa’ gets position 2 (second occurrence)
  • ‘sour cream’ gets position 1 (first unique value)

This function is useful when using ARRAY JOIN and aggregation of array elements. It allows counting distinct elements while preserving information about their order of appearance.

arrayEnumerateUniqRanked

Returns an array the same size as the source array, indicating for each element what its position is among elements with the same value. It allows for enumeration of a multidimensional array with the ability to specify how deep to look inside the array.

Syntax:

arrayEnumerateUniqRanked(clear_depth, arr, max_array_depth)

Arguments:

  • clear_depth (integer): Enumerate elements at the specified level separately. Must be positive and less than or equal to max_array_depth.
  • arr (Array): N-dimensional array to enumerate.
  • max_array_depth (integer): The maximum effective depth. Must be positive and less than or equal to the depth of arr.

Returns:

An array of the same size as the input array, with elements indicating their position among elements with the same value.

Example:

SELECT
  arrayEnumerateUniqRanked(1, [[1,2,3],[2,2,1],[3]], 2) AS res;

Result:

| res                   |
|-----------------------|
| [[1,1,1],[2,3,2],[2]] |

In this example:

  • For the first row [1,2,3], the result is [1,1,1] because each element is encountered for the first time.
  • For the second row [2,2,1], the result is [2,3,2] because 2 is encountered for the second and third time, and 1 is encountered for the second time.
  • For the third row [3], the result is [2] because 3 is encountered for the second time.

Changing clear_depth to 2 would result in elements being enumerated separately for each row:

SELECT
  arrayEnumerateUniqRanked(2, [[1,2,3],[2,2,1],[3]], 2) AS res;

Result:

| res                   |
|-----------------------|
| [[1,1,1],[1,2,1],[1]] |

This function is useful for analyzing the uniqueness and order of elements in nested arrays, particularly when working with complex data structures in analytical queries.

arrayPopBack

Removes the last item from the array.

Syntax:

arrayPopBack(`Array`)

Arguments:

  • array - The input array to remove the last element from.

Returns:

An array with the last element removed.

Example:

SELECT
  arrayPopBack(['beef', 'chicken', 'fish']) AS taco_fillings;

Result:

| taco_fillings       |
|---------------------|
| ['beef', 'chicken'] |

In this example, the arrayPopBack function removes the last element ‘fish’ from the array of taco fillings, returning an array with only ‘beef’ and ‘chicken’.

arrayPopFront

Removes the first item from the array.

Syntax:

arrayPopFront(`Array`)

Arguments:

  • array - The input array to remove the first element from.

Returns:

An array with the first element removed.

Example:

SELECT
  arrayPopFront(['beef', 'chicken', 'fish']) AS taco_fillings;

Result:

| taco_fillings       |
|---------------------|
| ['chicken', 'fish'] |

In this example, ‘beef’ is removed from the beginning of the array of taco fillings, leaving only ‘chicken’ and ‘fish’.

arrayPushBack

Adds one item to the end of the array.

Syntax:

arrayPushBack(array, single_value)

Arguments:

  • array (Array): The original array to add an element to.
  • single_value (any): The value to add to the end of the array. Must match the type of elements in the array.

Returns:

An array with the new element added to the end.

Example:

SELECT
  arrayPushBack(['cheese', 'lettuce'], 'salsa') AS taco_toppings;

Result:

| taco_toppings                |
|------------------------------|
| ['cheese','lettuce','salsa'] |

In this example, ‘salsa’ is added as a new topping to the end of the existing taco toppings array.

  • The function adds a NULL element to an array if single_value is NULL, and the type of array elements converts to Nullable.

arrayPushFront

Adds one element to the beginning of the array.

Syntax:

arrayPushFront(array, single_value)

Arguments:

  • array: The original array to add an element to.
  • single_value: The value to add to the beginning of the array. Must match the type of elements in the array.

Returns:

A new array with the single_value added to the beginning.

Example:

SELECT
  arrayPushFront(['salsa', 'guacamole'], 'cheese') AS taco_toppings;

Result:

| taco_toppings                  |
|--------------------------------|
| ['cheese','salsa','guacamole'] |

In this example, ‘cheese’ is added to the beginning of the taco toppings array.

  • Only values of the same type as the array elements can be added.
  • For numeric arrays, ClickHouse automatically sets the single_value type to match the array element type.
  • NULL can be added to an array, which will convert the array elements to Nullable type.

arrayResize

Resizes an array to the specified length, either truncating or extending it as needed.

Syntax:

arrayResize(array, size[, extender])

Arguments:

  • array (Array): The input array to resize.
  • size (integer): The desired length of the resulting array.
  • extender (optional): The value to use for extending the array if needed. If not specified, the default value for the array’s data type is used.

Returns:

An array of the specified length.

Example:

SELECT
  arrayResize(['beef', 'chicken', 'pork'], 5, 'veggie') AS extended_menu,
  arrayResize(['salsa', 'guacamole', 'sour cream', 'cheese', 'lettuce'], 3) AS limited_toppings

Result:

| extended_menu                               | limited_toppings                   |
|---------------------------------------------|------------------------------------|
| ['beef','chicken','pork','veggie','veggie'] | ['salsa','guacamole','sour cream'] |

In this example:

  • extended_menu extends the original array of taco meats to length 5, filling the extra spots with ‘veggie’.
  • limited_toppings truncates the original array of taco toppings to length 3.

arraySlice

Returns a slice of the array.

Syntax:

arraySlice(array, offset[, length])

Arguments:

  • array: The input array to slice.
  • offset: The starting index of the slice. Positive values indicate an offset from the beginning, negative values from the end. Indexing starts at 1.
  • length: Optional. The length of the slice. If omitted, returns elements from offset to the end of the array.

Returns:

An array containing the specified slice of elements.

Example:

SELECT
  arraySlice(['beef', 'chicken', 'pork', 'fish', 'veggie'], 2, 3) AS taco_fillings;

Result:

| taco_fillings                                 |
|-----------------------------------------------|
| ['chicken', 'pork', 'fish']                   |

In this example, we slice the array of taco fillings starting from the 2nd element and taking 3 elements. The result is an array with ‘chicken’, ‘pork’, and ‘fish’.

Array elements set to NULL are handled as normal values. If the offset or length goes beyond the array bounds, the function will return as many elements as possible without raising an error.

arrayShingles

Generates an array of “shingles”, i.e. consecutive sub-arrays with specified length from the input array.

Syntax:

arrayShingles(array, length)

Arguments:

  • array (Array): The input array to generate shingles from.
  • length (UInt8): The length of each shingle.

Returns:

An array of arrays, where each inner array is a shingle of the specified length.

Example:

SELECT
  arrayShingles(['beef', 'cheese', 'lettuce', 'tomato', 'onion'], 3) AS taco_shingles;

Result:

| taco_shingles                                                                            |
|------------------------------------------------------------------------------------------|
| [['beef','cheese','lettuce'],['cheese','lettuce','tomato'],['lettuce','tomato','onion']] |

In this example, arrayShingles generates 3-element shingles from the taco ingredients array. Each shingle represents a consecutive group of 3 ingredients from the original array.

arraySort

Sorts the elements of the input array in ascending order. If a function is specified, the sorting order is determined by the result of applying that function to the array elements.

Syntax:

arraySort([func,] arr)

Arguments:

  • func (optional): A function to apply to array elements to determine the sorting order.
  • arr: The array to be sorted.

Returns:

An array with the same elements as the input array, sorted in ascending order.

Example:

SELECT
  arraySort((x) -> -x, [3, 1, 2, 4]) AS sorted_tacos

Result:

| sorted_tacos |
|--------------|
| [4,3,2,1]    |

In this example, we sort an array of taco quantities in descending order by using a lambda function (x) -> -x. This effectively reverses the default ascending sort order.

  • NULL values are placed at the end of the sorted array.
  • For floating-point numbers, NaN comes after positive infinity.
  • arraySort is a higher-order function, meaning you can pass a lambda function as the first argument to customize the sorting behavior.
  • When sorting arrays of strings, the comparison is done lexicographically.

arrayPartialSort

Partially sorts the elements of the arr array in ascending order. If the func function is specified, sorting order is determined by the result of the func function applied to the elements of the array.

Syntax:

arrayPartialSort([func,] limit, arr)

Arguments:

  • func (optional): A lambda function that determines the sorting order. If not specified, elements are sorted directly.
  • limit: The number of elements to sort. Must be a positive integer.
  • arr: The array to be partially sorted.

Returns:

An array of the same size as the original array where elements in range [1..limit] are sorted in ascending order. Remaining elements (limit..N] contain elements in unspecified order.

Example:

SELECT
arrayPartialSort(3, ['salsa', 'guacamole', 'sour cream', 'cheese', 'lettuce']) AS taco_toppings

Result:

| taco_toppings                                             |
|-----------------------------------------------------------|
| ['cheese', 'guacamole', 'lettuce', 'salsa', 'sour cream'] |

In this example, the first 3 elements of the taco toppings array are sorted alphabetically, while the remaining elements stay in their original positions.

arrayReverseSort

Sorts the elements of the arr array in descending order. If the func function is specified, arr is sorted according to the result of the func function applied to the elements of the array, and then the sorted array is reversed.

Syntax:

arrayReverseSort(arr)

or

arrayReverseSort(func, arr)

Arguments:

  • arr (Array): The array to be sorted.
  • func (optional): A function to apply to array elements before sorting.

Returns:

An array with elements sorted in descending order.

Example:

SELECT
  arrayReverseSort(['salsa', 'guacamole', 'sour cream', 'cheese']) AS taco_toppings;

Result:

| taco_toppings                                  |
|------------------------------------------------|
| ['sour cream', 'salsa', 'guacamole', 'cheese'] |

In this example, the taco toppings are sorted in descending alphabetical order.

Example with function:

SELECT
  arrayReverseSort(x -> length(x), ['salsa', 'guacamole', 'sour cream', 'cheese']) AS taco_toppings_by_length;

Result:

| taco_toppings_by_length                        |
|------------------------------------------------|
| ['sour cream', 'guacamole', 'cheese', 'salsa'] |

Here, the taco toppings are sorted by their length in descending order.

The arrayReverseSort function is a higher-order function, allowing you to pass a lambda function as the first argument to customize the sorting criteria.

arrayPartialReverseSort

Partially sorts the elements of the arr array in descending order. If the func function is specified, arr is sorted according to the result of the func function applied to the elements of the array, and then the sorted array is reversed. The sorting is limited to the first limit elements.

Syntax:

arrayPartialReverseSort([func,] limit, arr, ...)

Arguments:

  • func (optional): A lambda function to apply to array elements before sorting.
  • limit: The number of elements to sort.
  • arr: The array to be partially sorted.

Returns:

An array of the same size as the original array where elements in range [1..limit] are sorted in descending order. Remaining elements (limit..N] contain elements in unspecified order.

Example:

SELECT
  arrayPartialReverseSort(3, ['salsa', 'guacamole', 'sour cream', 'cheese', 'lettuce']) AS taco_toppings;

Result:

| taco_toppings                                             |
|-----------------------------------------------------------|
| ['sour cream', 'salsa', 'guacamole', 'cheese', 'lettuce'] |

In this example, the first 3 elements of the taco toppings array are sorted in descending order, while the remaining elements (‘cheese’ and ‘lettuce’) are left in their original positions.

arrayShuffle

Shuffles the elements of an array randomly.

Syntax:

arrayShuffle(arr[, seed])

Arguments:

  • arr (Array): The input array to be shuffled.
  • seed (optional, integer): Seed value for the random number generator. If not provided, a random seed is used.

Returns:

An array with the same elements as the input array, but in a random order.

Example:

SELECT
  arrayShuffle(['beef', 'chicken', 'fish', 'veggie'], 42) AS shuffled_tacos;

Result:

| shuffled_tacos                        |
|---------------------------------------|
| ['fish', 'veggie', 'beef', 'chicken'] |

In this example, the array of taco fillings is shuffled randomly using a seed value of 42. The resulting array contains the same elements but in a different order. Note that using the same seed will always produce the same shuffled result.

The exact order of elements in the result may vary between ClickHouse versions or implementations. The example shows one possible outcome.

arrayPartialShuffle

Partially shuffles elements in an array randomly.

Syntax:

arrayPartialShuffle(arr, limit[, seed])

Arguments:

  • arr (Array): The input array to partially shuffle.
  • limit (integer): The number of elements to shuffle, starting from the beginning of the array.
  • seed (optional integer): Seed value for the random number generator. If not provided, a random seed is used.

Returns:

An array with the first limit elements shuffled randomly, and the remaining elements in their original order.

Example:

SELECT
arrayPartialShuffle(['beef', 'chicken', 'pork', 'fish', 'shrimp'], 3) AS shuffled_meats;

Result:

| shuffled_meats                            |
|-------------------------------------------|
| ['pork','beef','chicken','fish','shrimp'] |

In this example, the first 3 elements of the taco meat options array are shuffled randomly, while ‘fish’ and ‘shrimp’ remain in their original positions at the end of the array. The exact order of the shuffled elements may vary due to the random nature of the function.

The function will not materialize constants. The exact result may differ each time the query is run unless a seed value is provided.

arrayUniq

Returns the number of distinct elements in the array.

Syntax:

arrayUniq(arr)

Arguments:

  • arr - The input array.

Returns:

The number of distinct elements in the array.

Example:

SELECT
arrayUniq(['beef', 'chicken', 'beef', 'salsa', 'chicken']) AS unique_toppings;

Result:

| unique_toppings |
|-----------------|
| 3               |

In this example, arrayUniq counts the number of unique toppings in the taco order array. Even though ‘beef’ and ‘chicken’ appear twice, they are only counted once each, resulting in 3 unique toppings.

The function treats NULL values as distinct elements. The order of elements in the array does not affect the result.

arrayJoin

Converts an array into a set of rows.

Syntax:

arrayJoin(arr)

Arguments:

  • arr - Array to convert to rows.

Returns:

A set of rows, one for each element in the original array.

Example:

SELECT
  taco_id,
  arrayJoin(toppings) AS topping
FROM
  taco_orders;

Result:

| taco_id | topping    |
|---------|------------|
| 1       | cheese     |
| 1       | lettuce    |
| 1       | salsa      |
| 2       | beef       |
| 2       | guacamole  |
| 3       | chicken    |
| 3       | sour cream |

In this example, arrayJoin expands the toppings array into individual rows, allowing us to analyze each topping separately while maintaining the association with the original taco order.

arrayJoin can significantly increase the number of rows in the result set. Use it carefully with large arrays to avoid performance issues.

arrayDifference

Calculates the difference between adjacent elements in the array.

Syntax:

arrayDifference(arr)

Arguments:

  • arr (Array): The input array to calculate differences for.

Returns:

An array of the same size as the input, where each element is the difference between the current and previous element. The first element is always 0.

Example:

SELECT
  arrayDifference([1, 3, 5, 7, 9]) AS taco_price_differences;

Result:

| taco_price_differences |
|------------------------|
| [0,2,2,2,2]            |

In this example, we calculate the price differences between adjacent tacos. The first element is 0, and the subsequent elements show the price increase from the previous taco.

The function works with numeric arrays. For non-numeric arrays, it may produce unexpected results or errors.

arrayDistinct

Returns an array containing only the distinct elements from the input array.

Syntax:

arrayDistinct(`Array`)

Arguments:

  • array: The input array to remove duplicates from. [Array]

Returns

  • An array containing only the distinct elements from the input array. [Array]

Example:

SELECT
  arrayDistinct(['beef', 'chicken', 'beef', 'salsa', 'chicken']) AS unique_toppings;

Result:

| unique_toppings            |
|----------------------------|
| ['beef','chicken','salsa'] |

In this example, arrayDistinct removes the duplicate ‘beef’ and ‘chicken’ elements, returning an array with only the unique taco toppings.

arrayEnumerateDense

Returns an array of the same size as the source array, indicating where each element first appears in the source array.

Syntax:

arrayEnumerateDense(arr)

Arguments:

  • arr - The source array. [Array]

Returns

  • An array of the same size as the input, containing the position of each element’s first appearance. [Array]

Example:

SELECT
  arrayEnumerateDense(['salsa', 'guacamole', 'salsa', 'sour cream', 'guacamole']) AS taco_toppings_enumerated;

Result:

| taco_toppings_enumerated |
|--------------------------|
| [1,2,1,3,2]              |

In this example:

  • ‘salsa’ appears first, so it gets 1
  • ’guacamole’ appears second, so it gets 2
  • ’salsa’ appears again, but it already had 1, so it keeps 1
  • ’sour cream’ is new, so it gets the next number, 3
  • ’guacamole’ appears again, but keeps its original 2

The function is useful for finding unique elements in an array while preserving their order of appearance.

arrayEnumerateDenseRanked

Returns an array the same size as the source array, indicating where each element first appears in the source array. It allows for enumeration of a multidimensional array with the ability to specify how deep to look inside the array.

Syntax:

arrayEnumerateDenseRanked(clear_depth, arr, max_array_depth)

Arguments:

  • clear_depth (Integer): Enumerate elements at the specified level separately. Must be positive and less than or equal to max_array_depth.
  • arr (Array): N-dimensional array to enumerate.
  • max_array_depth (Integer): The maximum effective depth. Must be positive and less than or equal to the depth of arr.

Returns:

An array containing the enumeration results.

Example:

SELECT
  arrayEnumerateDenseRanked(1, [[1,2,3],[2,2,1],[3]], 2) AS res;

Result:

| res                     |
|-------------------------|
| [[1,1,1],[2,3,2],[2]]   |

In this example:

  • The first row [1,2,3] is enumerated as [1,1,1] since each element appears for the first time.
  • The second row [2,2,1] is enumerated as [2,3,2], where 2 is the second unique element, 2 is the third occurrence of 2, and 1 is the second unique element.
  • The third row [3] is enumerated as [2] since 3 is the second unique element overall.

When clear_depth=2, elements are enumerated separately for each row. For example, SELECT arrayEnumerateDenseRanked(2, [[1,2,3],[2,2,1],[3]], 2) would return [[1,1,1],[1,2,1],[1]].

arrayUnion

Combines multiple arrays and returns an array containing all unique elements from the input arrays.

Syntax:

arrayUnion(arr1, arr2, ...)

Arguments:

  • arr1, arr2, … - Arrays to combine. Can be any number of arrays of the same type.

Returns:

An array containing all unique elements from the input arrays.

Example:

SELECT
  arrayUnion(['beef', 'chicken'], ['chicken', 'fish'], ['pork']) AS taco_proteins;

Result:

| taco_proteins                       |
|-------------------------------------|
| ['beef', 'chicken', 'fish', 'pork'] |

In this example, arrayUnion combines the protein options for tacos from different arrays, removing duplicates (‘chicken’) to create a single array of unique taco protein choices.

arrayIntersect

Returns an array containing the intersection of elements from all input arrays.

Syntax:

arrayIntersect(arr1, arr2, ...)

Arguments:

  • arr1, arr2, … - Arrays to intersect. Must be at least two arrays.

Returns:

An array containing elements present in all input arrays.

Example:

SELECT
  arrayIntersect(['beef', 'cheese', 'lettuce'], ['cheese', 'salsa', 'lettuce'], ['cheese', 'lettuce', 'guacamole']) AS common_toppings;

Result:

| common_toppings           |
|---------------------------|
| ['cheese','lettuce']      |

In this example, arrayIntersect finds the common toppings across three different taco orders. The result shows that ‘cheese’ and ‘lettuce’ are the only toppings present in all three arrays.

  • The order of elements in the resulting array may not match the order in the input arrays.
  • If there are no common elements, an empty array is returned.
  • The function preserves duplicates in the result if they appear in all input arrays.

arrayJaccardIndex

Calculates the Jaccard index between two arrays.

Syntax:

arrayJaccardIndex(array1, array2)

Arguments:

  • array1 (Array): The first array to compare.
  • array2 (Array): The second array to compare.

Returns:

A numeric value representing the Jaccard index between the two arrays.

Example:

SELECT
  arrayJaccardIndex(
    ['cheese', 'salsa', 'guacamole'],
    ['cheese', 'sour cream', 'lettuce']
  ) AS taco_similarity;

Result:

| taco_similarity |
|-----------------|
| 0.2             |

In this example, the Jaccard index is calculated between two arrays of taco toppings. The result 0.2 indicates a relatively low similarity between the two topping combinations, as they only share one common element (‘cheese’) out of a total of 5 unique elements.

The Jaccard index ranges from 0 to 1, where 0 means the arrays have no elements in common, and 1 means the arrays are identical.

arrayReduce

Applies an aggregate function to array elements and returns its result.

Syntax:

arrayReduce(agg_func, arr1, arr2, ..., arrN)

Arguments:

  • agg_func - The name of an aggregate function as a string constant.
  • arr1, arr2, ..., arrN - Arrays to aggregate. Must have the same size.

Returns:

The result of applying the aggregate function to the array elements.

Example:

SELECT
  arrayReduce('sum', [1, 2, 3, 4, 5]) AS total_tacos

Result:

| total_tacos |
|-------------|
| 15          |

In this example, arrayReduce sums up the number of tacos in each order, resulting in a total of 15 tacos.

For parametric aggregate functions, specify the parameter in parentheses after the function name, e.g. ‘uniqUpTo(3)‘.

The function can be used with multiple array arguments if the aggregate function accepts multiple arguments.

arrayReduceInRanges

Applies an aggregate function to array elements in given ranges and returns an array containing the result corresponding to each range.

Syntax:

arrayReduceInRanges(agg_func, ranges, arr1, ...)

Arguments:

  • agg_func — The name of an aggregate function as a string.
  • ranges — An array of tuples containing the index and length of each range.
  • arr1, ... — One or more arrays to aggregate.

Returns

  • An array containing results of the aggregate function over specified ranges.

Example:

SELECT
  arrayReduceInRanges(
    'sum',
    [(1, 3), (2, 2)],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  ) AS taco_sums

Result:

| taco_sums    |
|--------------|
| [6, 5]       |

In this example:

  • The first range (1, 3) sums elements [2, 3, 4] resulting in 9
  • The second range (2, 2) sums elements [3, 4] resulting in 7
  • The result [9, 7] represents the sums for each specified range

This function is useful for calculating aggregates over specific portions of an array, such as summing taco sales for different time periods within a day.

arrayFold

Applies a lambda function to elements of one or more arrays and collects the result in an accumulator.

Syntax:

arrayFold(lambda_function, arr1, arr2, ..., accumulator)

Arguments:

  • lambda_function - Lambda function to apply to array elements. Should take the accumulator as the first argument, followed by array elements.
  • arr1, arr2, etc. - Input arrays to process.
  • accumulator - Initial value of the accumulator.

Returns

The final value of the accumulator after applying the lambda function to all array elements.

Example:

Calculate the sum of taco prices, doubling each price:

SELECT
  arrayFold((acc, x) -> acc + x*2, [5, 8, 10], 0) AS total_price

Result:

| total_price |
|-------------|
| 46          |

In this example:

  • The lambda function (acc, x) -> acc + x*2 is applied to each element of the array [5, 8, 10]
  • The accumulator starts at 0
  • For each taco price, it’s doubled and added to the accumulator
  • The final result is (52) + (82) + (10*2) = 46

arrayReverse

Reverses the order of elements in an array.

Syntax:

arrayReverse(arr)

Arguments:

  • arr - The input array to reverse.

Returns:

An array with the elements in reverse order.

Example:

SELECT
  arrayReverse(['beef', 'chicken', 'fish']) AS reversed_taco_meats;

Result:

| reversed_taco_meats            |
|--------------------------------|
| ['fish','chicken','beef']      |

In this example, arrayReverse is used to reverse the order of taco meat options. The original array ['beef', 'chicken', 'fish'] is reversed to ['fish', 'chicken', 'beef'].

reverse

Reverses the order of elements in an array.

Syntax:

reverse(arr)

Arguments:

  • arr - The array to reverse. [Array]

Returns

  • An array with the elements in reverse order. [Array]

Example:

SELECT
  reverse(['salsa', 'guacamole', 'sour cream']) AS reversed_toppings;

Result:

| reversed_toppings                  |
|------------------------------------|
| ['sour cream','guacamole','salsa'] |

In this example, the reverse function is applied to an array of taco toppings, reversing their order.

The reverse function is a synonym for arrayReverse.

arrayFlatten

Converts an array of arrays into a flat array.

Syntax:

arrayFlatten(array_of_arrays)

Arguments:

  • array_of_arrays (Array): The nested array to flatten.

Returns:

An array containing all the elements from all source arrays.

Example:

SELECT
  arrayFlatten([['beef', 'chicken'], ['lettuce', 'tomato'], ['cheese']]) AS flattened_toppings;

Result:

| flattened_toppings                                 |
|----------------------------------------------------|
| ['beef', 'chicken', 'lettuce', 'tomato', 'cheese'] |

In this example, arrayFlatten is used to combine multiple arrays of taco toppings into a single flat array containing all the toppings.

The function applies to any depth of nested arrays and does not change arrays that are already flat.

arrayCompact

Removes consecutive duplicate elements from an array. The order of result values is determined by the order in the source array.

Syntax:

arrayCompact(arr)

Arguments:

  • arr - The array to inspect. [Array]

Returns

The array without consecutive duplicates. [Array]

Example:

SELECT
  arrayCompact(['beef', 'beef', 'chicken', 'chicken', 'beef']) AS taco_fillings;

Result:

| taco_fillings             |
|---------------------------|
| ['beef','chicken','beef'] |

In this example, arrayCompact removes the consecutive duplicate ‘beef’ and ‘chicken’ elements from the array of taco fillings, while preserving the order and keeping the non-consecutive ‘beef’ at the end.

arrayZip

Combines multiple arrays into a single array of tuples. The resulting array contains the corresponding elements of the source arrays grouped into tuples in the order of the input arrays.

Syntax:

arrayZip(arr1, arr2, ..., arrN)

Arguments:

  • arr1, arr2, …, arrN - Arrays to be combined. All input arrays must be of equal size.

Returns:

An array of tuples containing elements from the source arrays grouped together.

Example:

SELECT
  arrayZip(['beef', 'chicken', 'fish'], ['salsa', 'guacamole', 'sour cream']) AS taco_combinations;

Result:

| taco_combinations                                                |
|------------------------------------------------------------------|
| [('beef','salsa'),('chicken','guacamole'),('fish','sour cream')] |

In this example, arrayZip combines the protein options with the sauce options for tacos, creating an array of tuples where each tuple represents a possible taco combination.

  • All input arrays must have the same number of elements.
  • The function can take any number of arrays of different types.
  • The resulting array will have the same number of elements as each input array.

arrayZipUnaligned

Combines multiple arrays into a single array, allowing for unaligned arrays. The resulting array contains the corresponding elements of the source arrays grouped into tuples in the listed order of arguments.

Syntax:

arrayZipUnaligned(arr1, arr2, ..., arrN)

Arguments:

  • arr1, arr2, …, arrN - Arrays of any type.

Returns

  • Array with elements from the source arrays grouped into tuples. Data types in the tuple are the same as types of the input arrays and in the same order as arrays are passed. If the arrays have different sizes, the shorter arrays will be padded with null values.

Example:

SELECT
  arrayZipUnaligned(['salsa', 'guacamole'], ['mild', 'spicy', 'extra spicy']) AS taco_toppings;

Result:

| taco_toppings                                                 |
|---------------------------------------------------------------|
| [('salsa','mild'),('guacamole','spicy'),(NULL,'extra spicy')] |

In this example, arrayZipUnaligned combines two arrays of taco toppings and their spice levels. Since the first array is shorter, it is padded with a NULL value to match the length of the second array.

arrayAUC

Calculates the Area Under the Curve (AUC) for an array of scores and labels.

Syntax:

arrayAUC(arr_scores, arr_labels[, scale])

Arguments:

  • arr_scores (array of numeric values): The array of prediction scores.
  • arr_labels (array of 0s and 1s): The array of true labels.
  • scale (optional boolean): Whether to return the normalized AUC. Default is true.

Returns:

A Float64 value representing the AUC.

Example:

SELECT
  arrayAUC([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]) AS auc_score;

Result:

| auc_score |
|-----------|
| 0.75      |

In this example, we calculate the AUC for a taco rating prediction model. The arr_scores array contains the model’s predicted scores for 4 tacos, and the arr_labels array contains the true ratings (0 for disliked, 1 for liked). The resulting AUC of 0.75 indicates that the model has good predictive performance in distinguishing between liked and disliked tacos.

arrayMap

Applies a function to each element of an array and returns an array of results.

Syntax:

arrayMap(func, arr1, ...)

Arguments:

  • func (function): The function to apply to each element. Can be a lambda function.
  • arr1 (Array): The input array to map over.
  • ... (optional): Additional arrays if the function takes multiple arguments.

Returns:

An array containing the results of applying the function to each element.

Example:

SELECT
  arrayMap(x -> x * 2, [1, 2, 3]) AS doubled_taco_prices,
  arrayMap((x, y) -> x || ' with ' || y,
            ['beef', 'chicken', 'fish'],
                ['salsa', 'guacamole', 'sour cream']) AS taco_descriptions

Result:

| doubled_taco_prices | taco_descriptions                                                     |
|---------------------|-----------------------------------------------------------------------|
| [2, 4, 6]           | ['beef with salsa', 'chicken with guacamole', 'fish with sour cream'] |

In this example:

  • doubled_taco_prices doubles each element in the input array.
  • taco_descriptions combines elements from two arrays to create taco descriptions.

The arrayMap function is a higher-order function. You must pass a lambda function to it as the first argument.

arrayFilter

Filters elements in an array based on a specified condition.

Syntax:

arrayFilter(func, arr1, ...)

Arguments:

  • func (lambda function): The condition to apply to each element.
  • arr1 (Array): The array to filter.
  • ... (optional arrays): Additional arrays to use in the lambda function.

Returns:

An array containing only the elements that satisfy the condition.

Example:

SELECT
  arrayFilter(x -> x LIKE '%cheese%', ['beef taco', 'cheese taco', 'chicken taco']) AS cheese_tacos;

Result:

| cheese_tacos    |
|-----------------|
| ['cheese taco'] |

In this example, arrayFilter returns an array containing only the taco types that include ‘cheese’.

The arrayFilter function is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayFill

Replaces elements in an array with a specified value based on a condition.

Syntax:

arrayFill(func, arr1, ...)

Arguments:

  • func: A lambda function that defines the condition for filling elements.
  • arr1: The input array to be modified.
  • ...: Additional arrays if the lambda function takes multiple arguments.

Returns:

An array with elements replaced according to the condition.

Example:

SELECT
  arrayFill(x -> x = 'plain', ['beef', 'plain', 'chicken', 'plain', 'veggie']) AS filled_tacos

Result:

| filled_tacos                                 |
|----------------------------------------------|
| ['beef', 'plain', 'plain', 'plain', 'plain'] |

In this example, the arrayFill function replaces all elements after and including the first ‘plain’ taco with ‘plain’. The lambda function x -> x = 'plain' checks if each element is ‘plain’, and once true, all subsequent elements are filled with ‘plain’.

The arrayFill function is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayReverseFill

Scans through arr1 from the last element to the first element and replaces arr1[i] by arr1[i + 1] if func(arr1[i], ..., arrN[i]) returns 0. The last element of arr1 will not be replaced.

Syntax:

arrayReverseFill(func, arr1, ...)

Arguments:

  • func (lambda function): The function to apply to each element.
  • arr1 (Array): The array to modify.
  • ... (optional): Additional arrays to pass as arguments to func.

Returns:

An array with elements replaced according to the function results.

Example:

SELECT
  arrayReverseFill(x -> x != 'cheese',
                   ['salsa', 'cheese', 'cheese', 'guacamole', 'cheese']) AS taco_toppings

Result:

| taco_toppings                                              |
|------------------------------------------------------------|
| ['salsa', 'guacamole', 'guacamole', 'guacamole', 'cheese'] |

In this example, the function replaces ‘cheese’ toppings with the next non-cheese topping, working from right to left. The rightmost ‘cheese’ remains unchanged.

The arrayReverseFill function is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arraySplit

Splits an array into multiple arrays based on a condition.

Syntax:

arraySplit(func, arr1, ...)

Arguments:

  • func (function): A function that returns a boolean value indicating where to split the array.
  • arr1 (Array): The array to split.
  • ... (optional): Additional arrays to pass as arguments to func.

Returns:

An array of arrays containing the split segments of the original array.

Example:

SELECT
  arraySplit((x, y) -> y,
    ['beef', 'chicken', 'pork', 'fish', 'tofu'],
    [0, 1, 0, 1, 0]
  ) AS taco_menu_sections

Result:

| taco_menu_sections                                |
|---------------------------------------------------|
| [['beef', 'pork', 'tofu'], ['chicken', 'fish']]   |

In this example, the taco menu is split into meat and non-meat sections based on the boolean array. The 1 values indicate where to split, resulting in two sub-arrays.

The array will not be split before the first element. The function is a higher-order function, so a lambda function must be passed as the first argument.

arrayReverseSplit

Splits an array into multiple arrays based on a condition, starting from the end of the array. When the condition function returns a non-zero value, the array is split on the right side of that element.

Syntax:

arrayReverseSplit(func, arr1, ...)

Arguments:

  • func (function): A condition function that takes array elements as arguments and returns a boolean value.
  • arr1 (Array): The array to split.
  • ... (optional): Additional arrays to pass as arguments to the condition function.

Returns:

An array of arrays containing the split segments.

Example:

SELECT
  arrayReverseSplit((x, y) -> y,
    ['beef', 'chicken', 'pork', 'fish', 'tofu'],
    [0, 0, 1, 0, 1]
  ) AS taco_fillings;

Result:

| taco_fillings                                     |
|---------------------------------------------------|
| [['beef', 'chicken'], ['pork'], ['fish', 'tofu']] |

In this example:

  • The array is split based on the second array [0, 0, 1, 0, 1].
  • Starting from the end, the array is split whenever a 1 is encountered.
  • The result is an array of three sub-arrays representing different taco filling options.

The arrayReverseSplit function is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayExists

Checks whether the array contains at least one element that satisfies the given predicate.

Syntax:

arrayExists([func,] arr)

Arguments:

  • func (optional): A lambda function that defines the predicate. If not specified, the function checks for non-zero elements.
  • arr: The array to check.

Returns:

Returns 1 if at least one element satisfies the predicate, otherwise 0.

Example:

SELECT
  arrayExists(x -> x > 3, [1, 2, 3, 4, 5]) AS has_taco_over_3,
  arrayExists(x -> x = 'salsa', ['cheese', 'lettuce', 'salsa']) AS has_salsa

Result:

| has_taco_over_3 | has_salsa |
|-----------------|-----------|
| 1               | 1         |

In this example:

  • has_taco_over_3 checks if there’s a taco with more than 3 ingredients.
  • has_salsa checks if ‘salsa’ is one of the taco toppings.

Note that arrayExists is a higher-order function. You can pass a lambda function to it as the first argument to define custom predicates.

arrayAll

Checks whether all elements in the array satisfy the given predicate.

Syntax:

arrayAll([func,] arr)

Arguments:

  • func - Optional predicate function. If not specified, checks if all elements are non-zero.
  • arr - Array to check.

Returns:

Returns 1 if all elements in the array satisfy the predicate, 0 otherwise.

Example:

SELECT
  arrayAll(x -> x > 0, [1, 2, 3]) AS all_positive,
  arrayAll(x -> x > 0, [1, -2, 3]) AS not_all_positive,
  arrayAll(x -> x != 'plain', ['spicy', 'bbq', 'garlic']) AS no_plain_tacos

Result:

| all_positive | not_all_positive | no_plain_tacos |
|--------------|------------------|----------------|
| 1            | 0                | 1              |

In this example:

  • all_positive returns 1 because all elements in [1, 2, 3] are greater than 0.
  • not_all_positive returns 0 because -2 is not greater than 0.
  • no_plain_tacos returns 1 because none of the taco flavors are ‘plain’.

arrayAll is a higher-order function, so you can pass a lambda function as the predicate.

arrayFirst

Returns the first element in the array for which the given function returns a non-zero value.

Syntax:

arrayFirst(func, arr)

Arguments:

  • func (function): The function to apply to each element.
  • arr (Array): The input array to search.

Returns:

The first element for which func returns non-zero, or NULL if no such element is found.

Example:

SELECT
  arrayFirst(x -> x > 2, [1, 2, 3, 4, 5]) AS first_over_2,
  arrayFirst(x -> x LIKE '%cheese%', ['salsa taco', 'cheese taco', 'beef taco']) AS first_cheese_taco

Result:

| first_over_2 | first_cheese_taco |
|--------------|-------------------|
| 3            | cheese taco       |

In this example:

  • first_over_2 returns 3, the first element greater than 2.
  • first_cheese_taco returns ‘cheese taco’, the first element containing ‘cheese’.

The function stops processing after finding the first matching element, which can improve performance on large arrays.

arrayFirstOrNull

Returns the first element in the array for which the given function returns a non-zero value, or NULL if there is no such element.

Syntax:

arrayFirstOrNull(func, arr1, ...)

Arguments:

  • func (function): The function to apply to each element of the array.
  • arr1 (Array): The array to search.
  • ... (optional): Additional arrays if the function takes multiple arguments.

Returns:

The first matching element, or NULL if no match is found.

Example:

SELECT
  arrayFirstOrNull(x -> x > 2, [1, 2, 3, 4, 5]) AS first_over_two,
  arrayFirstOrNull(x -> x = 'salsa', ['cheese', 'lettuce', 'salsa', 'guacamole']) AS first_salsa,
  arrayFirstOrNull(x -> x = 'jalapeno', ['cheese', 'lettuce', 'salsa', 'guacamole']) AS no_jalapeno
FROM
  taco_toppings;

Result:

| first_over_two | first_salsa | no_jalapeno |
|----------------|-------------|-------------|
| 3              | salsa       | NULL        |

In this example:

  • first_over_two returns 3, the first element greater than 2.
  • first_salsa returns ‘salsa’, the first element matching ‘salsa’.
  • no_jalapeno returns NULL because no element matches ‘jalapeno’.

The function returns NULL for empty arrays or if no element satisfies the condition.

arrayLast

Returns the last element in the array for which the given function returns a non-zero value.

Syntax:

arrayLast(func, arr)

Arguments:

  • func (function): The function to apply to each element. Must return a non-zero value for the desired element.
  • arr (Array): The input array to search.

Returns:

The last element in the array for which func returns non-zero, or NULL if no such element exists.

Example:

SELECT
  arrayLast(x -> x LIKE '%cheese%', ['salsa taco', 'bean taco', 'cheese taco', 'supreme taco']) AS last_cheese_taco;

Result:

| last_cheese_taco  |
|-------------------|
| cheese taco       |

In this example, arrayLast finds the last taco name containing “cheese” in the array of taco names.

The arrayLast function is a higher-order function. The lambda function passed as the first argument cannot be omitted.

arrayLastOrNull

Returns the last element in the array for which the specified function returns a non-zero value, or NULL if no such element is found.

Syntax:

arrayLastOrNull(func, arr1)

Arguments:

  • func (lambda function): The function to apply to each element of the array.
  • arr1 (Array): The array to search.

Returns:

The last matching element, or NULL if no match is found.

Example:

SELECT
  arrayLastOrNull(x -> x LIKE '%cheese%', ['salsa taco', 'cheese taco', 'beef taco', 'extra cheese taco']) AS last_cheese_taco
FROM
  taco_orders;

Result:

| last_cheese_taco   |
|--------------------|
| extra cheese taco  |

In this example, arrayLastOrNull finds the last taco name containing “cheese” in the array of taco names. It returns “extra cheese taco” as that is the last element matching the condition.

The function returns NULL if no element matches the condition or if the input array is empty.

arrayFirstIndex

Returns the index of the first element in the arr1 array for which func(arr1[i], ..., arrN[i]) returns something other than 0.

Syntax:

arrayFirstIndex([func,] arr1, ...)

Arguments:

  • func (lambda function): The function to apply to each element. Optional.
  • arr1 (Array): The first input array.
  • ... (arrays): Additional input arrays. Optional.

Returns:

The index of the first matching element, or 0 if no match is found.

Example:

SELECT
  arrayFirstIndex(x -> x LIKE '%cheese%', ['salsa', 'sour cream', 'cheese', 'guacamole']) AS cheese_index

Result:

| cheese_index |
|--------------|
| 3            |

In this example, arrayFirstIndex finds the index of the first taco topping containing “cheese”. The lambda function x -> x LIKE '%cheese%' checks each element, and the function returns 3 since “cheese” is the third element in the array.

The arrayFirstIndex function is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayLastIndex

Returns the index of the last element in the array for which the given function returns a non-zero value.

Syntax:

arrayLastIndex(func, arr)

Arguments:

  • func (function): The function to apply to each element. Should return a non-zero value for the desired elements.
  • arr (Array): The input array to search.

Returns:

The index of the last matching element, or 0 if no match is found.

Syntax:

arrayLastIndex(func, arr)

Example:

SELECT
  arrayLastIndex(x -> x LIKE '%cheese%', ['salsa taco', 'cheese taco', 'beef taco', 'extra cheese taco']) AS last_cheese_index;

Result:

| last_cheese_index  |
|--------------------|
| 4                  |

In this example, the function finds the index of the last taco that contains ‘cheese’ in its name. The result is 4, corresponding to ‘extra cheese taco’.

The index is 1-based, meaning the first element has an index of 1.

arrayMin

Returns the minimum element in the source array.

Syntax:

arrayMin([func,] arr)

Arguments:

  • func (optional): Function to apply to array elements before finding the minimum.
  • arr: The input array.

Returns:

The minimum element in the array (or minimum of function values if func is specified).

Example:

SELECT
  arrayMin(['beef', 'chicken', 'fish']) AS min_taco_protein;

Result:

| min_taco_protein  |
|-------------------|
| beef              |

In this example, arrayMin returns the lexicographically smallest taco protein from the array.

Example with function:

SELECT
  arrayMin(x -> length(x), ['beef', 'chicken', 'fish']) AS min_protein_length;

Result:

| min_protein_length |
|--------------------|
| 4                  |

Here, arrayMin applies the length function to each element and returns the minimum length, which is 4 for ‘beef’ and ‘fish’.

arrayMax

Returns the maximum element in the source array.

Syntax:

arrayMax([func,] arr)

Arguments:

  • func (optional): Function to apply to array elements before finding the maximum.
  • arr: The input array.

Returns:

The maximum element in the array (or maximum of function values if func is specified).

Example:

SELECT
  arrayMax(['beef', 'chicken', 'fish']) AS max_taco_protein;

Result:

| max_taco_protein  |
|-------------------|
| fish              |

In this example, arrayMax returns ‘fish’ as the lexicographically maximum taco protein from the array.

Example with function:

SELECT
  arrayMax(x -> length(x), ['beef', 'chicken', 'fish']) AS longest_protein_name;

Result:

| longest_protein_name |
|----------------------|
| 7                    |

Here, arrayMax applies the length function to each element and returns the maximum length, which is 7 for ‘chicken’.

arraySum

Calculates the sum of elements in an array.

Syntax:

arraySum([func,] arr)

Arguments:

  • func (optional): A function to apply to each element before summing.
  • arr: The input array.

Returns:

The sum of the array elements (or function results if func is specified).

Example:

SELECT
  arraySum([1, 2, 3]) AS sum_tacos,
  arraySum(x -> x * 2, [1, 2, 3]) AS sum_double_tacos

Result:

| sum_tacos | sum_double_tacos |
|-----------|------------------|
| 6         | 12               |

In this example:

  • sum_tacos calculates the sum of taco quantities: 1 + 2 + 3 = 6
  • sum_double_tacos doubles each taco quantity before summing: (12) + (22) + (3*2) = 12

The return type depends on the input:

  • For decimal numbers: Decimal128
  • For floating point numbers: Float64
  • For unsigned integers: UInt64
  • For signed integers: Int64

The arraySum function can be useful for calculating totals across array elements, such as summing up taco quantities or prices in an order.

arrayAvg

Calculates the average of elements in the source array.

Syntax:

arrayAvg(arr)

Arguments:

  • arr (Array): The input array to calculate the average from.
  • func (optional): A function to apply to each element before calculating the average.

Returns:

The average of the array elements as a Float64 value.

Or with an optional function:

arrayAvg(func, arr)

Example:

SELECT
  arrayAvg([1, 2, 3, 4, 5]) AS avg_toppings,
  arrayAvg(x -> x * 2, [1, 2, 3, 4, 5]) AS avg_double_toppings
FROM
  taco_orders;

Result:

| avg_toppings | avg_double_toppings |
|--------------|---------------------|
| 3            | 6                   |

In this example:

  • avg_toppings calculates the average number of toppings per taco order.
  • avg_double_toppings doubles each topping count before calculating the average.

The arrayAvg function always returns a Float64 value, even if the input array contains integers.

arrayCumSum

Calculates the cumulative sum of elements in an array.

Syntax:

arrayCumSum(arr)

Arguments:

  • arr - The input array of numeric values.

Returns:

An array containing the cumulative sums of the input array elements.

Example:

SELECT
  arrayCumSum([1, 2, 3, 4, 5]) AS taco_sales_cumsum;

Result:

| taco_sales_cumsum |
|-------------------|
| [1, 3, 6, 10, 15] |

In this example, arrayCumSum calculates the cumulative sum of daily taco sales. The result shows the total number of tacos sold up to each day.

The function can also accept a lambda function as the first argument to transform the array elements before calculating the cumulative sum.

arrayCumSumNonNegative

Calculates the cumulative sum of elements in an array, replacing any negative sums with 0.

Syntax:

arrayCumSumNonNegative(arr)

Arguments:

  • arr - Array of numeric values.

Returns:

An array of the same size as the input, containing the non-negative cumulative sums.

Example:

SELECT
  arrayCumSumNonNegative([1, -2, 3, -4, 5]) AS result;

Result:

| result          |
|-----------------|
| [1, 0, 3, 0, 5] |

In this example:

  • The cumulative sum starts at 1
  • It becomes 0 when adding -2 (1 + (-2) = -1, replaced with 0)
  • Then 3 is added to make 3
  • It becomes 0 again when adding -4 (3 + (-4) = -1, replaced with 0)
  • Finally 5 is added to make 5

This function is useful for calculations where negative intermediate results should be ignored, such as tracking a running total that cannot go below zero.

arrayProduct

Multiplies elements of an array.

Syntax:

arrayProduct(arr)

Arguments:

  • arr (Array): The array of numeric values to multiply.

Returns:

A product of the array’s elements as a Float64 value.

Example:

SELECT
  arrayProduct([2, 3, 4]) AS taco_combo_multiplier;

Result:

| taco_combo_multiplier |
|-----------------------|
| 24                    |

In this example, arrayProduct multiplies the values 2, 3, and 4, which could represent multipliers for different taco combo options, resulting in a total multiplier of 24.

arrayRotateLeft

Rotates an array to the left by the specified number of elements.

Syntax:

arrayRotateLeft(arr, n)

Arguments:

  • arr (Array): The input array to rotate.
  • n (integer): Number of positions to rotate left. If negative, rotates right.

Returns:

An array with elements rotated to the left by n positions.

Example:

SELECT
  arrayRotateLeft(['beef', 'chicken', 'pork', 'fish', 'veggie'], 2) AS rotated_tacos;

Result:

| rotated_tacos                                 |
|-----------------------------------------------|
| ['pork', 'fish', 'veggie', 'beef', 'chicken'] |

In this example, the array of taco fillings is rotated 2 positions to the left, so ‘pork’ becomes the first element and ‘beef’ and ‘chicken’ move to the end.

If n is negative, the array rotates to the right instead. For example, arrayRotateLeft(arr, -1) is equivalent to arrayRotateRight(arr, 1).

arrayRotateRight

Rotates an array to the right by the specified number of elements. If the number of elements is negative, the array is rotated to the left.

Syntax:

arrayRotateRight(arr, n)

Arguments:

  • arr (Array): The array to rotate.
  • n (integer): Number of elements to rotate.

Returns:

An array rotated to the right by the specified number of elements.

Example:

SELECT
  arrayRotateRight(['beef', 'chicken', 'pork', 'fish', 'veggie'], 2) AS rotated_tacos;

Result:

| rotated_tacos                                 |
|-----------------------------------------------|
| ['fish', 'veggie', 'beef', 'chicken', 'pork'] |

In this example, the array of taco fillings is rotated 2 positions to the right. ‘fish’ and ‘veggie’ move to the beginning of the array, while ‘beef’, ‘chicken’, and ‘pork’ shift to the end.

arrayShiftLeft

Shifts an array to the left by the specified number of elements. New elements are filled with the provided argument or the default value of the array element type. If the number of elements is negative, the array is shifted to the right.

Syntax:

arrayShiftLeft(arr, n[, default])

Arguments:

  • arr (Array): The input array to be shifted.
  • n (integer): Number of elements to shift.
  • default (optional): Default value for new elements. If not provided, uses the default value for the array element type.

Returns:

An array shifted to the left by the specified number of elements.

Example:

SELECT
  arrayShiftLeft(['beef', 'chicken', 'pork', 'fish', 'veggie'], 2, 'plain') AS shifted_tacos;

Result:

| shifted_tacos                                |
|----------------------------------------------|
| ['pork', 'fish', 'veggie', 'plain', 'plain'] |

In this example, the taco array is shifted 2 positions to the left. The first two elements are removed, and two new ‘plain’ elements are added at the end.

arrayShiftRight

Shifts an array to the right by the specified number of elements. New elements are filled with the provided argument or the default value of the array element type. If the number of elements is negative, the array is shifted to the left.

Arguments:

  • arr (Array): The input array to be shifted.
  • n (integer): Number of elements to shift.
  • default (optional): Default value for new elements. If not provided, uses the default value for the array element type.

Returns:

An array shifted to the right by the specified number of elements.

Syntax:

arrayShiftRight(arr, n[, default])

Example:

SELECT
  arrayShiftRight(['beef', 'chicken', 'pork', 'fish'], 2, 'veggie') AS shifted_tacos,
  arrayShiftRight(['mild', 'medium', 'hot'], -1) AS shifted_sauces

Result:

| shifted_tacos                        | shifted_sauces      |
|--------------------------------------|---------------------|
| ['veggie','veggie','beef','chicken'] | ['medium','hot',''] |

In this example:

  • For shifted_tacos, the array is shifted 2 positions to the right, with ‘veggie’ filling the new elements.
  • For shifted_sauces, the array is shifted 1 position to the left (due to negative n), with an empty string as the default value.

arrayRandomSample

Returns a subset with a specified number of random elements from the input array.

Syntax:

arrayRandomSample(arr, samples)

Arguments:

  • arr (Array): The input array to sample from.
  • samples (UInt*): The number of elements to include in the random sample.

Returns:

An array containing a random sample of elements from the input array.

Example:

SELECT
  arrayRandomSample(['beef', 'chicken', 'pork', 'fish', 'veggie'], 3) AS random_taco_fillings;

Result:

| random_taco_fillings      |
|---------------------------|
| ['pork','veggie','beef']  |

In this example, arrayRandomSample selects 3 random taco fillings from the input array. The order of elements in the result is random.

If the number of samples exceeds the size of the input array, the sample size is limited to the size of the array, i.e. all array elements are returned but their order is not guaranteed.