ClickHouse function reference

any

Selects the first encountered non-NULL value from a column.

Syntax:

any(column) [RESPECT NULLS]

Aliases:

  • any_value
  • first_value

Arguments:

  • column: The column name.

Modifiers:

  • RESPECT NULLS: Includes NULL values in the selection.

Returns:

The first non-NULL value encountered in the column. The return type is the same as the input type, except for LowCardinality which is discarded.

  • The RESPECT NULLS modifier can be used after the function name to include NULL values in the selection.
  • If no rows are input, it returns the default value for the column type (0 for integers, NULL for Nullable columns).
  • The order of execution is not guaranteed, so results may be indeterminate.

Example:

SELECT
    any(taco_type) AS first_taco
FROM (
    SELECT 1 AS order_id, NULL AS taco_type
    UNION ALL
    SELECT 2 AS order_id, 'Carnitas' AS taco_type
    UNION ALL
    SELECT 3 AS order_id, 'Al Pastor' AS taco_type
    UNION ALL
    SELECT 4 AS order_id, 'Pescado' AS taco_type
    UNION ALL
    SELECT 5 AS order_id, NULL AS taco_type
) AS taco_orders;

Result:

| first_taco |
|------------|
| Carnitas   |

In this example, any(taco_type) returns ‘Carnitas’, the first non-NULL value encountered in the taco_type column.

  • For deterministic results, consider using min or max functions instead of any.
  • In some cases, you can rely on the order of execution, such as when SELECT comes from a subquery that uses ORDER BY.
  • When a SELECT query has a GROUP BY clause or at least one aggregate function, all expressions in SELECT, HAVING, and ORDER BY clauses must be calculated from keys or aggregate functions.

anyHeavy

Selects a frequently occurring value using the heavy hitters algorithm. If there is a value that occurs more than in half the cases in each of the query’s execution threads, this value is returned. The result is typically nondeterministic.

Syntax:

anyHeavy(column)

Arguments:

  • column (any): The column name.

Returns:

A value from the specified column that occurs frequently.

Example:

SELECT
	anyHeavy(taco_type) AS popular_taco
FROM
	(
		SELECT 'Carnitas' AS taco_type
		UNION ALL
		SELECT 'Carnitas' AS taco_type
		UNION ALL
		SELECT 'Al Pastor' AS taco_type
		UNION ALL
		SELECT 'Carnitas' AS taco_type
		UNION ALL
		SELECT 'Pescado' AS taco_type
		UNION ALL
		SELECT 'Carnitas' AS taco_type
	) AS taco_orders;

Result:

| popular_taco |
|--------------|
| Carnitas     |

In this example, anyHeavy selects a frequently occurring taco type from the taco_orders table. The result shows that “Carnitas” is a popular taco choice, appearing in more than half of the orders in at least one execution thread.

The function is useful for identifying dominant values in a dataset, but it doesn’t guarantee returning the most frequent value overall. It’s particularly efficient for datasets with a clear majority value.

anyLast

Selects the last non-NULL value encountered in a column. If all values are NULL, it returns NULL. The result is indeterminate and may vary between query executions.

Syntax:

anyLast(column) [RESPECT NULLS]

Arguments:

  • column (any data type): The column to analyze.

Returns:

The last non-NULL value in the column, or NULL if all values are NULL.

Modifier:

  • RESPECT NULLS: When specified, the function will return the last value, even if it’s NULL.

Example:

SELECT
	anyLast(taco_filling) AS last_filling
FROM
	(
		SELECT 'Carnitas' AS taco_filling
		UNION ALL
		SELECT 'Al Pastor' AS taco_filling
		UNION ALL
		SELECT 'Pescado' AS taco_filling
		UNION ALL
		SELECT 'Pollo' AS taco_filling
		UNION ALL
		SELECT 'Carnitas' AS taco_filling
	) AS taco_orders;

Result:

| last_filling |
|--------------|
| Carnitas     |

In this example, anyLast returns ‘Carnitas’ as the last non-NULL taco filling from the orders.

The function’s behavior is non-deterministic. If you need consistent results, consider using ORDER BY and LIMIT 1 instead.

argMax

Calculates the arg value corresponding to the maximum val value. If multiple rows have the same maximum val, the returned arg is not deterministic.

Syntax:

argMax(arg, val)

Arguments:

  • arg (any type): The argument to return.
  • val (any type): The value to compare for finding the maximum.

Returns:

The arg value corresponding to the maximum val. The return type matches the type of arg.

  • Both arg and val behave as aggregate functions.
  • NULL values are skipped during processing.
  • If all values are NULL, the function returns NULL.

Example:

SELECT
	argMax(taco_type, price) AS most_expensive_taco
FROM
(
    SELECT 'Beef' AS taco_type, 8.99 AS price
    UNION ALL
    SELECT 'Chicken' AS taco_type, 6.99 AS price
    UNION ALL
    SELECT 'Lobster Taco' AS taco_type, 15.99 AS price
    UNION ALL
    SELECT 'Vegetarian' AS taco_type, 5.99 AS price
) AS taco_menu;

Result:

| most_expensive_taco |
|---------------------|
| Lobster Taco        |

In this example, argMax returns the taco_type corresponding to the highest price in the taco_menu.

argMin

Calculates the arg value corresponding to the minimum val value. If multiple rows have the same minimum val, the returned arg is not deterministic.

Syntax:

argMin(arg, val)

Arguments:

  • arg (any type): The argument to return.
  • val (any type): The value to compare for finding the minimum.

Returns:

The arg value that corresponds to the minimum val. The return type matches the type of arg.

  • Both arg and val behave as aggregate functions.
  • NULL values are skipped during processing.
  • If all values are NULL, the function returns NULL.

Example:

SELECT
	argMin(taco_type, price) AS cheapest_taco
FROM
(
    SELECT 'Beef' AS taco_type, 8.99 AS price
    UNION ALL
    SELECT 'Chicken' AS taco_type, 6.99 AS price
    UNION ALL
    SELECT 'Lobster Taco' AS taco_type, 15.99 AS price
    UNION ALL
    SELECT 'Vegetarian' AS taco_type, 5.99 AS price
) AS taco_menu;

Result:

| cheapest_taco |
|---------------|
| Vegetarian    |

In this example, argMin returns the taco_type corresponding to the lowest price in the taco_menu.

avg

Calculates the arithmetic mean of a set of values.

Syntax:

avg(x)

Arguments:

  • x (numeric): The values to calculate the average of.

Returns:

The arithmetic mean as a Float64.

  • Returns NaN if the input is empty.

Example:

SELECT
	avg(taco_price) AS average_price
FROM
	(
		SELECT 6.50 AS taco_price
		UNION ALL
		SELECT 7.00 AS taco_price
		UNION ALL
		SELECT 8.25 AS taco_price
	) AS taco_menu;

Result:

| average_price |
|---------------|
| 7.25          |

In this example, we calculate the average price of tacos from the taco_menu table.

Example with empty input:

-- Example with empty input
SELECT
	avg(sales_amount) AS average_sales
FROM
	(
		SELECT sales_amount
		FROM (
			SELECT 1 AS sales_amount
			WHERE 0 = 1
		)
	) AS empty_taco_sales;

Result:

| avg(sales_amount) |
|-------------------|
| nan               |

This example demonstrates that avg() returns NaN when given an empty input.

median

  • median — Alias for quantile.
  • medianDeterministic — Alias for quantileDeterministic.
  • medianExact — Alias for quantileExact.
  • medianExactWeighted — Alias for quantileExactWeighted.
  • medianTiming — Alias for quantileTiming.
  • medianTimingWeighted — Alias for quantileTimingWeighted.
  • medianTDigest — Alias for quantileTDigest.
  • medianTDigestWeighted — Alias for quantileTDigestWeighted.
  • medianBFloat16 — Alias for quantileBFloat16.
  • medianDD — Alias for quantileDD.

count

Counts the number of rows or non-NULL values.

Syntax:

count(expr)
count(DISTINCT expr)
count()
count(*)

Arguments:

  • expr (optional): An expression of any type.

Returns:

The number of rows or non-NULL values. (UInt64)

Details:

  1. count() or count(*): Counts the total number of rows.
  2. count(expr): Counts the number of rows where expr is not NULL.
  3. count(DISTINCT expr): Counts the number of distinct non-NULL values of expr.

The implementation of COUNT(DISTINCT ...) uses uniqExact.

SELECT count() FROM table is optimized using MergeTree metadata by default.

Examples:

Count total rows:

SELECT
	count() AS total_orders
FROM
	(
		SELECT 'Carnitas' AS taco_type
		UNION ALL
		SELECT 'Al Pastor' AS taco_type
		UNION ALL
		SELECT 'Pescado' AS taco_type
		UNION ALL
		SELECT 'Pollo' AS taco_type
		UNION ALL
		SELECT 'Carnitas' AS taco_type
	) AS taco_orders;

Result:

| count() |
|---------|
| 5       |

Count distinct taco types:

SELECT
	count(DISTINCT taco_type) AS distinct_taco_types
FROM
	(
		SELECT 'Carnitas' AS taco_type
		UNION ALL
		SELECT 'Al Pastor' AS taco_type
		UNION ALL
		SELECT 'Pescado' AS taco_type
		UNION ALL
		SELECT 'Pollo' AS taco_type
		UNION ALL
		SELECT 'Carnitas' AS taco_type
	) AS taco_orders;

Result:

| uniqExact(taco_type) |
|----------------------|
| 4                    |

For improved performance on COUNT(DISTINCT expr) queries, consider adding a GROUP BY clause or using a projection to create an index on the target column.

first_value

Returns the first value in a group. It is an alias for any but was introduced for compatibility with window functions.

Syntax:

first_value(expr)
first_value(expr) [RESPECT | IGNORE] NULLS

Arguments:

  • expr: The expression to evaluate.

Modifiers:

  • RESPECT NULLS: Includes NULL values in the selection.
  • IGNORE NULLS: Excludes NULL values from the selection.

Returns:

The first value in the group, matching the type of expr.

  • By default, first_value ignores NULL values.
  • Without window functions, the result may be random if the source stream is not ordered.

Example:

-- Create a sample table
-- Self-contained example using first_value function
SELECT
    first_value(taco_type) AS first_taco,
    first_value(taco_type) RESPECT NULLS AS first_taco_with_null,
    first_value(taco_type) IGNORE NULLS AS first_taco_ignore_null
FROM (
     SELECT 1 AS order_id, NULL AS taco_type, 1 AS quantity
    UNION ALL
    SELECT 2 AS order_id, 'Carnitas' AS taco_type, 2 AS quantity
    UNION ALL
    SELECT 3 AS order_id, 'Al Pastor' AS taco_type, 3 AS quantity
    UNION ALL
    SELECT 4 AS order_id, 'Pescado' AS taco_type, 2 AS quantity
    UNION ALL
    SELECT 5 AS order_id, NULL AS taco_type, 1 AS quantity
) AS taco_orders;

Result:

| first_taco | first_taco_with_null | first_taco_ignore_null |
|------------|----------------------|------------------------|
| Carnitas   | NULL                 | Carnitas               |

In this example:

  • first_taco and first_taco_ignore_null both return ‘Carnitas’, ignoring NULL values.
  • first_taco_with_null returns NULL, as it’s the first value when considering NULL values.

last_value

Selects the last encountered value in a set of values. It can handle NULL values and is particularly useful with Window Functions.

Syntax:

last_value(expr)
last_value(expr) [RESPECT | IGNORE] NULLS

Arguments:

  • expr: The expression to evaluate.

Modifiers:

  • RESPECT NULLS: Includes NULL values in the selection.
  • IGNORE NULLS: Excludes NULL values from the selection.

Returns:

The last value of the expression, based on the order of rows processed.

  • Without Window Functions, the result may be unpredictable if the source stream is not ordered.
  • By default, NULL values are ignored.

Example:

-- Self-contained example using last_value function
SELECT
    last_value(taco_type) AS last_taco_default,
    last_value(taco_type) IGNORE NULLS AS last_taco_ignore_null,
    last_value(taco_type) RESPECT NULLS AS last_taco_respect_null
FROM (
    SELECT 1 AS order_id, 'Carne Asada' AS taco_type
    UNION ALL
    SELECT 2 AS order_id, NULL AS taco_type
    UNION ALL
    SELECT 3 AS order_id, 'Pollo' AS taco_type
    UNION ALL
    SELECT 4 AS order_id, 'Al Pastor' AS taco_type
    UNION ALL
    SELECT 5 AS order_id, NULL AS taco_type
) AS taco_orders
ORDER BY order_id;

Result:

| last_taco_default  | last_taco_ignore_null | last_taco_respect_null |
|--------------------|-----------------------|------------------------|
| Al Pastor          | Al Pastor             | NULL                   |

In this example:

  • last_taco_default and last_taco_ignore_null both return ‘Al Pastor’, ignoring NULL values.
  • last_taco_respect_null returns NULL, as it’s the last value when considering NULL values.

When used without Window Functions, it’s important to use an ordered subquery to ensure consistent results.

max

Calculates the maximum value across a group of values.

Syntax:

max(expr)

Arguments:

  • expr: The expression to calculate the maximum value from. Can be of any data type that supports comparison operations.

Returns:

The maximum value of the same type as the input expression.

Example:

SELECT
    taco_type,
    max(price) AS max_price
FROM
    (
        SELECT 'Beef' AS taco_type, 7.99 AS price
        UNION ALL
        SELECT 'Beef' AS taco_type, 8.99 AS price
        UNION ALL
        SELECT 'Chicken' AS taco_type, 6.99 AS price
        UNION ALL
        SELECT 'Chicken' AS taco_type, 7.99 AS price
        UNION ALL
        SELECT 'Vegetarian' AS taco_type, 5.99 AS price
        UNION ALL
        SELECT 'Vegetarian' AS taco_type, 6.99 AS price
    ) AS taco_menu
GROUP BY
    taco_type;

Result:

| taco_type  | max_price |
|------------|-----------|
| Beef       | 8.99      |
| Chicken    | 7.99      |
| Vegetarian | 6.99      |

In this example, max(price) calculates the highest price for each type of taco on the menu.

If you need a non-aggregate function to choose the maximum of two values, use the greatest function instead:

min

Calculates the minimum value across a group of values.

Syntax:

min(expr)

Arguments:

  • expr: The expression to calculate the minimum for. Can be of any data type that supports comparison operations.

Returns:

The minimum value of the same type as the input expression.

Example:

SELECT
    taco_type,
    min(price) AS cheapest_price
FROM
    (
        SELECT 'Beef' AS taco_type, 7.99 AS price
        UNION ALL
        SELECT 'Beef' AS taco_type, 8.99 AS price
        UNION ALL
        SELECT 'Chicken' AS taco_type, 6.99 AS price
        UNION ALL
        SELECT 'Chicken' AS taco_type, 7.99 AS price
        UNION ALL
        SELECT 'Vegetarian' AS taco_type, 5.99 AS price
        UNION ALL
        SELECT 'Vegetarian' AS taco_type, 6.99 AS price
    ) AS taco_menu
GROUP BY
    taco_type;

Result:

| taco_type  | cheapest_price |
|------------|----------------|
| Chicken    | 6.99           |
| Beef       | 7.99           |
| Vegetarian | 5.99           |

In this example, min calculates the lowest price for each type of taco on the menu.

If you need a non-aggregate function to choose the minimum of two values, use the least function instead:

singleValueOrNull

Returns the single unique non-NULL value in a column, or NULL if there are zero or multiple distinct values.

Syntax:

singleValueOrNull(x)

Arguments:

  • x (Any, except Map, Array, or Tuple): A column of any data type, except Map, Array, or Tuple. The column can be Nullable.

Returns:

  • The unique non-NULL value if there’s only one in the column.
  • NULL if there are zero or multiple distinct values.

This function is often used to implement subquery operators, such as x = ALL (SELECT ...).

Example:

SELECT
	singleValueOrNull(taco_filling) AS unique_filling
FROM
(
    SELECT 'Carnitas' AS taco_filling
    UNION ALL
    SELECT 'Carnitas' AS taco_filling
    UNION ALL
    SELECT 'Carnitas' AS taco_filling
) AS taco_orders;

Result:

| unique_filling |
|----------------|
| Carnitas       |

In this example, if all non-NULL taco_filling values are “Carnitas”, the function returns “Carnitas”. If there are multiple different fillings or no fillings at all, it returns NULL.

This function ignores NULL values in the input column. Only non-NULL values are considered when determining uniqueness.

sum

Calculates the sum of numeric values.

Syntax:

sum(x)

Arguments:

  • x (numeric): Column with numeric data type values. Supported data types: Int, UInt, Float, Decimal.

Returns:

The sum of the values.

  • Type: Matches the input data type.

Example:

SELECT
    sum(taco_price) AS total_revenue
FROM
    (
        SELECT 7.99 AS taco_price
        UNION ALL
        SELECT 8.99 AS taco_price
        UNION ALL
        SELECT 6.99 AS taco_price
        UNION ALL
        SELECT 9.99 AS taco_price
        UNION ALL
        SELECT 7.99 AS taco_price
    ) AS taco_sales;

Result:

| total_revenue |
|---------------|
| 41.95         |

This example calculates the total revenue from taco sales by summing up all the taco_price values.

The sum function only works with numeric types. For other data types, consider using appropriate type conversions or alternative aggregation functions.