Bit functions

Bit functions in ClickHouse operate on various data types, including:

Unsigned integers: UInt8, UInt16, UInt32, UInt64
Signed integers: Int8, Int16, Int32, Int64
Floating-point numbers: Float32, Float64

Additionally, some bit functions extend their support to String and FixedString data types.

Result type and behavior

When you use a bit function, the result is always an integer. The bit length of this result is determined by the maximum bit length of the input arguments. This ensures that the output can accommodate the full range of possible values from the operation.

For example:

If you perform a bitwise operation on two UInt8 values, the result will be a UInt8.
If you operate on a UInt8 and a UInt32, the result will be a UInt32.

It’s important to note that the signedness of the result depends on the input arguments:

If any of the input arguments is a signed integer, the result will also be signed.
If all inputs are unsigned, the result will be unsigned.

Handling of floating-point numbers

When bit functions encounter floating-point arguments (Float32 or Float64), these are automatically converted to Int64 before the operation is performed. This conversion ensures that bitwise operations can be carried out consistently, regardless of the input number type.

ClickHouse function reference

bitAnd

Performs a bitwise AND operation on two numeric values.

Syntax:

bitAnd(a, b)

Arguments:

a (numeric): First numeric value.
b (numeric): Second numeric value.

Returns:

The result of the bitwise AND operation. [Integer]

Example:

SELECT
  taco_id,
  taco_flags,
  bitAnd(taco_flags, 3) AS has_cheese_and_lettuce
FROM
  taco_menu;

Result:

| taco_id | taco_flags | has_cheese_and_lettuce |
|---------|------------|------------------------|
| 1       | 5          | 1                      |
| 2       | 3          | 3                      |
| 3       | 6          | 2                      |

In this example, bitAnd is used to check if a taco has both cheese and lettuce. Assuming the flags are:

1 (0001): Has cheese
2 (0010): Has lettuce
4 (0100): Has salsa

The result 3 means both cheese and lettuce are present, 1 means only cheese, 2 means only lettuce, and 0 means neither.

bitOr

Performs a bitwise OR operation on two numeric values.

Syntax:

bitOr(a, b)

Arguments:

a (numeric): First numeric value.
b (numeric): Second numeric value.

Returns:

The result of the bitwise OR operation. [Integer]

Example:

SELECT
  taco_id,
  taco_flags1,
  taco_flags2,
  bitOr(taco_flags1, taco_flags2) AS combined_flags
FROM
  taco_menu;

Result:

| taco_id | taco_flags1 | taco_flags2 | combined_flags |
|---------|-------------|-------------|----------------|
| 1       | 1           | 2           | 3              |
| 2       | 4           | 1           | 5              |
| 3       | 2           | 4           | 6              |

In this example, bitOr combines two sets of taco flags. Assuming the flags are:

1 (0001): Has cheese
2 (0010): Has lettuce
4 (0100): Has salsa

The combined_flags column shows all toppings present on the taco from both flag sets.

bitXor

Performs a bitwise XOR (exclusive OR) operation on two numeric values.

Syntax:

bitXor(a, b)

Arguments:

a (numeric): First numeric value.
b (numeric): Second numeric value.

Returns:

The result of the bitwise XOR operation. [Integer]

Example:

SELECT
  taco_id,
  old_toppings,
  new_toppings,
  bitXor(old_toppings, new_toppings) AS changed_toppings
FROM
  taco_menu_updates;

Result:

| taco_id | old_toppings | new_toppings | changed_toppings  |
|---------|--------------|--------------|-------------------|
| 1       | 3            | 5            | 6                 |
| 2       | 2            | 3            | 1                 |
| 3       | 7            | 7            | 0                 |

In this example, bitXor is used to identify which toppings have changed. Assuming the flags are:

1 (001): Has cheese
2 (010): Has lettuce
4 (100): Has salsa

The changed_toppings column shows:

6 (110): Lettuce and salsa changed (added or removed)
1 (001): Cheese changed (added or removed)
0 (000): No changes in toppings

bitNot

Performs a bitwise NOT operation on a numeric value.

Syntax:

bitNot(a)

Arguments:

a (numeric): The numeric value to apply the bitwise NOT operation to.

Returns:

The result of the bitwise NOT operation. [Integer]

Example:

SELECT
  taco_id,
  taco_flags,
  bitNot(taco_flags) AS inverted_flags
FROM
  taco_menu;

Result:

| taco_id | taco_flags | inverted_flags |
|---------|------------|----------------|
| 1       | 3          | -4             |
| 2       | 5          | -6             |
| 3       | 0          | -1             |

In this example, bitNot inverts all the bits in taco_flags. Assuming we’re using 8-bit integers:

3 (00000011) becomes -4 (11111100)
5 (00000101) becomes -6 (11111010)
0 (00000000) becomes -1 (11111111)

This operation could be used to find missing toppings or to toggle all flags at once.

bitShiftLeft

Shifts the binary representation of a value to the left by a specified number of bit positions.

Syntax:

bitShiftLeft(a, b)

Arguments:

a (numeric): The value to shift.
b (numeric): The number of positions to shift.

Returns:

The shifted value. [Integer]

Example:

SELECT
  taco_id,
  base_price,
  bitShiftLeft(base_price, 1) AS doubled_price,
  bitShiftLeft(base_price, 2) AS quadrupled_price
FROM
  taco_menu;

Result:

| taco_id | base_price | doubled_price | quadrupled_price  |
|---------|------------|---------------|-------------------|
| 1       | 5          | 10            | 20                |
| 2       | 3          | 6             | 12                |
| 3       | 4          | 8             | 16                |

In this example, bitShiftLeft is used to quickly calculate doubled and quadrupled prices:

Shifting left by 1 position multiplies the value by 2
Shifting left by 2 positions multiplies the value by 4

This operation can be useful for quick price calculations or for working with binary flags.

bitShiftRight

Shifts the binary representation of a value to the right by a specified number of bit positions.

Syntax:

bitShiftRight(a, b)

Arguments:

a (numeric): The value to shift.
b (numeric): The number of positions to shift.

Returns:

The shifted value. [Integer]

Example:

SELECT
  taco_id,
  total_price,
  bitShiftRight(total_price, 1) AS half_price,
  bitShiftRight(total_price, 2) AS quarter_price
FROM
  taco_orders;

Result:

| taco_id | total_price | half_price | quarter_price |
|---------|-------------|------------|---------------|
| 1       | 20          | 10         | 5             |
| 2       | 15          | 7          | 3             |
| 3       | 25          | 12         | 6             |

In this example, bitShiftRight is used to quickly calculate discounted prices:

Shifting right by 1 position divides the value by 2 (rounding down)
Shifting right by 2 positions divides the value by 4 (rounding down)

This operation can be useful for quick price calculations or for working with binary flags. Note that this operation always rounds down, which may not be suitable for all pricing scenarios.

bitRotateLeft

Rotates the bits of a numeric value to the left by a specified number of positions.

Syntax:

bitRotateLeft(a, b)

Arguments:

a (numeric): The value to rotate.
b (numeric): The number of positions to rotate.

Returns:

The rotated value. [Integer]

Example:

SELECT
  taco_id,
  taco_flags,
  bitRotateLeft(taco_flags, 1) AS rotated_flags
FROM
  taco_menu;

Result:

| taco_id | taco_flags | rotated_flags  |
|---------|------------|----------------|
| 1       | 5          | 10             |
| 2       | 3          | 6              |
| 3       | 7          | 14             |

In this example, bitRotateLeft is used to rotate the taco flags. Assuming we’re using 4-bit integers:

5 (0101) becomes 10 (1010)
3 (0011) becomes 6 (0110)
7 (0111) becomes 14 (1110)

This operation can be useful for cycling through different combinations of flags or for certain types of encryption algorithms.

bitRotateRight

Rotates the bits of a numeric value to the right by a specified number of positions.

Syntax:

bitRotateRight(a, b)

Arguments:

a (numeric): The value to rotate.
b (numeric): The number of positions to rotate.

Returns:

The rotated value. [Integer]

Example:

SELECT
  taco_id,
  taco_flags,
  bitRotateRight(taco_flags, 1) AS rotated_flags
FROM
  taco_menu;

Result:

| taco_id | taco_flags | rotated_flags  |
|---------|------------|----------------|
| 1       | 5          | 10             |
| 2       | 3          | 9              |
| 3       | 7          | 11             |

In this example, bitRotateRight is used to rotate the taco flags. Assuming we’re using 4-bit integers:

5 (0101) becomes 10 (1010)
3 (0011) becomes 9 (1001)
7 (0111) becomes 11 (1011)

This operation can be useful for cycling through different combinations of flags or for certain types of encryption algorithms. It’s the opposite of bitRotateLeft.

bitSlice

Returns a substring of bits from a given string, starting at a specified offset and with a specified length.

Syntax:

bitSlice(s, offset, length)

Arguments:

s (String or FixedString): The input string.
offset (Integer): The starting bit position (1-based index).
length (Integer): The number of bits to extract.

Returns:

A string containing the extracted bits. [String]

Example:

SELECT
  taco_id,
  taco_binary_data,
  bitSlice(taco_binary_data, 9, 8) AS extracted_flavor
FROM
  taco_recipes;

Result:

| taco_id | taco_binary_data | extracted_flavor  |
|---------|-------------------|------------------|
| 1       | \x48656C6C6F     | \x65              |
| 2       | \x54616368       | \x61              |
| 3       | \x576F726C64     | \x6F              |

In this example, bitSlice extracts 8 bits (1 byte) starting from the 9th bit of the taco_binary_data. This could represent extracting a specific flavor code from a binary recipe representation.

The result is padded with zeros on the right if the requested length exceeds the available bits. If the offset is negative, it’s counted from the end of the string.

byteSlice

Returns a substring of bytes from a given string, starting at a specified offset and with a specified length.

Syntax:

byteSlice(s, offset, length)

Arguments:

s (String or FixedString): The input string.
offset (Integer): The starting byte position (1-based index).
length (Integer): The number of bytes to extract.

Returns:

A string containing the extracted bytes. [String]

Example:

SELECT
  taco_id,
  taco_recipe,
  byteSlice(taco_recipe, 2, 3) AS extracted_ingredients
FROM
  taco_recipes;

Result:

| taco_id | taco_recipe       | extracted_ingredients |
|---------|-------------------|-----------------------|
| 1       | BeefLettuceTomato | eef                   |
| 2       | ChickenSalsaGuac  | hic                   |
| 3       | FishCabbageLime   | ish                   |

In this example, byteSlice extracts 3 bytes starting from the 2nd byte of the taco_recipe. This could represent extracting specific ingredients or codes from a recipe string.

This function is similar to the substring function but operates on a byte level rather than a character level, which can be useful for binary data or when working with multi-byte character encodings.

bitTest

Tests whether a specific bit is set in a given numeric value.

Syntax:

bitTest(x, n)

Arguments:

x (numeric): The value to test.
n (numeric): The bit position to test (0-based index, counting from the least significant bit).

Returns:

1 if the bit is set, 0 otherwise. [UInt8]

Example:

SELECT
  taco_id,
  taco_flags,
  bitTest(taco_flags, 0) AS has_cheese,
  bitTest(taco_flags, 1) AS has_lettuce,
  bitTest(taco_flags, 2) AS has_salsa
FROM
  taco_menu;

Result:

| taco_id | taco_flags | has_cheese | has_lettuce | has_salsa |
|---------|------------|------------|-------------|-----------|
| 1       | 5          | 1          | 0           | 1         |
| 2       | 3          | 1          | 1           | 0         |
| 3       | 4          | 0          | 0           | 1         |

In this example, bitTest is used to check individual toppings on each taco:

Bit 0 (value 1) represents cheese
Bit 1 (value 2) represents lettuce
Bit 2 (value 4) represents salsa

The function returns 1 if the topping is present and 0 if it’s not.

bitTestAll

Performs a logical AND operation on all bits at specified positions in a number.

Syntax:

bitTestAll(number, index1, index2, ...)

Arguments:

number (Integer): The number to test.
index1, index2, … (Integer): Bit positions to test, starting from 0 (rightmost bit).

Returns:

1 if all specified bits are set to 1, 0 otherwise. [UInt8]

Example:

SELECT
  taco_order_id,
  taco_flags,
  bitTestAll(taco_flags, 0, 2, 4) AS has_all_toppings
FROM
  taco_orders;

Result:

| taco_order_id | taco_flags | has_all_toppings  |
|---------------|------------|-------------------|
| 1             | 21         | 1                 |
| 2             | 17         | 0                 |
| 3             | 31         | 1                 |

In this example, bitTestAll checks if bits 0, 2, and 4 are all set in taco_flags. The binary representation of the flags is:

21: 10101 (has all specified toppings)
17: 10001 (missing topping at position 2)
31: 11111 (has all toppings)

The function returns 1 for orders 1 and 3, where all specified bits are set, and 0 for order 2, where one bit is not set.

bitTestAny

Returns the result of a logical OR operation on all bits at the specified positions in a number.

Syntax:

bitTestAny(number, index1, index2, ...)

Arguments:

number (Integer): The number to test.
index1, index2, … (Integer): The bit positions to test. Counting starts from 0, right to left.

Returns:

1 if any of the specified bits are set to 1, 0 otherwise. [UInt8]

Example:

SELECT
  taco_order_id,
  taco_flags,
  bitTestAny(taco_flags, 0, 2) AS has_cheese_or_salsa
FROM
  taco_orders;

Result:

| taco_order_id | taco_flags | has_cheese_or_salsa |
|---------------|------------|---------------------|
| 1             | 5          | 1                   |
| 2             | 2          | 1                   |
| 3             | 8          | 0                   |

In this example, bitTestAny checks if either bit 0 (cheese) or bit 2 (salsa) is set in the taco_flags field. The binary representation of the flags is:

5 (101 in binary): Has cheese (bit 0) and salsa (bit 2)
2 (010 in binary): Has lettuce (bit 1)
8 (1000 in binary): Has guacamole (bit 3)

The function returns 1 for orders 1 and 2 because they have either cheese or salsa (or both), and 0 for order 3 which has neither.

bitCount

Calculates the number of bits set to one in the binary representation of a number.

Syntax:

bitCount(x)

Arguments:

x (Integer or Float): The number to count bits for. The function uses the value’s binary representation in memory.

Returns:

The number of bits set to 1 in the input number. [UInt8]

Example:

SELECT
  taco_order_id,
  bitCount(taco_order_id) AS bits_set
FROM
  taco_orders
LIMIT 3;

Result:

| taco_order_id | bits_set |
|---------------|----------|
| 42            | 3        |
| 255           | 8        |
| 1024          | 1        |

In this example, bitCount calculates the number of bits set to 1 in each taco_order_id:

42 in binary is 101010, which has 3 bits set to 1
255 in binary is 11111111, which has all 8 bits set to 1
1024 in binary is 10000000000, which has only 1 bit set to 1

The function does not perform sign extension, so bitCount(toUInt8(-1)) would return 8, as all bits in a UInt8 are set to 1 for the value -1.

bitHammingDistance

Calculates the Hamming distance between the bit representations of two 64-bit integer values.

Syntax:

bitHammingDistance(a, b)

Arguments:

a (Int64): First 64-bit integer value.
b (Int64): Second 64-bit integer value.

Returns:

The Hamming distance between a and b. [UInt8]

Example:

SELECT
  taco_id1,
  taco_id2,
  bitHammingDistance(taco_id1, taco_id2) AS hamming_distance
FROM
  taco_comparisons;

Result:

| taco_id1 | taco_id2 | hamming_distance |
|----------|----------|------------------|
| 111      | 121      | 3                |
| 333      | 555      | 4                |
| 777      | 888      | 2                |

In this example, bitHammingDistance calculates the number of bit positions at which the binary representations of taco_id1 and taco_id2 differ. This can be useful for comparing similarity between taco identifiers or other numeric values.

The Hamming distance can be used with SimHash functions for detecting semi-duplicate strings. A smaller distance suggests a higher likelihood of similarity between the compared values.

Introduction

Statements

Result type and behavior

Handling of floating-point numbers

ClickHouse function reference

bitAnd

bitOr

bitXor

bitNot

bitShiftLeft

bitShiftRight

bitRotateLeft

bitRotateRight

bitSlice

byteSlice

bitTest

bitTestAll

bitTestAny

bitCount

bitHammingDistance

Introduction

Statements

​Result type and behavior

​Handling of floating-point numbers

​ClickHouse function reference

​bitAnd

​bitOr

​bitXor

​bitNot

​bitShiftLeft

​bitShiftRight

​bitRotateLeft

​bitRotateRight

​bitSlice

​byteSlice

​bitTest

​bitTestAll

​bitTestAny

​bitCount

​bitHammingDistance

Result type and behavior

Handling of floating-point numbers

ClickHouse function reference

bitAnd

bitOr

bitXor

bitNot

bitShiftLeft

bitShiftRight

bitRotateLeft

bitRotateRight

bitSlice

byteSlice

bitTest

bitTestAll

bitTestAny

bitCount

bitHammingDistance