# DataFusion Functions

> Generated from the Apache DataFusion project's documents, this page lists and describes DataFusion functions, including scalar, window, and array functions.

# DataFusion Functions

This page is generated from the [Apache DataFusion](https://datafusion.apache.org/user-guide/sql/) project's documents:
  * [DataFusion Scalar Functions](#scalar-functions)
  * [DataFusion Aggregate Functions](#aggregate-functions)
  * [DataFusion Window Functions](#window-functions)
  * [DataFusion Special Functions](#special-functions)

<!---
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing,
  software distributed under the License is distributed on an
  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  KIND, either express or implied.  See the License for the
  specific language governing permissions and limitations
  under the License.
-->

<!---
This file was generated by the dev/update_function_docs.sh script.
Do not edit it manually as changes will be overwritten.
Instead, edit the ScalarUDFImpl's documentation() function to
update documentation for an individual UDF or the
dev/update_function_docs.sh file for updating surrounding text.
-->

## Scalar Functions

### Math Functions

- [abs](#abs)
- [acos](#acos)
- [acosh](#acosh)
- [asin](#asin)
- [asinh](#asinh)
- [atan](#atan)
- [atan2](#atan2)
- [atanh](#atanh)
- [cbrt](#cbrt)
- [ceil](#ceil)
- [cos](#cos)
- [cosh](#cosh)
- [cot](#cot)
- [degrees](#degrees)
- [exp](#exp)
- [factorial](#factorial)
- [floor](#floor)
- [gcd](#gcd)
- [isnan](#isnan)
- [iszero](#iszero)
- [lcm](#lcm)
- [ln](#ln)
- [log](#log)
- [log10](#log10)
- [log2](#log2)
- [nanvl](#nanvl)
- [pi](#pi)
- [pow](#pow)
- [power](#power)
- [radians](#radians)
- [random](#random)
- [round](#round)
- [signum](#signum)
- [sin](#sin)
- [sinh](#sinh)
- [sqrt](#sqrt)
- [tan](#tan)
- [tanh](#tanh)
- [trunc](#trunc)

##### `abs`

Returns the absolute value of a number.

```sql
abs(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT abs(-5);
+----------+
| abs(-5)  |
+----------+
| 5        |
+----------+
```

##### `acos`

Returns the arc cosine or inverse cosine of a number.

```sql
acos(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT acos(1);
+----------+
| acos(1)  |
+----------+
| 0.0      |
+----------+
```

##### `acosh`

Returns the area hyperbolic cosine or inverse hyperbolic cosine of a number.

```sql
acosh(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT acosh(2);
+------------+
| acosh(2)   |
+------------+
| 1.31696    |
+------------+
```

##### `asin`

Returns the arc sine or inverse sine of a number.

```sql
asin(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT asin(0.5);
+------------+
| asin(0.5)  |
+------------+
| 0.5235988  |
+------------+
```

##### `asinh`

Returns the area hyperbolic sine or inverse hyperbolic sine of a number.

```sql
asinh(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT asinh(1);
+------------+
| asinh(1)   |
+------------+
| 0.8813736  |
+------------+
```

##### `atan`

Returns the arc tangent or inverse tangent of a number.

```sql
atan(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
    > SELECT atan(1);
+-----------+
| atan(1)   |
+-----------+
| 0.7853982 |
+-----------+
```

##### `atan2`

Returns the arc tangent or inverse tangent of `expression_y / expression_x`.

```sql
atan2(expression_y, expression_x)
```

###### Arguments

- **expression_y**: First numeric expression to operate on.
  Can be a constant, column, or function, and any combination of arithmetic operators.
- **expression_x**: Second numeric expression to operate on.
  Can be a constant, column, or function, and any combination of arithmetic operators.

###### Example

```sql
> SELECT atan2(1, 1);
+------------+
| atan2(1,1) |
+------------+
| 0.7853982  |
+------------+
```

##### `atanh`

Returns the area hyperbolic tangent or inverse hyperbolic tangent of a number.

```sql
atanh(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
    > SELECT atanh(0.5);
+-------------+
| atanh(0.5)  |
+-------------+
| 0.5493061   |
+-------------+
```

##### `cbrt`

Returns the cube root of a number.

```sql
cbrt(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT cbrt(27);
+-----------+
| cbrt(27)  |
+-----------+
| 3.0       |
+-----------+
```

##### `ceil`

Returns the nearest integer greater than or equal to a number.

```sql
ceil(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT ceil(3.14);
+------------+
| ceil(3.14) |
+------------+
| 4.0        |
+------------+
```

##### `cos`

Returns the cosine of a number.

```sql
cos(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT cos(0);
+--------+
| cos(0) |
+--------+
| 1.0    |
+--------+
```

##### `cosh`

Returns the hyperbolic cosine of a number.

```sql
cosh(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT cosh(1);
+-----------+
| cosh(1)   |
+-----------+
| 1.5430806 |
+-----------+
```

##### `cot`

Returns the cotangent of a number.

```sql
cot(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT cot(1);
+---------+
| cot(1)  |
+---------+
| 0.64209 |
+---------+
```

##### `degrees`

Converts radians to degrees.

```sql
degrees(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
    > SELECT degrees(pi());
+------------+
| degrees(0) |
+------------+
| 180.0      |
+------------+
```

##### `exp`

Returns the base-e exponential of a number.

```sql
exp(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT exp(1);
+---------+
| exp(1)  |
+---------+
| 2.71828 |
+---------+
```

##### `factorial`

Factorial. Returns 1 if value is less than 2.

```sql
factorial(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT factorial(5);
+---------------+
| factorial(5)  |
+---------------+
| 120           |
+---------------+
```

##### `floor`

Returns the nearest integer less than or equal to a number.

```sql
floor(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT floor(3.14);
+-------------+
| floor(3.14) |
+-------------+
| 3.0         |
+-------------+
```

##### `gcd`

Returns the greatest common divisor of `expression_x` and `expression_y`. Returns 0 if both inputs are zero.

```sql
gcd(expression_x, expression_y)
```

###### Arguments

- **expression_x**: First numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_y**: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT gcd(48, 18);
+------------+
| gcd(48,18) |
+------------+
| 6          |
+------------+
```

##### `isnan`

Returns true if a given number is +NaN or -NaN otherwise returns false.

```sql
isnan(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT isnan(1);
+----------+
| isnan(1) |
+----------+
| false    |
+----------+
```

##### `iszero`

Returns true if a given number is +0.0 or -0.0 otherwise returns false.

```sql
iszero(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT iszero(0);
+------------+
| iszero(0)  |
+------------+
| true       |
+------------+
```

##### `lcm`

Returns the least common multiple of `expression_x` and `expression_y`. Returns 0 if either input is zero.

```sql
lcm(expression_x, expression_y)
```

###### Arguments

- **expression_x**: First numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_y**: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT lcm(4, 5);
+----------+
| lcm(4,5) |
+----------+
| 20       |
+----------+
```

##### `ln`

Returns the natural logarithm of a number.

```sql
ln(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT ln(2.71828);
+-------------+
| ln(2.71828) |
+-------------+
| 1.0         |
+-------------+
```

##### `log`

Returns the base-x logarithm of a number. Can either provide a specified base, or if omitted then takes the base-10 of a number.

```sql
log(base, numeric_expression)
log(numeric_expression)
```

###### Arguments

- **base**: Base numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT log(10);
+---------+
| log(10) |
+---------+
| 1.0     |
+---------+
```

##### `log10`

Returns the base-10 logarithm of a number.

```sql
log10(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT log10(100);
+-------------+
| log10(100)  |
+-------------+
| 2.0         |
+-------------+
```

##### `log2`

Returns the base-2 logarithm of a number.

```sql
log2(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT log2(8);
+-----------+
| log2(8)   |
+-----------+
| 3.0       |
+-----------+
```

##### `nanvl`

Returns the first argument if it's not _NaN_.
Returns the second argument otherwise.

```sql
nanvl(expression_x, expression_y)
```

###### Arguments

- **expression_x**: Numeric expression to return if it's not _NaN_. Can be a constant, column, or function, and any combination of arithmetic operators.
- **expression_y**: Numeric expression to return if the first expression is _NaN_. Can be a constant, column, or function, and any combination of arithmetic operators.

###### Example

```sql
> SELECT nanvl(0, 5);
+------------+
| nanvl(0,5) |
+------------+
| 0          |
+------------+
```

##### `pi`

Returns an approximate value of π.

```sql
pi()
```

##### `pow`

_Alias of [power](#power)._

##### `power`

Returns a base expression raised to the power of an exponent.

```sql
power(base, exponent)
```

###### Arguments

- **base**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **exponent**: Exponent numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT power(2, 3);
+-------------+
| power(2,3)  |
+-------------+
| 8           |
+-------------+
```

###### Aliases

- pow

##### `radians`

Converts degrees to radians.

```sql
radians(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT radians(180);
+----------------+
| radians(180)   |
+----------------+
| 3.14159265359  |
+----------------+
```

##### `random`

Returns a random float value in the range [0, 1).
The random seed is unique to each row.

```sql
random()
```

###### Example

```sql
> SELECT random();
+------------------+
| random()         |
+------------------+
| 0.7389238902938  |
+------------------+
```

##### `round`

Rounds a number to the nearest integer.

```sql
round(numeric_expression[, decimal_places])
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **decimal_places**: Optional. The number of decimal places to round to. Defaults to 0.

###### Example

```sql
> SELECT round(3.14159);
+--------------+
| round(3.14159)|
+--------------+
| 3.0          |
+--------------+
```

##### `signum`

Returns the sign of a number.
Negative numbers return `-1`.
Zero and positive numbers return `1`.

```sql
signum(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT signum(-42);
+-------------+
| signum(-42) |
+-------------+
| -1          |
+-------------+
```

##### `sin`

Returns the sine of a number.

```sql
sin(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT sin(0);
+----------+
| sin(0)   |
+----------+
| 0.0      |
+----------+
```

##### `sinh`

Returns the hyperbolic sine of a number.

```sql
sinh(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT sinh(1);
+-----------+
| sinh(1)   |
+-----------+
| 1.1752012 |
+-----------+
```

##### `sqrt`

Returns the square root of a number.

```sql
sqrt(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

##### `tan`

Returns the tangent of a number.

```sql
tan(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT tan(pi()/4);
+--------------+
| tan(PI()/4)  |
+--------------+
| 1.0          |
+--------------+
```

##### `tanh`

Returns the hyperbolic tangent of a number.

```sql
tanh(numeric_expression)
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
  > SELECT tanh(20);
  +----------+
  | tanh(20) |
  +----------+
  | 1.0      |
  +----------+
```

##### `trunc`

Truncates a number to a whole number or truncated to the specified decimal places.

```sql
trunc(numeric_expression[, decimal_places])
```

###### Arguments

- **numeric_expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **decimal_places**: Optional. The number of decimal places to
  truncate to. Defaults to 0 (truncate to a whole number). If
  `decimal_places` is a positive integer, truncates digits to the
  right of the decimal point. If `decimal_places` is a negative
  integer, replaces digits to the left of the decimal point with `0`.

###### Example

```sql
> SELECT trunc(42.738);
+----------------+
| trunc(42.738)  |
+----------------+
| 42             |
+----------------+
```

### Conditional Functions

- [coalesce](#coalesce)
- [greatest](#greatest)
- [ifnull](#ifnull)
- [least](#least)
- [nullif](#nullif)
- [nvl](#nvl)
- [nvl2](#nvl2)

##### `coalesce`

Returns the first of its arguments that is not _null_. Returns _null_ if all arguments are _null_. This function is often used to substitute a default value for _null_ values.

```sql
coalesce(expression1[, ..., expression_n])
```

###### Arguments

- **expression1, expression_n**: Expression to use if previous expressions are _null_. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.

###### Example

```sql
> select coalesce(null, null, 'datafusion');
+----------------------------------------+
| coalesce(NULL,NULL,Utf8("datafusion")) |
+----------------------------------------+
| datafusion                             |
+----------------------------------------+
```

##### `greatest`

Returns the greatest value in a list of expressions. Returns _null_ if all expressions are _null_.

```sql
greatest(expression1[, ..., expression_n])
```

###### Arguments

- **expression1, expression_n**: Expressions to compare and return the greatest value.. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.

###### Example

```sql
> select greatest(4, 7, 5);
+---------------------------+
| greatest(4,7,5)           |
+---------------------------+
| 7                         |
+---------------------------+
```

##### `ifnull`

_Alias of [nvl](#nvl)._

##### `least`

Returns the smallest value in a list of expressions. Returns _null_ if all expressions are _null_.

```sql
least(expression1[, ..., expression_n])
```

###### Arguments

- **expression1, expression_n**: Expressions to compare and return the smallest value. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.

###### Example

```sql
> select least(4, 7, 5);
+---------------------------+
| least(4,7,5)              |
+---------------------------+
| 4                         |
+---------------------------+
```

##### `nullif`

Returns _null_ if _expression1_ equals _expression2_; otherwise it returns _expression1_.
This can be used to perform the inverse operation of [`coalesce`](#coalesce).

```sql
nullif(expression1, expression2)
```

###### Arguments

- **expression1**: Expression to compare and return if equal to expression2. Can be a constant, column, or function, and any combination of operators.
- **expression2**: Expression to compare to expression1. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select nullif('datafusion', 'data');
+-----------------------------------------+
| nullif(Utf8("datafusion"),Utf8("data")) |
+-----------------------------------------+
| datafusion                              |
+-----------------------------------------+
> select nullif('datafusion', 'datafusion');
+-----------------------------------------------+
| nullif(Utf8("datafusion"),Utf8("datafusion")) |
+-----------------------------------------------+
|                                               |
+-----------------------------------------------+
```

##### `nvl`

Returns _expression2_ if _expression1_ is NULL otherwise it returns _expression1_ and _expression2_ is not evaluated. This function can be used to substitute a default value for NULL values.

```sql
nvl(expression1, expression2)
```

###### Arguments

- **expression1**: Expression to return if not null. Can be a constant, column, or function, and any combination of operators.
- **expression2**: Expression to return if expr1 is null. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select nvl(null, 'a');
+---------------------+
| nvl(NULL,Utf8("a")) |
+---------------------+
| a                   |
+---------------------+\
> select nvl('b', 'a');
+--------------------------+
| nvl(Utf8("b"),Utf8("a")) |
+--------------------------+
| b                        |
+--------------------------+
```

###### Aliases

- ifnull

##### `nvl2`

Returns _expression2_ if _expression1_ is not NULL; otherwise it returns _expression3_.

```sql
nvl2(expression1, expression2, expression3)
```

###### Arguments

- **expression1**: Expression to test for null. Can be a constant, column, or function, and any combination of operators.
- **expression2**: Expression to return if expr1 is not null. Can be a constant, column, or function, and any combination of operators.
- **expression3**: Expression to return if expr1 is null. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select nvl2(null, 'a', 'b');
+--------------------------------+
| nvl2(NULL,Utf8("a"),Utf8("b")) |
+--------------------------------+
| b                              |
+--------------------------------+
> select nvl2('data', 'a', 'b');
+----------------------------------------+
| nvl2(Utf8("data"),Utf8("a"),Utf8("b")) |
+----------------------------------------+
| a                                      |
+----------------------------------------+
```

### String Functions

- [ascii](#ascii)
- [bit_length](#bit_length)
- [btrim](#btrim)
- [char_length](#char_length)
- [character_length](#character_length)
- [chr](#chr)
- [concat](#concat)
- [concat_ws](#concat_ws)
- [contains](#contains)
- [ends_with](#ends_with)
- [find_in_set](#find_in_set)
- [initcap](#initcap)
- [instr](#instr)
- [left](#left)
- [length](#length)
- [levenshtein](#levenshtein)
- [lower](#lower)
- [lpad](#lpad)
- [ltrim](#ltrim)
- [octet_length](#octet_length)
- [overlay](#overlay)
- [position](#position)
- [repeat](#repeat)
- [replace](#replace)
- [reverse](#reverse)
- [right](#right)
- [rpad](#rpad)
- [rtrim](#rtrim)
- [split_part](#split_part)
- [starts_with](#starts_with)
- [strpos](#strpos)
- [substr](#substr)
- [substr_index](#substr_index)
- [substring](#substring)
- [substring_index](#substring_index)
- [to_hex](#to_hex)
- [translate](#translate)
- [trim](#trim)
- [upper](#upper)
- [uuid](#uuid)

##### `ascii`

Returns the first Unicode scalar value of a string.

```sql
ascii(str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select ascii('abc');
+--------------------+
| ascii(Utf8("abc")) |
+--------------------+
| 97                 |
+--------------------+
> select ascii('🚀');
+-------------------+
| ascii(Utf8("🚀")) |
+-------------------+
| 128640            |
+-------------------+
```

**Related functions**:

- [chr](#chr)

##### `bit_length`

Returns the bit length of a string.

```sql
bit_length(str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select bit_length('datafusion');
+--------------------------------+
| bit_length(Utf8("datafusion")) |
+--------------------------------+
| 80                             |
+--------------------------------+
```

**Related functions**:

- [length](#length)
- [octet_length](#octet_length)

##### `btrim`

Trims the specified trim string from the start and end of a string. If no trim string is provided, all spaces are removed from the start and end of the input string.

```sql
btrim(str[, trim_str])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **trim_str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators. _Default is a space._

###### Example

```sql
> select btrim('__datafusion____', '_');
+-------------------------------------------+
| btrim(Utf8("__datafusion____"),Utf8("_")) |
+-------------------------------------------+
| datafusion                                |
+-------------------------------------------+
```

###### Alternative Syntax

```sql
trim(BOTH trim_str FROM str)
```

```sql
trim(trim_str FROM str)
```

###### Aliases

- trim

**Related functions**:

- [ltrim](#ltrim)
- [rtrim](#rtrim)

##### `char_length`

_Alias of [character_length](#character_length)._

##### `character_length`

Returns the number of characters in a string.

```sql
character_length(str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select character_length('Ångström');
+------------------------------------+
| character_length(Utf8("Ångström")) |
+------------------------------------+
| 8                                  |
+------------------------------------+
```

###### Aliases

- length
- char_length

**Related functions**:

- [bit_length](#bit_length)
- [octet_length](#octet_length)

##### `chr`

Returns a string containing the character with the specified Unicode scalar value.

```sql
chr(expression)
```

###### Arguments

- **expression**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select chr(128640);
+--------------------+
| chr(Int64(128640)) |
+--------------------+
| 🚀                 |
+--------------------+
```

**Related functions**:

- [ascii](#ascii)

##### `concat`

Concatenates multiple strings together.

```sql
concat(str[, ..., str_n])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **str_n**: Subsequent string expressions to concatenate.

###### Example

```sql
> select concat('data', 'f', 'us', 'ion');
+-------------------------------------------------------+
| concat(Utf8("data"),Utf8("f"),Utf8("us"),Utf8("ion")) |
+-------------------------------------------------------+
| datafusion                                            |
+-------------------------------------------------------+
```

**Related functions**:

- [concat_ws](#concat_ws)

##### `concat_ws`

Concatenates multiple strings together with a specified separator.

```sql
concat_ws(separator, str[, ..., str_n])
```

###### Arguments

- **separator**: Separator to insert between concatenated strings.
- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **str_n**: Subsequent string expressions to concatenate.

###### Example

```sql
> select concat_ws('_', 'data', 'fusion');
+--------------------------------------------------+
| concat_ws(Utf8("_"),Utf8("data"),Utf8("fusion")) |
+--------------------------------------------------+
| data_fusion                                      |
+--------------------------------------------------+
```

**Related functions**:

- [concat](#concat)

##### `contains`

Return true if search_str is found within string (case-sensitive).

```sql
contains(str, search_str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **search_str**: The string to search for in str.

###### Example

```sql
> select contains('the quick brown fox', 'row');
+---------------------------------------------------+
| contains(Utf8("the quick brown fox"),Utf8("row")) |
+---------------------------------------------------+
| true                                              |
+---------------------------------------------------+
```

##### `ends_with`

Tests if a string ends with a substring.

```sql
ends_with(str, substr)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **substr**: Substring to test for.

###### Example

```sql
>  select ends_with('datafusion', 'soin');
+--------------------------------------------+
| ends_with(Utf8("datafusion"),Utf8("soin")) |
+--------------------------------------------+
| false                                      |
+--------------------------------------------+
> select ends_with('datafusion', 'sion');
+--------------------------------------------+
| ends_with(Utf8("datafusion"),Utf8("sion")) |
+--------------------------------------------+
| true                                       |
+--------------------------------------------+
```

##### `find_in_set`

Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings.

```sql
find_in_set(str, strlist)
```

###### Arguments

- **str**: String expression to find in strlist.
- **strlist**: A string list is a string composed of substrings separated by , characters.

###### Example

```sql
> select find_in_set('b', 'a,b,c,d');
+----------------------------------------+
| find_in_set(Utf8("b"),Utf8("a,b,c,d")) |
+----------------------------------------+
| 2                                      |
+----------------------------------------+
```

##### `initcap`

Capitalizes the first character in each word in the input string. Words are delimited by non-alphanumeric characters.

```sql
initcap(str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select initcap('apache datafusion');
+------------------------------------+
| initcap(Utf8("apache datafusion")) |
+------------------------------------+
| Apache Datafusion                  |
+------------------------------------+
```

**Related functions**:

- [lower](#lower)
- [upper](#upper)

##### `instr`

_Alias of [strpos](#strpos)._

##### `left`

Returns a specified number of characters from the left side of a string.

```sql
left(str, n)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **n**: Number of characters to return.

###### Example

```sql
> select left('datafusion', 4);
+-----------------------------------+
| left(Utf8("datafusion"),Int64(4)) |
+-----------------------------------+
| data                              |
+-----------------------------------+
```

**Related functions**:

- [right](#right)

##### `length`

_Alias of [character_length](#character_length)._

##### `levenshtein`

Returns the [`Levenshtein distance`](https://en.wikipedia.org/wiki/Levenshtein_distance) between the two given strings.

```sql
levenshtein(str1, str2)
```

###### Arguments

- **str1**: String expression to compute Levenshtein distance with str2.
- **str2**: String expression to compute Levenshtein distance with str1.

###### Example

```sql
> select levenshtein('kitten', 'sitting');
+---------------------------------------------+
| levenshtein(Utf8("kitten"),Utf8("sitting")) |
+---------------------------------------------+
| 3                                           |
+---------------------------------------------+
```

##### `lower`

Converts a string to lower-case.

```sql
lower(str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select lower('Ångström');
+-------------------------+
| lower(Utf8("Ångström")) |
+-------------------------+
| ångström                |
+-------------------------+
```

**Related functions**:

- [initcap](#initcap)
- [upper](#upper)

##### `lpad`

Pads the left side of a string with another string to a specified string length.

```sql
lpad(str, n[, padding_str])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **n**: String length to pad to. If the input string is longer than this length, it is truncated (on the right).
- **padding_str**: Optional string expression to pad with. Can be a constant, column, or function, and any combination of string operators. _Default is a space._

###### Example

```sql
> select lpad('Dolly', 10, 'hello');
+---------------------------------------------+
| lpad(Utf8("Dolly"),Int64(10),Utf8("hello")) |
+---------------------------------------------+
| helloDolly                                  |
+---------------------------------------------+
```

**Related functions**:

- [rpad](#rpad)

##### `ltrim`

Trims the specified trim string from the beginning of a string. If no trim string is provided, spaces are removed from the start of the input string.

```sql
ltrim(str[, trim_str])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **trim_str**: String expression to trim from the beginning of the input string. Can be a constant, column, or function, and any combination of arithmetic operators. _Default is a space._

###### Example

```sql
> select ltrim('  datafusion  ');
+-------------------------------+
| ltrim(Utf8("  datafusion  ")) |
+-------------------------------+
| datafusion                    |
+-------------------------------+
> select ltrim('___datafusion___', '_');
+-------------------------------------------+
| ltrim(Utf8("___datafusion___"),Utf8("_")) |
+-------------------------------------------+
| datafusion___                             |
+-------------------------------------------+
```

###### Alternative Syntax

```sql
trim(LEADING trim_str FROM str)
```

**Related functions**:

- [btrim](#btrim)
- [rtrim](#rtrim)

##### `octet_length`

Returns the length of a string in bytes.

```sql
octet_length(str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select octet_length('Ångström');
+--------------------------------+
| octet_length(Utf8("Ångström")) |
+--------------------------------+
| 10                             |
+--------------------------------+
```

**Related functions**:

- [bit_length](#bit_length)
- [length](#length)

##### `overlay`

Returns the string which is replaced by another string from the specified position and specified count length.

```sql
overlay(str PLACING substr FROM pos [FOR count])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **substr**: Substring to replace in str.
- **pos**: The start position to start the replace in str.
- **count**: The count of characters to be replaced from start position of str. If not specified, will use substr length instead.

###### Example

```sql
> select overlay('Txxxxas' placing 'hom' from 2 for 4);
+--------------------------------------------------------+
| overlay(Utf8("Txxxxas"),Utf8("hom"),Int64(2),Int64(4)) |
+--------------------------------------------------------+
| Thomas                                                 |
+--------------------------------------------------------+
```

##### `position`

_Alias of [strpos](#strpos)._

##### `repeat`

Returns a string with an input string repeated a specified number.

```sql
repeat(str, n)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **n**: Number of times to repeat the input string.

###### Example

```sql
> select repeat('data', 3);
+-------------------------------+
| repeat(Utf8("data"),Int64(3)) |
+-------------------------------+
| datadatadata                  |
+-------------------------------+
```

##### `replace`

Replaces all occurrences of a specified substring in a string with a new substring.

```sql
replace(str, substr, replacement)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **substr**: Substring expression to replace in the input string. Substring expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **replacement**: Replacement substring expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select replace('ABabbaBA', 'ab', 'cd');
+-------------------------------------------------+
| replace(Utf8("ABabbaBA"),Utf8("ab"),Utf8("cd")) |
+-------------------------------------------------+
| ABcdbaBA                                        |
+-------------------------------------------------+
```

##### `reverse`

Reverses the character order of a string.

```sql
reverse(str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select reverse('datafusion');
+-----------------------------+
| reverse(Utf8("datafusion")) |
+-----------------------------+
| noisufatad                  |
+-----------------------------+
```

##### `right`

Returns a specified number of characters from the right side of a string.

```sql
right(str, n)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **n**: Number of characters to return.

###### Example

```sql
> select right('datafusion', 6);
+------------------------------------+
| right(Utf8("datafusion"),Int64(6)) |
+------------------------------------+
| fusion                             |
+------------------------------------+
```

**Related functions**:

- [left](#left)

##### `rpad`

Pads the right side of a string with another string to a specified string length.

```sql
rpad(str, n[, padding_str])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **n**: String length to pad to. If the input string is longer than this length, it is truncated.
- **padding_str**: String expression to pad with. Can be a constant, column, or function, and any combination of string operators. _Default is a space._

###### Example

```sql
>  select rpad('datafusion', 20, '_-');
+-----------------------------------------------+
| rpad(Utf8("datafusion"),Int64(20),Utf8("_-")) |
+-----------------------------------------------+
| datafusion_-_-_-_-_-                          |
+-----------------------------------------------+
```

**Related functions**:

- [lpad](#lpad)

##### `rtrim`

Trims the specified trim string from the end of a string. If no trim string is provided, all spaces are removed from the end of the input string.

```sql
rtrim(str[, trim_str])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **trim_str**: String expression to trim from the end of the input string. Can be a constant, column, or function, and any combination of arithmetic operators. _Default is a space._

###### Example

```sql
> select rtrim('  datafusion  ');
+-------------------------------+
| rtrim(Utf8("  datafusion  ")) |
+-------------------------------+
|   datafusion                  |
+-------------------------------+
> select rtrim('___datafusion___', '_');
+-------------------------------------------+
| rtrim(Utf8("___datafusion___"),Utf8("_")) |
+-------------------------------------------+
| ___datafusion                             |
+-------------------------------------------+
```

###### Alternative Syntax

```sql
trim(TRAILING trim_str FROM str)
```

**Related functions**:

- [btrim](#btrim)
- [ltrim](#ltrim)

##### `split_part`

Splits a string based on a specified delimiter and returns the substring in the specified position.

```sql
split_part(str, delimiter, pos)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **delimiter**: String or character to split on.
- **pos**: Position of the part to return (counting from 1). Negative values count backward from the end of the string.

###### Example

```sql
> select split_part('1.2.3.4.5', '.', 3);
+--------------------------------------------------+
| split_part(Utf8("1.2.3.4.5"),Utf8("."),Int64(3)) |
+--------------------------------------------------+
| 3                                                |
+--------------------------------------------------+
```

##### `starts_with`

Tests if a string starts with a substring.

```sql
starts_with(str, substr)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **substr**: Substring to test for.

###### Example

```sql
> select starts_with('datafusion','data');
+----------------------------------------------+
| starts_with(Utf8("datafusion"),Utf8("data")) |
+----------------------------------------------+
| true                                         |
+----------------------------------------------+
```

##### `strpos`

Returns the starting position of a specified substring in a string. Positions begin at 1. If the substring does not exist in the string, the function returns 0.

```sql
strpos(str, substr)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **substr**: Substring expression to search for.

###### Example

```sql
> select strpos('datafusion', 'fus');
+----------------------------------------+
| strpos(Utf8("datafusion"),Utf8("fus")) |
+----------------------------------------+
| 5                                      |
+----------------------------------------+
```

###### Alternative Syntax

```sql
position(substr in origstr)
```

###### Aliases

- instr
- position

##### `substr`

Extracts a substring of a specified number of characters from a specific starting position in a string.

```sql
substr(str, start_pos[, length])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **start_pos**: Character position to start the substring at. The first character in the string has a position of 1. If the start position is less than 1, it is treated as if it is before the start of the string and the (absolute) number of characters before position 1 is subtracted from `length` (if given). For example, `substr('abc', -3, 6)` returns `'ab'`.
- **length**: Number of characters to extract. If not specified, returns the rest of the string after the start position.

###### Example

```sql
> select substr('datafusion', 5, 3);
+----------------------------------------------+
| substr(Utf8("datafusion"),Int64(5),Int64(3)) |
+----------------------------------------------+
| fus                                          |
+----------------------------------------------+
```

###### Alternative Syntax

```sql
substring(str from start_pos for length)
```

###### Aliases

- substring

##### `substr_index`

Returns the substring from str before count occurrences of the delimiter delim.
If count is positive, everything to the left of the final delimiter (counting from the left) is returned.
If count is negative, everything to the right of the final delimiter (counting from the right) is returned.

```sql
substr_index(str, delim, count)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **delim**: The string to find in str to split str.
- **count**: The number of times to search for the delimiter. Can be either a positive or negative number.

###### Example

```sql
> select substr_index('www.apache.org', '.', 1);
+---------------------------------------------------------+
| substr_index(Utf8("www.apache.org"),Utf8("."),Int64(1)) |
+---------------------------------------------------------+
| www                                                     |
+---------------------------------------------------------+
> select substr_index('www.apache.org', '.', -1);
+----------------------------------------------------------+
| substr_index(Utf8("www.apache.org"),Utf8("."),Int64(-1)) |
+----------------------------------------------------------+
| org                                                      |
+----------------------------------------------------------+
```

###### Aliases

- substring_index

##### `substring`

_Alias of [substr](#substr)._

##### `substring_index`

_Alias of [substr_index](#substr_index)._

##### `to_hex`

Converts an integer to a hexadecimal string.

```sql
to_hex(int)
```

###### Arguments

- **int**: Integer expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select to_hex(12345689);
+-------------------------+
| to_hex(Int64(12345689)) |
+-------------------------+
| bc6159                  |
+-------------------------+
```

##### `translate`

Performs character-wise substitution based on a mapping.

```sql
translate(str, from, to)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **from**: The characters to be replaced.
- **to**: The characters to replace them with. Each character in **from** that is found in **str** is replaced by the character at the same index in **to**. Any characters in **from** that don't have a corresponding character in **to** are removed. If a character appears more than once in **from**, the first occurrence determines the mapping.

###### Example

```sql
> select translate('twice', 'wic', 'her');
+--------------------------------------------------+
| translate(Utf8("twice"),Utf8("wic"),Utf8("her")) |
+--------------------------------------------------+
| there                                            |
+--------------------------------------------------+
```

##### `trim`

_Alias of [btrim](#btrim)._

##### `upper`

Converts a string to upper-case.

```sql
upper(str)
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select upper('dataFusion');
+---------------------------+
| upper(Utf8("dataFusion")) |
+---------------------------+
| DATAFUSION                |
+---------------------------+
```

**Related functions**:

- [initcap](#initcap)
- [lower](#lower)

##### `uuid`

Returns [`UUID v4`](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_%28random%29) string value which is unique per row.

```sql
uuid()
```

###### Example

```sql
> select uuid();
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 6ec17ef8-1934-41cc-8d59-d0c8f9eea1f0 |
+--------------------------------------+
```

### Binary String Functions

- [decode](#decode)
- [encode](#encode)

##### `decode`

Decode binary data from textual representation in string.

```sql
decode(expression, format)
```

###### Arguments

- **expression**: Expression containing encoded string data
- **format**: Same arguments as [encode](#encode)

**Related functions**:

- [encode](#encode)

##### `encode`

Encode binary data into a textual representation.

```sql
encode(expression, format)
```

###### Arguments

- **expression**: Expression containing string or binary data
- **format**: Supported formats are: `base64`, `base64pad`, `hex`

**Related functions**:

- [decode](#decode)

### Regular Expression Functions

Apache DataFusion uses a [PCRE-like](https://en.wikibooks.org/wiki/Regular_Expressions/Perl-Compatible_Regular_Expressions)
regular expression [syntax](https://docs.rs/regex/latest/regex/#syntax)
(minus support for several features including look-around and backreferences).
The following regular expression functions are supported:

- [regexp_count](#regexp_count)
- [regexp_instr](#regexp_instr)
- [regexp_like](#regexp_like)
- [regexp_match](#regexp_match)
- [regexp_replace](#regexp_replace)

##### `regexp_count`

Returns the number of matches that a [regular expression](https://docs.rs/regex/latest/regex/#syntax) has in a string.

```sql
regexp_count(str, regexp[, start, flags])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **regexp**: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **start**: - **start**: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function.
- **flags**: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
  - **i**: case-insensitive: letters match both upper and lower case
  - **m**: multi-line mode: ^ and $ match begin/end of line
  - **s**: allow . to match \n
  - **R**: enables CRLF mode: when multi-line mode is enabled, \r\n is used
  - **U**: swap the meaning of x* and x*?

###### Example

```sql
> select regexp_count('abcAbAbc', 'abc', 2, 'i');
+---------------------------------------------------------------+
| regexp_count(Utf8("abcAbAbc"),Utf8("abc"),Int64(2),Utf8("i")) |
+---------------------------------------------------------------+
| 1                                                             |
+---------------------------------------------------------------+
```

##### `regexp_instr`

Returns the position in a string where the specified occurrence of a POSIX regular expression is located.

```sql
regexp_instr(str, regexp[, start[, N[, flags[, subexpr]]]])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **regexp**: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **start**: - **start**: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function. Defaults to 1
- **N**: - **N**: Optional The N-th occurrence of pattern to find. Defaults to 1 (first match). Can be a constant, column, or function.
- **flags**: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
  - **i**: case-insensitive: letters match both upper and lower case
  - **m**: multi-line mode: ^ and $ match begin/end of line
  - **s**: allow . to match \n
  - **R**: enables CRLF mode: when multi-line mode is enabled, \r\n is used
  - **U**: swap the meaning of x* and x*?
- **subexpr**: Optional Specifies which capture group (subexpression) to return the position for. Defaults to 0, which returns the position of the entire match.

###### Example

```sql
> SELECT regexp_instr('ABCDEF', 'C(.)(..)');
+---------------------------------------------------------------+
| regexp_instr(Utf8("ABCDEF"),Utf8("C(.)(..)"))                 |
+---------------------------------------------------------------+
| 3                                                             |
+---------------------------------------------------------------+
```

##### `regexp_like`

Returns true if a [regular expression](https://docs.rs/regex/latest/regex/#syntax) has at least one match in a string, false otherwise.

```sql
regexp_like(str, regexp[, flags])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **regexp**: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **flags**: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
  - **i**: case-insensitive: letters match both upper and lower case
  - **m**: multi-line mode: ^ and $ match begin/end of line
  - **s**: allow . to match \n
  - **R**: enables CRLF mode: when multi-line mode is enabled, \r\n is used
  - **U**: swap the meaning of x* and x*?

###### Example

```sql
select regexp_like('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');
+--------------------------------------------------------+
| regexp_like(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) |
+--------------------------------------------------------+
| true                                                   |
+--------------------------------------------------------+
SELECT regexp_like('aBc', '(b|d)', 'i');
+--------------------------------------------------+
| regexp_like(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) |
+--------------------------------------------------+
| true                                             |
+--------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/regexp.rs)

##### `regexp_match`

Returns the first [regular expression](https://docs.rs/regex/latest/regex/#syntax) matches in a string.

```sql
regexp_match(str, regexp[, flags])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **regexp**: Regular expression to match against.
  Can be a constant, column, or function.
- **flags**: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
  - **i**: case-insensitive: letters match both upper and lower case
  - **m**: multi-line mode: ^ and $ match begin/end of line
  - **s**: allow . to match \n
  - **R**: enables CRLF mode: when multi-line mode is enabled, \r\n is used
  - **U**: swap the meaning of x* and x*?

###### Example

```sql
            > select regexp_match('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');
            +---------------------------------------------------------+
            | regexp_match(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) |
            +---------------------------------------------------------+
            | [Köln]                                                  |
            +---------------------------------------------------------+
            SELECT regexp_match('aBc', '(b|d)', 'i');
            +---------------------------------------------------+
            | regexp_match(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) |
            +---------------------------------------------------+
            | [B]                                               |
            +---------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/regexp.rs)

##### `regexp_replace`

Replaces substrings in a string that match a [regular expression](https://docs.rs/regex/latest/regex/#syntax).

```sql
regexp_replace(str, regexp, replacement[, flags])
```

###### Arguments

- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **regexp**: Regular expression to match against.
  Can be a constant, column, or function.
- **replacement**: Replacement string expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **flags**: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- **g**: (global) Search globally and don't return after the first match
- **i**: case-insensitive: letters match both upper and lower case
- **m**: multi-line mode: ^ and $ match begin/end of line
- **s**: allow . to match \n
- **R**: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- **U**: swap the meaning of x* and x*?

###### Example

```sql
> select regexp_replace('foobarbaz', 'b(..)', 'X\\1Y', 'g');
+------------------------------------------------------------------------+
| regexp_replace(Utf8("foobarbaz"),Utf8("b(..)"),Utf8("X\1Y"),Utf8("g")) |
+------------------------------------------------------------------------+
| fooXarYXazY                                                            |
+------------------------------------------------------------------------+
SELECT regexp_replace('aBc', '(b|d)', 'Ab\\1a', 'i');
+-------------------------------------------------------------------+
| regexp_replace(Utf8("aBc"),Utf8("(b|d)"),Utf8("Ab\1a"),Utf8("i")) |
+-------------------------------------------------------------------+
| aAbBac                                                            |
+-------------------------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/regexp.rs)

### Time and Date Functions

- [current_date](#current_date)
- [current_time](#current_time)
- [current_timestamp](#current_timestamp)
- [date_bin](#date_bin)
- [date_format](#date_format)
- [date_part](#date_part)
- [date_trunc](#date_trunc)
- [datepart](#datepart)
- [datetrunc](#datetrunc)
- [from_unixtime](#from_unixtime)
- [make_date](#make_date)
- [make_time](#make_time)
- [now](#now)
- [to_char](#to_char)
- [to_date](#to_date)
- [to_local_time](#to_local_time)
- [to_time](#to_time)
- [to_timestamp](#to_timestamp)
- [to_timestamp_micros](#to_timestamp_micros)
- [to_timestamp_millis](#to_timestamp_millis)
- [to_timestamp_nanos](#to_timestamp_nanos)
- [to_timestamp_seconds](#to_timestamp_seconds)
- [to_unixtime](#to_unixtime)
- [today](#today)

##### `current_date`

Returns the current date in the session time zone.

The `current_date()` return value is determined at query time and will return the same date, no matter when in the query plan the function executes.

```sql
current_date()
    (optional) SET datafusion.execution.time_zone = '+00:00';
    SELECT current_date();
```

###### Example

```sql
> SELECT current_date();
+----------------+
| current_date() |
+----------------+
| 2024-12-23     |
+----------------+

-- The current date is based on the session time zone (UTC by default)
> SET datafusion.execution.time_zone = 'Asia/Tokyo';
> SELECT current_date();
+----------------+
| current_date() |
+----------------+
| 2024-12-24     |
+----------------+
```

###### Aliases

- today

##### `current_time`

Returns the current time in the session time zone.

The `current_time()` return value is determined at query time and will return the same time, no matter when in the query plan the function executes.

The session time zone can be set using the statement 'SET datafusion.execution.time_zone = desired time zone'. The time zone can be a value like +00:00, 'Europe/London' etc.

```sql
current_time()
    (optional) SET datafusion.execution.time_zone = '+00:00';
    SELECT current_time();
```

###### Example

```sql
> SELECT current_time();
+--------------------+
| current_time()     |
+--------------------+
| 06:30:00.123456789 |
+--------------------+

-- The current time is based on the session time zone (UTC by default)
> SET datafusion.execution.time_zone = 'Asia/Tokyo';
> SELECT current_time();
+--------------------+
| current_time()     |
+--------------------+
| 15:30:00.123456789 |
+--------------------+
```

##### `current_timestamp`

_Alias of [now](#now)._

##### `date_bin`

Calculates time intervals and returns the start of the interval nearest to the specified timestamp. Use `date_bin` to downsample time series data by grouping rows into time-based "bins" or "windows" and applying an aggregate or selector function to each window.

For example, if you "bin" or "window" data into 15 minute intervals, an input timestamp of `2023-01-01T18:18:18Z` will be updated to the start time of the 15 minute bin it is in: `2023-01-01T18:15:00Z`.

```sql
date_bin(interval, expression, origin-timestamp)
```

###### Arguments

- **interval**: Bin interval.
- **expression**: Time expression to operate on. Can be a constant, column, or function.
- **origin-timestamp**: Optional. Starting point used to determine bin boundaries. If not specified defaults 1970-01-01T00:00:00Z (the UNIX epoch in UTC). The following intervals are supported:

  - nanoseconds
  - microseconds
  - milliseconds
  - seconds
  - minutes
  - hours
  - days
  - weeks
  - months
  - years
  - century

###### Example

```sql
-- Bin the timestamp into 1 day intervals
> SELECT date_bin(interval '1 day', time) as bin
FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z')  t(time);
+---------------------+
| bin                 |
+---------------------+
| 2023-01-01T00:00:00 |
| 2023-01-03T00:00:00 |
+---------------------+
2 row(s) fetched.

-- Bin the timestamp into 1 day intervals starting at 3AM on  2023-01-01
> SELECT date_bin(interval '1 day', time,  '2023-01-01T03:00:00') as bin
FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z')  t(time);
+---------------------+
| bin                 |
+---------------------+
| 2023-01-01T03:00:00 |
| 2023-01-03T03:00:00 |
+---------------------+
2 row(s) fetched.

-- Bin the time into 15 minute intervals starting at 1 min
>  SELECT date_bin(interval '15 minutes', time, TIME '00:01:00') as bin
FROM VALUES (TIME '02:18:18'), (TIME '19:00:03')  t(time);
+----------+
| bin      |
+----------+
| 02:16:00 |
| 18:46:00 |
+----------+
2 row(s) fetched.
```

##### `date_format`

_Alias of [to_char](#to_char)._

##### `date_part`

Returns the specified part of the date as an integer.

```sql
date_part(part, expression)
```

###### Arguments

- **part**: Part of the date to return. The following date parts are supported:

  - year
  - isoyear (ISO 8601 week-numbering year)
  - quarter (emits value in inclusive range [1, 4] based on which quartile of the year the date is in)
  - month
  - week (week of the year)
  - day (day of the month)
  - hour
  - minute
  - second
  - millisecond
  - microsecond
  - nanosecond
  - dow (day of the week where Sunday is 0)
  - doy (day of the year)
  - epoch (seconds since Unix epoch for timestamps/dates, total seconds for intervals)
  - isodow (day of the week where Monday is 0)

- **expression**: Time expression to operate on. Can be a constant, column, or function.

###### Example

```sql
> SELECT date_part('year', '2024-05-01T00:00:00');
+-----------------------------------------------------+
| date_part(Utf8("year"),Utf8("2024-05-01T00:00:00")) |
+-----------------------------------------------------+
| 2024                                                |
+-----------------------------------------------------+
> SELECT extract(day FROM timestamp '2024-05-01T00:00:00');
+----------------------------------------------------+
| date_part(Utf8("DAY"),Utf8("2024-05-01T00:00:00")) |
+----------------------------------------------------+
| 1                                                  |
+----------------------------------------------------+
```

###### Alternative Syntax

```sql
extract(field FROM source)
```

###### Aliases

- datepart

##### `date_trunc`

Truncates a timestamp or time value to a specified precision.

```sql
date_trunc(precision, expression)
```

###### Arguments

- **precision**: Time precision to truncate to. The following precisions are supported:

  For Timestamp types:

  - year / YEAR
  - quarter / QUARTER
  - month / MONTH
  - week / WEEK
  - day / DAY
  - hour / HOUR
  - minute / MINUTE
  - second / SECOND
  - millisecond / MILLISECOND
  - microsecond / MICROSECOND

  For Time types (hour, minute, second, millisecond, microsecond only):

  - hour / HOUR
  - minute / MINUTE
  - second / SECOND
  - millisecond / MILLISECOND
  - microsecond / MICROSECOND

- **expression**: Timestamp or time expression to operate on. Can be a constant, column, or function.

###### Example

```sql
> SELECT date_trunc('month', '2024-05-15T10:30:00');
+-----------------------------------------------+
| date_trunc(Utf8("month"),Utf8("2024-05-15T10:30:00")) |
+-----------------------------------------------+
| 2024-05-01T00:00:00                           |
+-----------------------------------------------+
> SELECT date_trunc('hour', '2024-05-15T10:30:00');
+----------------------------------------------+
| date_trunc(Utf8("hour"),Utf8("2024-05-15T10:30:00")) |
+----------------------------------------------+
| 2024-05-15T10:00:00                          |
+----------------------------------------------+
```

###### Aliases

- datetrunc

##### `datepart`

_Alias of [date_part](#date_part)._

##### `datetrunc`

_Alias of [date_trunc](#date_trunc)._

##### `from_unixtime`

Converts an integer to RFC3339 timestamp format (`YYYY-MM-DDT00:00:00.000000000Z`). Integers and unsigned integers are interpreted as seconds since the unix epoch (`1970-01-01T00:00:00Z`) return the corresponding timestamp.

```sql
from_unixtime(expression[, timezone])
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **timezone**: Optional timezone to use when converting the integer to a timestamp. If not provided, the default timezone is UTC.

###### Example

```sql
> select from_unixtime(1599572549, 'America/New_York');
+-----------------------------------------------------------+
| from_unixtime(Int64(1599572549),Utf8("America/New_York")) |
+-----------------------------------------------------------+
| 2020-09-08T09:42:29-04:00                                 |
+-----------------------------------------------------------+
```

##### `make_date`

Make a date from year/month/day component parts.

```sql
make_date(year, month, day)
```

###### Arguments

- **year**: Year to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
- **month**: Month to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
- **day**: Day to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.

###### Example

```sql
> select make_date(2023, 1, 31);
+-------------------------------------------+
| make_date(Int64(2023),Int64(1),Int64(31)) |
+-------------------------------------------+
| 2023-01-31                                |
+-------------------------------------------+
> select make_date('2023', '01', '31');
+-----------------------------------------------+
| make_date(Utf8("2023"),Utf8("01"),Utf8("31")) |
+-----------------------------------------------+
| 2023-01-31                                    |
+-----------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `make_time`

Make a time from hour/minute/second component parts.

```sql
make_time(hour, minute, second)
```

###### Arguments

- **hour**: Hour to use when making the time. Can be a constant, column or function, and any combination of arithmetic operators.
- **minute**: Minute to use when making the time. Can be a constant, column or function, and any combination of arithmetic operators.
- **second**: Second to use when making the time. Can be a constant, column or function, and any combination of arithmetic operators.

###### Example

```sql
> select make_time(13, 23, 1);
+-------------------------------------------+
| make_time(Int64(13),Int64(23),Int64(1))   |
+-------------------------------------------+
| 13:23:01                                  |
+-------------------------------------------+
> select make_time('23', '01', '31');
+-----------------------------------------------+
| make_time(Utf8("23"),Utf8("01"),Utf8("31"))   |
+-----------------------------------------------+
| 23:01:31                                      |
+-----------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `now`

Returns the current timestamp in the system configured timezone (None by default).

The `now()` return value is determined at query time and will return the same timestamp, no matter when in the query plan the function executes.

```sql
now()
```

###### Example

```sql
> SELECT now();
+----------------------------------+
| now()                            |
+----------------------------------+
| 2024-12-23T06:30:00.123456789    |
+----------------------------------+

-- The timezone of the returned timestamp depends on the session time zone
> SET datafusion.execution.time_zone = 'America/New_York';
> SELECT now();
+--------------------------------------+
| now()                                |
+--------------------------------------+
| 2024-12-23T01:30:00.123456789-05:00  |
+--------------------------------------+
```

###### Aliases

- current_timestamp

##### `to_char`

Returns a string representation of a date, time, timestamp or duration based on a [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html). Unlike the PostgreSQL equivalent of this function numerical formatting is not supported.

```sql
to_char(expression, format)
```

###### Arguments

- **expression**: Expression to operate on. Can be a constant, column, or function that results in a date, time, timestamp or duration.
- **format**: A [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) string to use to convert the expression.
- **day**: Day to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.

###### Example

```sql
> select to_char('2023-03-01'::date, '%d-%m-%Y');
+----------------------------------------------+
| to_char(Utf8("2023-03-01"),Utf8("%d-%m-%Y")) |
+----------------------------------------------+
| 01-03-2023                                   |
+----------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

###### Aliases

- date_format

##### `to_date`

Converts a value to a date (`YYYY-MM-DD`).
Supports strings, numeric and timestamp types as input.
Strings are parsed as YYYY-MM-DD (e.g. '2023-07-20') if no [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html)s are provided.
Integers and doubles are interpreted as days since the unix epoch (`1970-01-01T00:00:00Z`).
Returns the corresponding date.

Note: `to_date` returns Date32, which represents its values as the number of days since unix epoch(`1970-01-01`) stored as signed 32 bit value. The largest supported date value is `9999-12-31`.

```sql
to_date('2017-05-31', '%Y-%m-%d')
```

###### Arguments

- **expression**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **format_n**: Optional [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) strings to use to parse the expression. Formats will be tried in the order
  they appear with the first successful one being returned. If none of the formats successfully parse the expression
  an error will be returned.

###### Example

```sql
> select to_date('2023-01-31');
+-------------------------------+
| to_date(Utf8("2023-01-31")) |
+-------------------------------+
| 2023-01-31                    |
+-------------------------------+
> select to_date('2023/01/31', '%Y-%m-%d', '%Y/%m/%d');
+---------------------------------------------------------------------+
| to_date(Utf8("2023/01/31"),Utf8("%Y-%m-%d"),Utf8("%Y/%m/%d")) |
+---------------------------------------------------------------------+
| 2023-01-31                                                          |
+---------------------------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `to_local_time`

Converts a timestamp with a timezone to a timestamp without a timezone (with no offset or timezone information). This function handles daylight saving time changes.

```sql
to_local_time(expression)
```

###### Arguments

- **expression**: Time expression to operate on. Can be a constant, column, or function.

###### Example

```sql
> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp);
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20                         |
+---------------------------------------------+

> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels');
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20                         |
+---------------------------------------------+

> SELECT
  time,
  arrow_typeof(time) as type,
  to_local_time(time) as to_local_time,
  arrow_typeof(to_local_time(time)) as to_local_time_type
FROM (
  SELECT '2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels' AS time
);
+---------------------------+----------------------------------+---------------------+--------------------+
| time                      | type                             | to_local_time       | to_local_time_type |
+---------------------------+----------------------------------+---------------------+--------------------+
| 2024-04-01T00:00:20+02:00 | Timestamp(ns, "Europe/Brussels") | 2024-04-01T00:00:20 | Timestamp(ns)      |
+---------------------------+----------------------------------+---------------------+--------------------+

## combine `to_local_time()` with `date_bin()` to bin on boundaries in the timezone rather
## than UTC boundaries

> SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AS date_bin;
+---------------------+
| date_bin            |
+---------------------+
| 2024-04-01T00:00:00 |
+---------------------+

> SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AT TIME ZONE 'Europe/Brussels' AS date_bin_with_timezone;
+---------------------------+
| date_bin_with_timezone    |
+---------------------------+
| 2024-04-01T00:00:00+02:00 |
+---------------------------+
```

##### `to_time`

Converts a value to a time (`HH:MM:SS.nnnnnnnnn`).
Supports strings and timestamps as input.
Strings are parsed as `HH:MM:SS`, `HH:MM:SS.nnnnnnnnn`, or `HH:MM` if no [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html)s are provided.
Timestamps will have the time portion extracted.
Returns the corresponding time.

Note: `to_time` returns Time64(Nanosecond), which represents the time of day in nanoseconds since midnight.

```sql
to_time('12:30:45', '%H:%M:%S')
```

###### Arguments

- **expression**: String or Timestamp expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **format_n**: Optional [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) strings to use to parse the expression. Formats will be tried in the order
  they appear with the first successful one being returned. If none of the formats successfully parse the expression
  an error will be returned.

###### Example

```sql
> select to_time('12:30:45');
+---------------------------+
| to_time(Utf8("12:30:45")) |
+---------------------------+
| 12:30:45                  |
+---------------------------+
> select to_time('12-30-45', '%H-%M-%S');
+--------------------------------------------+
| to_time(Utf8("12-30-45"),Utf8("%H-%M-%S")) |
+--------------------------------------------+
| 12:30:45                                   |
+--------------------------------------------+
> select to_time('2024-01-15 14:30:45'::timestamp);
+--------------------------------------------------+
| to_time(Utf8("2024-01-15 14:30:45"))             |
+--------------------------------------------------+
| 14:30:45                                         |
+--------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `to_timestamp`

Converts a value to a timestamp (`YYYY-MM-DDT00:00:00.000000<TZ>`) in the session time zone. Supports strings,
integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00')
if no [Chrono formats](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) are provided.
Strings that parse without a time zone are treated as if they are in the
session time zone, or UTC if no session time zone is set.
Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (`1970-01-01T00:00:00Z`).

Note: `to_timestamp` returns `Timestamp(ns, TimeZone)` where the time zone is the session time zone. The supported range
for integer input is between`-9223372037` and `9223372036`. Supported range for string input is between
`1677-09-21T00:12:44.0` and `2262-04-11T23:47:16.0`. Please use `to_timestamp_seconds`
for the input outside of supported bounds.

The session time zone can be set using the statement `SET TIMEZONE = 'desired time zone'`.
The time zone can be a value like +00:00, 'Europe/London' etc.

```sql
to_timestamp(expression[, ..., format_n])
```

###### Arguments

- **expression**: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- **format_n**:
  Optional [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) strings to use to parse the expression.
  Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully
  parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is
  only supported at the end of the string preceded by a space.

###### Example

```sql
> select to_timestamp('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------+
| to_timestamp(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------+
| 2023-01-31T14:26:56.123456789                             |
+-----------------------------------------------------------+
> select to_timestamp('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------+
| to_timestamp(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789                                                                          |
+--------------------------------------------------------------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `to_timestamp_micros`

Converts a value to a timestamp (`YYYY-MM-DDT00:00:00.000000<TZ>`) in the session time zone. Supports strings,
integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00')
if no [Chrono formats](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) are provided.
Strings that parse without a time zone are treated as if they are in the
session time zone, or UTC if no session time zone is set.
Integers, unsigned integers, and doubles are interpreted as microseconds since the unix epoch (`1970-01-01T00:00:00Z`).

The session time zone can be set using the statement `SET TIMEZONE = 'desired time zone'`.
The time zone can be a value like +00:00, 'Europe/London' etc.

```sql
to_timestamp_micros(expression[, ..., format_n])
```

###### Arguments

- **expression**: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- **format_n**:
  Optional [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) strings to use to parse the expression.
  Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully
  parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is
  only supported at the end of the string preceded by a space.

###### Example

```sql
> select to_timestamp_micros('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_micros(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123456                                       |
+------------------------------------------------------------------+
> select to_timestamp_micros('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_micros(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456                                                                                    |
+---------------------------------------------------------------------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `to_timestamp_millis`

Converts a value to a timestamp (`YYYY-MM-DDT00:00:00.000<TZ>`) in the session time zone. Supports strings,
integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00')
if no [Chrono formats](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) are provided.
Strings that parse without a time zone are treated as if they are in the
session time zone, or UTC if no session time zone is set.
Integers, unsigned integers, and doubles are interpreted as milliseconds since the unix epoch (`1970-01-01T00:00:00Z`).

The session time zone can be set using the statement `SET TIMEZONE = 'desired time zone'`.
The time zone can be a value like +00:00, 'Europe/London' etc.

```sql
to_timestamp_millis(expression[, ..., format_n])
```

###### Arguments

- **expression**: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- **format_n**:
  Optional [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) strings to use to parse the expression.
  Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully
  parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is
  only supported at the end of the string preceded by a space.

###### Example

```sql
> select to_timestamp_millis('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_millis(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123                                          |
+------------------------------------------------------------------+
> select to_timestamp_millis('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_millis(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123                                                                                       |
+---------------------------------------------------------------------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `to_timestamp_nanos`

Converts a value to a timestamp (`YYYY-MM-DDT00:00:00.000000000<TZ>`) in the session time zone. Supports strings,
integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00')
if no [Chrono formats](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) are provided.
Strings that parse without a time zone are treated as if they are in the
session time zone. Integers, unsigned integers, and doubles are interpreted as nanoseconds since the unix epoch (`1970-01-01T00:00:00Z`).

The session time zone can be set using the statement `SET TIMEZONE = 'desired time zone'`.
The time zone can be a value like +00:00, 'Europe/London' etc.

```sql
to_timestamp_nanos(expression[, ..., format_n])
```

###### Arguments

- **expression**: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- **format_n**:
  Optional [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) strings to use to parse the expression.
  Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully
  parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is
  only supported at the end of the string preceded by a space.

###### Example

```sql
> select to_timestamp_nanos('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------------+
| to_timestamp_nanos(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------------+
| 2023-01-31T14:26:56.123456789                                   |
+-----------------------------------------------------------------+
> select to_timestamp_nanos('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------------+
| to_timestamp_nanos(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789                                                                                |
+---------------------------------------------------------------------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `to_timestamp_seconds`

Converts a value to a timestamp (`YYYY-MM-DDT00:00:00<TZ>`) in the session time zone. Supports strings,
integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00')
if no [Chrono formats](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) are provided.
Strings that parse without a time zone are treated as if they are in the
session time zone, or UTC if no session time zone is set.
Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (`1970-01-01T00:00:00Z`).

The session time zone can be set using the statement `SET TIMEZONE = 'desired time zone'`.
The time zone can be a value like +00:00, 'Europe/London' etc.

```sql
to_timestamp_seconds(expression[, ..., format_n])
```

###### Arguments

- **expression**: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- **format_n**:
  Optional [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) strings to use to parse the expression.
  Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully
  parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is
  only supported at the end of the string preceded by a space.

###### Example

```sql
> select to_timestamp_seconds('2023-01-31T09:26:56.123456789-05:00');
+-------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-------------------------------------------------------------------+
| 2023-01-31T14:26:56                                               |
+-------------------------------------------------------------------+
> select to_timestamp_seconds('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+----------------------------------------------------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+----------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00                                                                                            |
+----------------------------------------------------------------------------------------------------------------+
```

Additional examples can be found [here](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/builtin_functions/date_time.rs)

##### `to_unixtime`

Converts a value to seconds since the unix epoch (`1970-01-01T00:00:00`).
Supports strings, dates, timestamps, integer, unsigned integer, and float types as input.
Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00')
if no [Chrono formats](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) are provided.
Integers, unsigned integers, and floats are interpreted as seconds since the unix epoch (`1970-01-01T00:00:00`).

```sql
to_unixtime(expression[, ..., format_n])
```

###### Arguments

- **expression**: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- **format_n**: Optional [Chrono format](https://docs.rs/chrono/latest/chrono/format/strftime/index.html) strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

###### Example

```sql
> select to_unixtime('2020-09-08T12:00:00+00:00');
+------------------------------------------------+
| to_unixtime(Utf8("2020-09-08T12:00:00+00:00")) |
+------------------------------------------------+
| 1599566400                                     |
+------------------------------------------------+
> select to_unixtime('01-14-2023 01:01:30+05:30', '%q', '%d-%m-%Y %H/%M/%S', '%+', '%m-%d-%Y %H:%M:%S%#z');
+-----------------------------------------------------------------------------------------------------------------------------+
| to_unixtime(Utf8("01-14-2023 01:01:30+05:30"),Utf8("%q"),Utf8("%d-%m-%Y %H/%M/%S"),Utf8("%+"),Utf8("%m-%d-%Y %H:%M:%S%#z")) |
+-----------------------------------------------------------------------------------------------------------------------------+
| 1673638290                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------+
```

##### `today`

_Alias of [current_date](#current_date)._

### Array Functions

- [array_any_value](#array_any_value)
- [array_append](#array_append)
- [array_cat](#array_cat)
- [array_concat](#array_concat)
- [array_contains](#array_contains)
- [array_dims](#array_dims)
- [array_distance](#array_distance)
- [array_distinct](#array_distinct)
- [array_element](#array_element)
- [array_empty](#array_empty)
- [array_except](#array_except)
- [array_extract](#array_extract)
- [array_has](#array_has)
- [array_has_all](#array_has_all)
- [array_has_any](#array_has_any)
- [array_indexof](#array_indexof)
- [array_intersect](#array_intersect)
- [array_join](#array_join)
- [array_length](#array_length)
- [array_max](#array_max)
- [array_min](#array_min)
- [array_ndims](#array_ndims)
- [array_pop_back](#array_pop_back)
- [array_pop_front](#array_pop_front)
- [array_position](#array_position)
- [array_positions](#array_positions)
- [array_prepend](#array_prepend)
- [array_push_back](#array_push_back)
- [array_push_front](#array_push_front)
- [array_remove](#array_remove)
- [array_remove_all](#array_remove_all)
- [array_remove_n](#array_remove_n)
- [array_repeat](#array_repeat)
- [array_replace](#array_replace)
- [array_replace_all](#array_replace_all)
- [array_replace_n](#array_replace_n)
- [array_resize](#array_resize)
- [array_reverse](#array_reverse)
- [array_slice](#array_slice)
- [array_sort](#array_sort)
- [array_to_string](#array_to_string)
- [array_union](#array_union)
- [arrays_overlap](#arrays_overlap)
- [arrays_zip](#arrays_zip)
- [cardinality](#cardinality)
- [empty](#empty)
- [flatten](#flatten)
- [generate_series](#generate_series)
- [list_any_value](#list_any_value)
- [list_append](#list_append)
- [list_cat](#list_cat)
- [list_concat](#list_concat)
- [list_contains](#list_contains)
- [list_dims](#list_dims)
- [list_distance](#list_distance)
- [list_distinct](#list_distinct)
- [list_element](#list_element)
- [list_empty](#list_empty)
- [list_except](#list_except)
- [list_extract](#list_extract)
- [list_has](#list_has)
- [list_has_all](#list_has_all)
- [list_has_any](#list_has_any)
- [list_indexof](#list_indexof)
- [list_intersect](#list_intersect)
- [list_join](#list_join)
- [list_length](#list_length)
- [list_max](#list_max)
- [list_ndims](#list_ndims)
- [list_pop_back](#list_pop_back)
- [list_pop_front](#list_pop_front)
- [list_position](#list_position)
- [list_positions](#list_positions)
- [list_prepend](#list_prepend)
- [list_push_back](#list_push_back)
- [list_push_front](#list_push_front)
- [list_remove](#list_remove)
- [list_remove_all](#list_remove_all)
- [list_remove_n](#list_remove_n)
- [list_repeat](#list_repeat)
- [list_replace](#list_replace)
- [list_replace_all](#list_replace_all)
- [list_replace_n](#list_replace_n)
- [list_resize](#list_resize)
- [list_reverse](#list_reverse)
- [list_slice](#list_slice)
- [list_sort](#list_sort)
- [list_to_string](#list_to_string)
- [list_union](#list_union)
- [list_zip](#list_zip)
- [make_array](#make_array)
- [make_list](#make_list)
- [range](#range)
- [string_to_array](#string_to_array)
- [string_to_list](#string_to_list)

##### `array_any_value`

Returns the first non-null element in the array.

```sql
array_any_value(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_any_value([NULL, 1, 2, 3]);
+-------------------------------+
| array_any_value(List([NULL,1,2,3])) |
+-------------------------------------+
| 1                                   |
+-------------------------------------+
```

###### Aliases

- list_any_value

##### `array_append`

Appends an element to the end of an array.

```sql
array_append(array, element)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Element to append to the array.

###### Example

```sql
> select array_append([1, 2, 3], 4);
+--------------------------------------+
| array_append(List([1,2,3]),Int64(4)) |
+--------------------------------------+
| [1, 2, 3, 4]                         |
+--------------------------------------+
```

###### Aliases

- list_append
- array_push_back
- list_push_back

##### `array_cat`

_Alias of [array_concat](#array_concat)._

##### `array_concat`

Concatenates arrays.

```sql
array_concat(array[, ..., array_n])
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **array_n**: Subsequent array column or literal array to concatenate.

###### Example

```sql
> select array_concat([1, 2], [3, 4], [5, 6]);
+---------------------------------------------------+
| array_concat(List([1,2]),List([3,4]),List([5,6])) |
+---------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                                |
+---------------------------------------------------+
```

###### Aliases

- array_cat
- list_concat
- list_cat

##### `array_contains`

_Alias of [array_has](#array_has)._

##### `array_dims`

Returns an array of the array's dimensions.

```sql
array_dims(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_dims([[1, 2, 3], [4, 5, 6]]);
+---------------------------------+
| array_dims(List([1,2,3,4,5,6])) |
+---------------------------------+
| [2, 3]                          |
+---------------------------------+
```

###### Aliases

- list_dims

##### `array_distance`

Returns the Euclidean distance between two input arrays of equal length.

```sql
array_distance(array1, array2)
```

###### Arguments

- **array1**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **array2**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_distance([1, 2], [1, 4]);
+------------------------------------+
| array_distance(List([1,2], [1,4])) |
+------------------------------------+
| 2.0                                |
+------------------------------------+
```

###### Aliases

- list_distance

##### `array_distinct`

Returns distinct values from the array after removing duplicates.

```sql
array_distinct(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_distinct([1, 3, 2, 3, 1, 2, 4]);
+---------------------------------+
| array_distinct(List([1,2,3,4])) |
+---------------------------------+
| [1, 2, 3, 4]                    |
+---------------------------------+
```

###### Aliases

- list_distinct

##### `array_element`

Extracts the element with the index n from the array.

```sql
array_element(array, index)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **index**: Index to extract the element from the array.

###### Example

```sql
> select array_element([1, 2, 3, 4], 3);
+-----------------------------------------+
| array_element(List([1,2,3,4]),Int64(3)) |
+-----------------------------------------+
| 3                                       |
+-----------------------------------------+
```

###### Aliases

- array_extract
- list_element
- list_extract

##### `array_empty`

_Alias of [empty](#empty)._

##### `array_except`

Returns an array of the elements that appear in the first array but not in the second.

```sql
array_except(array1, array2)
```

###### Arguments

- **array1**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **array2**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_except([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_except([1, 2, 3, 4], [5, 6, 3, 4]);           |
+----------------------------------------------------+
| [1, 2]                                              |
+----------------------------------------------------+
> select array_except([1, 2, 3, 4], [3, 4, 5, 6]);
+----------------------------------------------------+
| array_except([1, 2, 3, 4], [3, 4, 5, 6]);           |
+----------------------------------------------------+
| [1, 2]                                              |
+----------------------------------------------------+
```

###### Aliases

- list_except

##### `array_extract`

_Alias of [array_element](#array_element)._

##### `array_has`

Returns true if the array contains the element.

```sql
array_has(array, element)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Scalar or Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_has([1, 2, 3], 2);
+-----------------------------+
| array_has(List([1,2,3]), 2) |
+-----------------------------+
| true                        |
+-----------------------------+
```

###### Aliases

- list_has
- array_contains
- list_contains

##### `array_has_all`

Returns true if all elements of sub-array exist in array.

```sql
array_has_all(array, sub-array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **sub-array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_has_all([1, 2, 3, 4], [2, 3]);
+--------------------------------------------+
| array_has_all(List([1,2,3,4]), List([2,3])) |
+--------------------------------------------+
| true                                       |
+--------------------------------------------+
```

###### Aliases

- list_has_all

##### `array_has_any`

Returns true if the arrays have any elements in common.

```sql
array_has_any(array1, array2)
```

###### Arguments

- **array1**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **array2**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_has_any([1, 2, 3], [3, 4]);
+------------------------------------------+
| array_has_any(List([1,2,3]), List([3,4])) |
+------------------------------------------+
| true                                     |
+------------------------------------------+
```

###### Aliases

- list_has_any
- arrays_overlap

##### `array_indexof`

_Alias of [array_position](#array_position)._

##### `array_intersect`

Returns an array of elements in the intersection of array1 and array2.

```sql
array_intersect(array1, array2)
```

###### Arguments

- **array1**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **array2**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_intersect([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_intersect([1, 2, 3, 4], [5, 6, 3, 4]);       |
+----------------------------------------------------+
| [3, 4]                                             |
+----------------------------------------------------+
> select array_intersect([1, 2, 3, 4], [5, 6, 7, 8]);
+----------------------------------------------------+
| array_intersect([1, 2, 3, 4], [5, 6, 7, 8]);       |
+----------------------------------------------------+
| []                                                 |
+----------------------------------------------------+
```

###### Aliases

- list_intersect

##### `array_join`

_Alias of [array_to_string](#array_to_string)._

##### `array_length`

Returns the length of the array dimension.

```sql
array_length(array, dimension)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **dimension**: Array dimension.

###### Example

```sql
> select array_length([1, 2, 3, 4, 5], 1);
+-------------------------------------------+
| array_length(List([1,2,3,4,5]), 1)        |
+-------------------------------------------+
| 5                                         |
+-------------------------------------------+
```

###### Aliases

- list_length

##### `array_max`

Returns the maximum value in the array.

```sql
array_max(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_max([3,1,4,2]);
+-----------------------------------------+
| array_max(List([3,1,4,2]))              |
+-----------------------------------------+
| 4                                       |
+-----------------------------------------+
```

###### Aliases

- list_max

##### `array_min`

Returns the minimum value in the array.

```sql
array_min(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_min([3,1,4,2]);
+-----------------------------------------+
| array_min(List([3,1,4,2]))              |
+-----------------------------------------+
| 1                                       |
+-----------------------------------------+
```

##### `array_ndims`

Returns the number of dimensions of the array.

```sql
array_ndims(array, element)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Array element.

###### Example

```sql
> select array_ndims([[1, 2, 3], [4, 5, 6]]);
+----------------------------------+
| array_ndims(List([1,2,3,4,5,6])) |
+----------------------------------+
| 2                                |
+----------------------------------+
```

###### Aliases

- list_ndims

##### `array_pop_back`

Returns the array without the last element.

```sql
array_pop_back(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_pop_back([1, 2, 3]);
+-------------------------------+
| array_pop_back(List([1,2,3])) |
+-------------------------------+
| [1, 2]                        |
+-------------------------------+
```

###### Aliases

- list_pop_back

##### `array_pop_front`

Returns the array without the first element.

```sql
array_pop_front(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_pop_front([1, 2, 3]);
+-------------------------------+
| array_pop_front(List([1,2,3])) |
+-------------------------------+
| [2, 3]                        |
+-------------------------------+
```

###### Aliases

- list_pop_front

##### `array_position`

Returns the position of the first occurrence of the specified element in the array, or NULL if not found. Comparisons are done using `IS DISTINCT FROM` semantics, so NULL is considered to match NULL.

```sql
array_position(array, element)
array_position(array, element, index)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Element to search for in the array.
- **index**: Index at which to start searching (1-indexed).

###### Example

```sql
> select array_position([1, 2, 2, 3, 1, 4], 2);
+----------------------------------------------+
| array_position(List([1,2,2,3,1,4]),Int64(2)) |
+----------------------------------------------+
| 2                                            |
+----------------------------------------------+
> select array_position([1, 2, 2, 3, 1, 4], 2, 3);
+----------------------------------------------------+
| array_position(List([1,2,2,3,1,4]),Int64(2), Int64(3)) |
+----------------------------------------------------+
| 3                                                  |
+----------------------------------------------------+
```

###### Aliases

- list_position
- array_indexof
- list_indexof

##### `array_positions`

Searches for an element in the array, returns all occurrences.

```sql
array_positions(array, element)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Element to search for in the array.

###### Example

```sql
> select array_positions([1, 2, 2, 3, 1, 4], 2);
+-----------------------------------------------+
| array_positions(List([1,2,2,3,1,4]),Int64(2)) |
+-----------------------------------------------+
| [2, 3]                                        |
+-----------------------------------------------+
```

###### Aliases

- list_positions

##### `array_prepend`

Prepends an element to the beginning of an array.

```sql
array_prepend(element, array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Element to prepend to the array.

###### Example

```sql
> select array_prepend(1, [2, 3, 4]);
+---------------------------------------+
| array_prepend(Int64(1),List([2,3,4])) |
+---------------------------------------+
| [1, 2, 3, 4]                          |
+---------------------------------------+
```

###### Aliases

- list_prepend
- array_push_front
- list_push_front

##### `array_push_back`

_Alias of [array_append](#array_append)._

##### `array_push_front`

_Alias of [array_prepend](#array_prepend)._

##### `array_remove`

Removes the first element from the array equal to the given value. NULL elements already in the array are preserved when removing a non-NULL value. If `element` evaluates to NULL, the result is NULL rather than removing NULL entries.

```sql
array_remove(array, element)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Element to be removed from the array.

###### Example

```sql
> select array_remove([1, 2, 2, 3, 2, 1, 4], 2);
+----------------------------------------------+
| array_remove(List([1,2,2,3,2,1,4]),Int64(2)) |
+----------------------------------------------+
| [1, 2, 3, 2, 1, 4]                           |
+----------------------------------------------+

> select array_remove([1, 2, NULL, 2, 4], 2);
+---------------------------------------------------+
| array_remove(List([1,2,NULL,2,4]),Int64(2)) |
+---------------------------------------------------+
| [1, NULL, 2, 4]                              |
+---------------------------------------------------+
```

###### Aliases

- list_remove

##### `array_remove_all`

Removes all elements from the array equal to the given value. NULL elements already in the array are preserved when removing a non-NULL value. If `element` evaluates to NULL, the result is NULL rather than removing NULL entries.

```sql
array_remove_all(array, element)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Element to be removed from the array.

###### Example

```sql
> select array_remove_all([1, 2, 2, 3, 2, 1, 4], 2);
+--------------------------------------------------+
| array_remove_all(List([1,2,2,3,2,1,4]),Int64(2)) |
+--------------------------------------------------+
| [1, 3, 1, 4]                                     |
+--------------------------------------------------+

> select array_remove_all([1, 2, NULL, 2, 4], 2);
+-----------------------------------------------------+
| array_remove_all(List([1,2,NULL,2,4]),Int64(2)) |
+-----------------------------------------------------+
| [1, NULL, 4]                                     |
+-----------------------------------------------------+
```

###### Aliases

- list_remove_all

##### `array_remove_n`

Removes the first `max` elements from the array equal to the given value. NULL elements already in the array are preserved when removing a non-NULL value. If `element` evaluates to NULL, the result is NULL rather than removing NULL entries.

```sql
array_remove_n(array, element, max)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **element**: Element to be removed from the array.
- **max**: Number of first occurrences to remove.

###### Example

```sql
> select array_remove_n([1, 2, 2, 3, 2, 1, 4], 2, 2);
+---------------------------------------------------------+
| array_remove_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(2)) |
+---------------------------------------------------------+
| [1, 3, 2, 1, 4]                                         |
+---------------------------------------------------------+

> select array_remove_n([1, 2, NULL, 2, 4], 2, 2);
+----------------------------------------------------------+
| array_remove_n(List([1,2,NULL,2,4]),Int64(2),Int64(2)) |
+----------------------------------------------------------+
| [1, NULL, 4]                                            |
+----------------------------------------------------------+
```

###### Aliases

- list_remove_n

##### `array_repeat`

Returns an array containing element `count` times.

```sql
array_repeat(element, count)
```

###### Arguments

- **element**: Element expression. Can be a constant, column, or function, and any combination of array operators.
- **count**: Value of how many times to repeat the element.

###### Example

```sql
> select array_repeat(1, 3);
+---------------------------------+
| array_repeat(Int64(1),Int64(3)) |
+---------------------------------+
| [1, 1, 1]                       |
+---------------------------------+
> select array_repeat([1, 2], 2);
+------------------------------------+
| array_repeat(List([1,2]),Int64(2)) |
+------------------------------------+
| [[1, 2], [1, 2]]                   |
+------------------------------------+
```

###### Aliases

- list_repeat

##### `array_replace`

Replaces the first occurrence of the specified element with another specified element.

```sql
array_replace(array, from, to)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **from**: Initial element.
- **to**: Final element.

###### Example

```sql
> select array_replace([1, 2, 2, 3, 2, 1, 4], 2, 5);
+--------------------------------------------------------+
| array_replace(List([1,2,2,3,2,1,4]),Int64(2),Int64(5)) |
+--------------------------------------------------------+
| [1, 5, 2, 3, 2, 1, 4]                                  |
+--------------------------------------------------------+
```

###### Aliases

- list_replace

##### `array_replace_all`

Replaces all occurrences of the specified element with another specified element.

```sql
array_replace_all(array, from, to)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **from**: Initial element.
- **to**: Final element.

###### Example

```sql
> select array_replace_all([1, 2, 2, 3, 2, 1, 4], 2, 5);
+------------------------------------------------------------+
| array_replace_all(List([1,2,2,3,2,1,4]),Int64(2),Int64(5)) |
+------------------------------------------------------------+
| [1, 5, 5, 3, 5, 1, 4]                                      |
+------------------------------------------------------------+
```

###### Aliases

- list_replace_all

##### `array_replace_n`

Replaces the first `max` occurrences of the specified element with another specified element.

```sql
array_replace_n(array, from, to, max)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **from**: Initial element.
- **to**: Final element.
- **max**: Number of first occurrences to replace.

###### Example

```sql
> select array_replace_n([1, 2, 2, 3, 2, 1, 4], 2, 5, 2);
+-------------------------------------------------------------------+
| array_replace_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(5),Int64(2)) |
+-------------------------------------------------------------------+
| [1, 5, 5, 3, 2, 1, 4]                                             |
+-------------------------------------------------------------------+
```

###### Aliases

- list_replace_n

##### `array_resize`

Resizes the list to contain size elements. Initializes new elements with value or empty if value is not set.

```sql
array_resize(array, size, value)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **size**: New size of given array.
- **value**: Defines new elements' value or empty if value is not set.

###### Example

```sql
> select array_resize([1, 2, 3], 5, 0);
+-------------------------------------+
| array_resize(List([1,2,3],5,0))     |
+-------------------------------------+
| [1, 2, 3, 0, 0]                     |
+-------------------------------------+
```

###### Aliases

- list_resize

##### `array_reverse`

Returns the array with the order of the elements reversed.

```sql
array_reverse(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_reverse([1, 2, 3, 4]);
+------------------------------------------------------------+
| array_reverse(List([1, 2, 3, 4]))                          |
+------------------------------------------------------------+
| [4, 3, 2, 1]                                               |
+------------------------------------------------------------+
```

###### Aliases

- list_reverse

##### `array_slice`

Returns a slice of the array based on 1-indexed start and end positions.

```sql
array_slice(array, begin, end)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **begin**: Index of the first element. If negative, it counts backward from the end of the array.
- **end**: Index of the last element. If negative, it counts backward from the end of the array.
- **stride**: Stride of the array slice. The default is 1.

###### Example

```sql
> select array_slice([1, 2, 3, 4, 5, 6, 7, 8], 3, 6);
+--------------------------------------------------------+
| array_slice(List([1,2,3,4,5,6,7,8]),Int64(3),Int64(6)) |
+--------------------------------------------------------+
| [3, 4, 5, 6]                                           |
+--------------------------------------------------------+
```

###### Aliases

- list_slice

##### `array_sort`

Sort array.

```sql
array_sort(array, desc, nulls_first)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **desc**: Whether to sort in ascending (`ASC`) or descending (`DESC`) order. The default is `ASC`.
- **nulls_first**: Whether to sort nulls first (`NULLS FIRST`) or last (`NULLS LAST`). The default is `NULLS FIRST`.

###### Example

```sql
> select array_sort([3, 1, 2]);
+-----------------------------+
| array_sort(List([3,1,2]))   |
+-----------------------------+
| [1, 2, 3]                   |
+-----------------------------+
```

###### Aliases

- list_sort

##### `array_to_string`

Converts each element to its text representation.

```sql
array_to_string(array, delimiter[, null_string])
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **delimiter**: Array element separator.
- **null_string**: Optional. String to use for null values in the output. If not provided, nulls will be omitted.

###### Example

```sql
> select array_to_string([[1, 2, 3, 4], [5, 6, 7, 8]], ',');
+----------------------------------------------------+
| array_to_string(List([1,2,3,4,5,6,7,8]),Utf8(",")) |
+----------------------------------------------------+
| 1,2,3,4,5,6,7,8                                    |
+----------------------------------------------------+
```

###### Aliases

- list_to_string
- array_join
- list_join

##### `array_union`

Returns an array of elements that are present in both arrays (all elements from both arrays) without duplicates.

```sql
array_union(array1, array2)
```

###### Arguments

- **array1**: Array expression. Can be a constant, column, or function, and any combination of array operators.
- **array2**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select array_union([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_union([1, 2, 3, 4], [5, 6, 3, 4]);           |
+----------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                                 |
+----------------------------------------------------+
> select array_union([1, 2, 3, 4], [5, 6, 7, 8]);
+----------------------------------------------------+
| array_union([1, 2, 3, 4], [5, 6, 7, 8]);           |
+----------------------------------------------------+
| [1, 2, 3, 4, 5, 6, 7, 8]                           |
+----------------------------------------------------+
```

###### Aliases

- list_union

##### `arrays_overlap`

_Alias of [array_has_any](#array_has_any)._

##### `arrays_zip`

Returns an array of structs created by combining the elements of each input array at the same index. If the arrays have different lengths, shorter arrays are padded with NULLs.

```sql
arrays_zip(array1[, ..., array_n])
```

###### Arguments

- **array1**: First array expression.
- **array_n**: Optional additional array expressions.

###### Example

```sql
> select arrays_zip([1, 2, 3]);
+---------------------------------------------------+
| arrays_zip([1, 2, 3])                             |
+---------------------------------------------------+
| [{1: 1}, {1: 2}, {1: 3}]                          |
+---------------------------------------------------+
> select arrays_zip([1, 2], [3, 4, 5]);
+---------------------------------------------------+
| arrays_zip([1, 2], [3, 4, 5])                     |
+---------------------------------------------------+
| [{1: 1, 2: 3}, {1: 2, 2: 4}, {1: NULL, 2: 5}]     |
+---------------------------------------------------+
```

###### Aliases

- list_zip

##### `cardinality`

Returns the total number of elements in the array.

```sql
cardinality(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select cardinality([[1, 2, 3, 4], [5, 6, 7, 8]]);
+--------------------------------------+
| cardinality(List([1,2,3,4,5,6,7,8])) |
+--------------------------------------+
| 8                                    |
+--------------------------------------+
```

##### `empty`

Returns 1 for an empty array or 0 for a non-empty array.

```sql
empty(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select empty([1]);
+------------------+
| empty(List([1])) |
+------------------+
| 0                |
+------------------+
```

###### Aliases

- array_empty
- list_empty

##### `flatten`

Converts an array of arrays to a flat array.

- Applies to any depth of nested arrays
- Does not change arrays that are already flat

The flattened array contains all the elements from all source arrays.

```sql
flatten(array)
```

###### Arguments

- **array**: Array expression. Can be a constant, column, or function, and any combination of array operators.

###### Example

```sql
> select flatten([[1, 2], [3, 4]]);
+------------------------------+
| flatten(List([1,2], [3,4]))  |
+------------------------------+
| [1, 2, 3, 4]                 |
+------------------------------+
```

##### `generate_series`

Similar to the range function, but it includes the upper bound.

```sql
generate_series(stop)
generate_series(start, stop[, step])
```

###### Arguments

- **start**: Start of the series. Ints, timestamps, dates or string types that can be coerced to Date32 are supported.
- **end**: End of the series (included). Type must be the same as start.
- **step**: Increase by step (can not be 0). Steps less than a day are supported only for timestamp ranges.

###### Example

```sql
> select generate_series(1,3);
+------------------------------------+
| generate_series(Int64(1),Int64(3)) |
+------------------------------------+
| [1, 2, 3]                          |
+------------------------------------+
```

##### `list_any_value`

_Alias of [array_any_value](#array_any_value)._

##### `list_append`

_Alias of [array_append](#array_append)._

##### `list_cat`

_Alias of [array_concat](#array_concat)._

##### `list_concat`

_Alias of [array_concat](#array_concat)._

##### `list_contains`

_Alias of [array_has](#array_has)._

##### `list_dims`

_Alias of [array_dims](#array_dims)._

##### `list_distance`

_Alias of [array_distance](#array_distance)._

##### `list_distinct`

_Alias of [array_distinct](#array_distinct)._

##### `list_element`

_Alias of [array_element](#array_element)._

##### `list_empty`

_Alias of [empty](#empty)._

##### `list_except`

_Alias of [array_except](#array_except)._

##### `list_extract`

_Alias of [array_element](#array_element)._

##### `list_has`

_Alias of [array_has](#array_has)._

##### `list_has_all`

_Alias of [array_has_all](#array_has_all)._

##### `list_has_any`

_Alias of [array_has_any](#array_has_any)._

##### `list_indexof`

_Alias of [array_position](#array_position)._

##### `list_intersect`

_Alias of [array_intersect](#array_intersect)._

##### `list_join`

_Alias of [array_to_string](#array_to_string)._

##### `list_length`

_Alias of [array_length](#array_length)._

##### `list_max`

_Alias of [array_max](#array_max)._

##### `list_ndims`

_Alias of [array_ndims](#array_ndims)._

##### `list_pop_back`

_Alias of [array_pop_back](#array_pop_back)._

##### `list_pop_front`

_Alias of [array_pop_front](#array_pop_front)._

##### `list_position`

_Alias of [array_position](#array_position)._

##### `list_positions`

_Alias of [array_positions](#array_positions)._

##### `list_prepend`

_Alias of [array_prepend](#array_prepend)._

##### `list_push_back`

_Alias of [array_append](#array_append)._

##### `list_push_front`

_Alias of [array_prepend](#array_prepend)._

##### `list_remove`

_Alias of [array_remove](#array_remove)._

##### `list_remove_all`

_Alias of [array_remove_all](#array_remove_all)._

##### `list_remove_n`

_Alias of [array_remove_n](#array_remove_n)._

##### `list_repeat`

_Alias of [array_repeat](#array_repeat)._

##### `list_replace`

_Alias of [array_replace](#array_replace)._

##### `list_replace_all`

_Alias of [array_replace_all](#array_replace_all)._

##### `list_replace_n`

_Alias of [array_replace_n](#array_replace_n)._

##### `list_resize`

_Alias of [array_resize](#array_resize)._

##### `list_reverse`

_Alias of [array_reverse](#array_reverse)._

##### `list_slice`

_Alias of [array_slice](#array_slice)._

##### `list_sort`

_Alias of [array_sort](#array_sort)._

##### `list_to_string`

_Alias of [array_to_string](#array_to_string)._

##### `list_union`

_Alias of [array_union](#array_union)._

##### `list_zip`

_Alias of [arrays_zip](#arrays_zip)._

##### `make_array`

Returns an array using the specified input expressions.

```sql
make_array(expression1[, ..., expression_n])
```

###### Arguments

- **expression_n**: Expression to include in the output array. Can be a constant, column, or function, and any combination of arithmetic or string operators.

###### Example

```sql
> select make_array(1, 2, 3, 4, 5);
+----------------------------------------------------------+
| make_array(Int64(1),Int64(2),Int64(3),Int64(4),Int64(5)) |
+----------------------------------------------------------+
| [1, 2, 3, 4, 5]                                          |
+----------------------------------------------------------+
```

###### Aliases

- make_list

##### `make_list`

_Alias of [make_array](#make_array)._

##### `range`

Returns an Arrow array between start and stop with step. The range start..end contains all values with start &lt;= x &lt; end. It is empty if start >= end. Step cannot be 0.

```sql
range(stop)
range(start, stop[, step])
```

###### Arguments

- **start**: Start of the range. Ints, timestamps, dates or string types that can be coerced to Date32 are supported.
- **end**: End of the range (not included). Type must be the same as start.
- **step**: Increase by step (cannot be 0). Steps less than a day are supported only for timestamp ranges.

###### Example

```sql
> select range(2, 10, 3);
+-----------------------------------+
| range(Int64(2),Int64(10),Int64(3))|
+-----------------------------------+
| [2, 5, 8]                         |
+-----------------------------------+

> select range(DATE '1992-09-01', DATE '1993-03-01', INTERVAL '1' MONTH);
+--------------------------------------------------------------------------+
| range(DATE '1992-09-01', DATE '1993-03-01', INTERVAL '1' MONTH)          |
+--------------------------------------------------------------------------+
| [1992-09-01, 1992-10-01, 1992-11-01, 1992-12-01, 1993-01-01, 1993-02-01] |
+--------------------------------------------------------------------------+
```

##### `string_to_array`

Splits a string into an array of substrings based on a delimiter. Any substrings matching the optional `null_str` argument are replaced with NULL.

```sql
string_to_array(str, delimiter[, null_str])
```

###### Arguments

- **str**: String expression to split.
- **delimiter**: Delimiter string to split on.
- **null_str**: Substring values to be replaced with `NULL`.

###### Example

```sql
> select string_to_array('abc##def', '##');
+-----------------------------------+
| string_to_array(Utf8('abc##def'))  |
+-----------------------------------+
| ['abc', 'def']                    |
+-----------------------------------+
> select string_to_array('abc def', ' ', 'def');
+---------------------------------------------+
| string_to_array(Utf8('abc def'), Utf8(' '), Utf8('def')) |
+---------------------------------------------+
| ['abc', NULL]                               |
+---------------------------------------------+
```

###### Aliases

- string_to_list

##### `string_to_list`

_Alias of [string_to_array](#string_to_array)._

### Struct Functions

- [named_struct](#named_struct)
- [row](#row)
- [struct](#struct)

##### `named_struct`

Returns an Arrow struct using the specified name and input expressions pairs.
For information on comparing and ordering struct values (including `NULL` handling),
see [Comparison and Ordering](df-functions/struct_coercion.md#comparison-and-ordering).

```sql
named_struct(expression1_name, expression1_input[, ..., expression_n_name, expression_n_input])
```

###### Arguments

- **expression_n_name**: Name of the column field. Must be a constant string.
- **expression_n_input**: Expression to include in the output struct. Can be a constant, column, or function, and any combination of arithmetic or string operators.

###### Example

For example, this query converts two columns `a` and `b` to a single column with
a struct type of fields `field_a` and `field_b`:

```sql
> select * from t;
+---+---+
| a | b |
+---+---+
| 1 | 2 |
| 3 | 4 |
+---+---+
> select named_struct('field_a', a, 'field_b', b) from t;
+-------------------------------------------------------+
| named_struct(Utf8("field_a"),t.a,Utf8("field_b"),t.b) |
+-------------------------------------------------------+
| {field_a: 1, field_b: 2}                              |
| {field_a: 3, field_b: 4}                              |
+-------------------------------------------------------+
```

##### `row`

_Alias of [struct](#struct)._

##### `struct`

Returns an Arrow struct using the specified input expressions optionally named.
Fields in the returned struct use the optional name or the `cN` naming convention.
For example: `c0`, `c1`, `c2`, etc.
For information on comparing and ordering struct values (including `NULL` handling),
see [Comparison and Ordering](df-functions/struct_coercion.md#comparison-and-ordering).

```sql
struct(expression1[, ..., expression_n])
```

###### Arguments

- **expression1, expression_n**: Expression to include in the output struct. Can be a constant, column, or function, any combination of arithmetic or string operators.

###### Example

For example, this query converts two columns `a` and `b` to a single column with
a struct type of fields `field_a` and `c1`:

```sql
> select * from t;
+---+---+
| a | b |
+---+---+
| 1 | 2 |
| 3 | 4 |
+---+---+

-- use default names `c0`, `c1`
> select struct(a, b) from t;
+-----------------+
| struct(t.a,t.b) |
+-----------------+
| {c0: 1, c1: 2}  |
| {c0: 3, c1: 4}  |
+-----------------+

-- name the first field `field_a`
select struct(a as field_a, b) from t;
+--------------------------------------------------+
| named_struct(Utf8("field_a"),t.a,Utf8("c1"),t.b) |
+--------------------------------------------------+
| {field_a: 1, c1: 2}                              |
| {field_a: 3, c1: 4}                              |
+--------------------------------------------------+
```

###### Aliases

- row

### Map Functions

- [element_at](#element_at)
- [map](#map)
- [map_entries](#map_entries)
- [map_extract](#map_extract)
- [map_keys](#map_keys)
- [map_values](#map_values)

##### `element_at`

_Alias of [map_extract](#map_extract)._

##### `map`

Returns an Arrow map with the specified key-value pairs.

The `make_map` function creates a map from two lists: one for keys and one for values. Each key must be unique and non-null.

```sql
map(key, value)
map(key: value)
make_map(['key1', 'key2'], ['value1', 'value2'])
```

###### Arguments

- **key**: For `map`: Expression to be used for key. Can be a constant, column, function, or any combination of arithmetic or string operators.
  For `make_map`: The list of keys to be used in the map. Each key must be unique and non-null.
- **value**: For `map`: Expression to be used for value. Can be a constant, column, function, or any combination of arithmetic or string operators.
  For `make_map`: The list of values to be mapped to the corresponding keys.

###### Example

```sql
-- Using map function
SELECT MAP('type', 'test');
----
{type: test}

SELECT MAP(['POST', 'HEAD', 'PATCH'], [41, 33, null]);
----
{POST: 41, HEAD: 33, PATCH: NULL}

SELECT MAP([[1,2], [3,4]], ['a', 'b']);
----
{[1, 2]: a, [3, 4]: b}

SELECT MAP { 'a': 1, 'b': 2 };
----
{a: 1, b: 2}

-- Using make_map function
SELECT MAKE_MAP(['POST', 'HEAD'], [41, 33]);
----
{POST: 41, HEAD: 33}

SELECT MAKE_MAP(['key1', 'key2'], ['value1', null]);
----
{key1: value1, key2: }
```

##### `map_entries`

Returns a list of all entries in the map.

```sql
map_entries(map)
```

###### Arguments

- **map**: Map expression. Can be a constant, column, or function, and any combination of map operators.

###### Example

```sql
SELECT map_entries(MAP {'a': 1, 'b': NULL, 'c': 3});
----
[{'key': a, 'value': 1}, {'key': b, 'value': NULL}, {'key': c, 'value': 3}]

SELECT map_entries(map([100, 5], [42, 43]));
----
[{'key': 100, 'value': 42}, {'key': 5, 'value': 43}]
```

##### `map_extract`

Returns a list containing the value for the given key or an empty list if the key is not present in the map.

```sql
map_extract(map, key)
```

###### Arguments

- **map**: Map expression. Can be a constant, column, or function, and any combination of map operators.
- **key**: Key to extract from the map. Can be a constant, column, or function, any combination of arithmetic or string operators, or a named expression of the previously listed.

###### Example

```sql
SELECT map_extract(MAP {'a': 1, 'b': NULL, 'c': 3}, 'a');
----
[1]

SELECT map_extract(MAP {1: 'one', 2: 'two'}, 2);
----
['two']

SELECT map_extract(MAP {'x': 10, 'y': NULL, 'z': 30}, 'y');
----
[]
```

###### Aliases

- element_at

##### `map_keys`

Returns a list of all keys in the map.

```sql
map_keys(map)
```

###### Arguments

- **map**: Map expression. Can be a constant, column, or function, and any combination of map operators.

###### Example

```sql
SELECT map_keys(MAP {'a': 1, 'b': NULL, 'c': 3});
----
[a, b, c]

SELECT map_keys(map([100, 5], [42, 43]));
----
[100, 5]
```

##### `map_values`

Returns a list of all values in the map.

```sql
map_values(map)
```

###### Arguments

- **map**: Map expression. Can be a constant, column, or function, and any combination of map operators.

###### Example

```sql
SELECT map_values(MAP {'a': 1, 'b': NULL, 'c': 3});
----
[1, , 3]

SELECT map_values(map([100, 5], [42, 43]));
----
[42, 43]
```

### Hashing Functions

- [digest](#digest)
- [md5](#md5)
- [sha224](#sha224)
- [sha256](#sha256)
- [sha384](#sha384)
- [sha512](#sha512)

##### `digest`

Computes the binary hash of an expression using the specified algorithm.

```sql
digest(expression, algorithm)
```

###### Arguments

- **expression**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **algorithm**: String expression specifying algorithm to use. Must be one of:
  - md5
  - sha224
  - sha256
  - sha384
  - sha512
  - blake2s
  - blake2b
  - blake3

###### Example

```sql
> select digest('foo', 'sha256');
+------------------------------------------------------------------+
| digest(Utf8("foo"),Utf8("sha256"))                               |
+------------------------------------------------------------------+
| 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae |
+------------------------------------------------------------------+
```

##### `md5`

Computes an MD5 128-bit checksum for a string expression.

```sql
md5(expression)
```

###### Arguments

- **expression**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select md5('foo');
+----------------------------------+
| md5(Utf8("foo"))                 |
+----------------------------------+
| acbd18db4cc2f85cedef654fccc4a4d8 |
+----------------------------------+
```

##### `sha224`

Computes the SHA-224 hash of a binary string.

```sql
sha224(expression)
```

###### Arguments

- **expression**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select sha224('foo');
+----------------------------------------------------------+
| sha224(Utf8("foo"))                                      |
+----------------------------------------------------------+
| 0808f64e60d58979fcb676c96ec938270dea42445aeefcd3a4e6f8db |
+----------------------------------------------------------+
```

##### `sha256`

Computes the SHA-256 hash of a binary string.

```sql
sha256(expression)
```

###### Arguments

- **expression**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select sha256('foo');
+------------------------------------------------------------------+
| sha256(Utf8("foo"))                                              |
+------------------------------------------------------------------+
| 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae |
+------------------------------------------------------------------+
```

##### `sha384`

Computes the SHA-384 hash of a binary string.

```sql
sha384(expression)
```

###### Arguments

- **expression**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select sha384('foo');
+--------------------------------------------------------------------------------------------------+
| sha384(Utf8("foo"))                                                                              |
+--------------------------------------------------------------------------------------------------+
| 98c11ffdfdd540676b1a137cb1a22b2a70350c9a44171d6b1180c6be5cbb2ee3f79d532c8a1dd9ef2e8e08e752a3babb |
+--------------------------------------------------------------------------------------------------+
```

##### `sha512`

Computes the SHA-512 hash of a binary string.

```sql
sha512(expression)
```

###### Arguments

- **expression**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select sha512('foo');
+----------------------------------------------------------------------------------------------------------------------------------+
| sha512(Utf8("foo"))                                                                                                              |
+----------------------------------------------------------------------------------------------------------------------------------+
| f7fbba6e0636f890e56fbbf3283e524c6fa3204ae298382d624741d0dc6638326e282c41be5e4254d8820772c5518a2c5a8c0c7f7eda19594a7eb539453e1ed7 |
+----------------------------------------------------------------------------------------------------------------------------------+
```

### Union Functions

Functions to work with the union data type, also know as tagged unions, variant types, enums or sum types. Note: Not related to the SQL UNION operator

- [union_extract](#union_extract)
- [union_tag](#union_tag)

##### `union_extract`

Returns the value of the given field in the union when selected, or NULL otherwise.

```sql
union_extract(union, field_name)
```

###### Arguments

- **union**: Union expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **field_name**: String expression to operate on. Must be a constant.

###### Example

```sql
❯ select union_column, union_extract(union_column, 'a'), union_extract(union_column, 'b') from table_with_union;
+--------------+----------------------------------+----------------------------------+
| union_column | union_extract(union_column, 'a') | union_extract(union_column, 'b') |
+--------------+----------------------------------+----------------------------------+
| {a=1}        | 1                                |                                  |
| {b=3.0}      |                                  | 3.0                              |
| {a=4}        | 4                                |                                  |
| {b=}         |                                  |                                  |
| {a=}         |                                  |                                  |
+--------------+----------------------------------+----------------------------------+
```

##### `union_tag`

Returns the name of the currently selected field in the union

```sql
union_tag(union_expression)
```

###### Arguments

- **union**: Union expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
❯ select union_column, union_tag(union_column) from table_with_union;
+--------------+-------------------------+
| union_column | union_tag(union_column) |
+--------------+-------------------------+
| {a=1}        | a                       |
| {b=3.0}      | b                       |
| {a=4}        | a                       |
| {b=}         | b                       |
| {a=}         | a                       |
+--------------+-------------------------+
```

### Other Functions

- [arrow_cast](#arrow_cast)
- [arrow_metadata](#arrow_metadata)
- [arrow_try_cast](#arrow_try_cast)
- [arrow_typeof](#arrow_typeof)
- [get_field](#get_field)
- [version](#version)

##### `arrow_cast`

Casts a value to a specific Arrow data type.

```sql
arrow_cast(expression, datatype)
```

###### Arguments

- **expression**: Expression to cast. The expression can be a constant, column, or function, and any combination of operators.
- **datatype**: [Arrow data type](https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html) name to cast to, as a string. The format is the same as that returned by [`arrow_typeof`]

###### Example

```sql
> select
  arrow_cast(-5,    'Int8') as a,
  arrow_cast('foo', 'Dictionary(Int32, Utf8)') as b,
  arrow_cast('bar', 'LargeUtf8') as c;

+----+-----+-----+
| a  | b   | c   |
+----+-----+-----+
| -5 | foo | bar |
+----+-----+-----+

> select
  arrow_cast('2023-01-02T12:53:02', 'Timestamp(µs, "+08:00")') as d,
  arrow_cast('2023-01-02T12:53:02', 'Timestamp(µs)') as e;

+---------------------------+---------------------+
| d                         | e                   |
+---------------------------+---------------------+
| 2023-01-02T12:53:02+08:00 | 2023-01-02T12:53:02 |
+---------------------------+---------------------+
```

##### `arrow_metadata`

Returns the metadata of the input expression. If a key is provided, returns the value for that key. If no key is provided, returns a Map of all metadata.

```sql
arrow_metadata(expression[, key])
```

###### Arguments

- **expression**: The expression to retrieve metadata from. Can be a column or other expression.
- **key**: Optional. The specific metadata key to retrieve.

###### Example

```sql
> select arrow_metadata(col) from table;
+----------------------------+
| arrow_metadata(table.col)  |
+----------------------------+
| {k: v}                     |
+----------------------------+
> select arrow_metadata(col, 'k') from table;
+-------------------------------+
| arrow_metadata(table.col, 'k')|
+-------------------------------+
| v                             |
+-------------------------------+
```

##### `arrow_try_cast`

Casts a value to a specific Arrow data type, returning NULL if the cast fails.

```sql
arrow_try_cast(expression, datatype)
```

###### Arguments

- **expression**: Expression to cast. The expression can be a constant, column, or function, and any combination of operators.
- **datatype**: [Arrow data type](https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html) name to cast to, as a string. The format is the same as that returned by [`arrow_typeof`]

###### Example

```sql
> select arrow_try_cast('123', 'Int64') as a,
         arrow_try_cast('not_a_number', 'Int64') as b;

+-----+------+
| a   | b    |
+-----+------+
| 123 | NULL |
+-----+------+
```

##### `arrow_typeof`

Returns the name of the underlying [Arrow data type](https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html) of the expression.

```sql
arrow_typeof(expression)
```

###### Arguments

- **expression**: Expression to evaluate. The expression can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> select arrow_typeof('foo'), arrow_typeof(1);
+---------------------------+------------------------+
| arrow_typeof(Utf8("foo")) | arrow_typeof(Int64(1)) |
+---------------------------+------------------------+
| Utf8                      | Int64                  |
+---------------------------+------------------------+
```

##### `get_field`

Returns a field within a map or a struct with the given key.
Supports nested field access by providing multiple field names.
Note: most users invoke `get_field` indirectly via field access
syntax such as `my_struct_col['field_name']` which results in a call to
`get_field(my_struct_col, 'field_name')`.
Nested access like `my_struct['a']['b']` is optimized to a single call:
`get_field(my_struct, 'a', 'b')`.

```sql
get_field(expression, field_name[, field_name2, ...])
```

###### Arguments

- **expression**: The map or struct to retrieve a field from.
- **field_name**: The field name(s) to access, in order for nested access. Must evaluate to strings.

###### Example

```sql
> -- Access a field from a struct column
> create table test( struct_col) as values
    ({name: 'Alice', age: 30}),
    ({name: 'Bob', age: 25});
> select struct_col from test;
+-----------------------------+
| struct_col                  |
+-----------------------------+
| {name: Alice, age: 30}      |
| {name: Bob, age: 25}        |
+-----------------------------+
> select struct_col['name'] as name from test;
+-------+
| name  |
+-------+
| Alice |
| Bob   |
+-------+

> -- Nested field access with multiple arguments
> create table test(struct_col) as values
    ({outer: {inner_val: 42}});
> select struct_col['outer']['inner_val'] as result from test;
+--------+
| result |
+--------+
| 42     |
+--------+
```

##### `version`

Returns the version of DataFusion.

```sql
version()
```

###### Example

```sql
> select version();
+--------------------------------------------+
| version()                                  |
+--------------------------------------------+
| Apache DataFusion 42.0.0, aarch64 on macos |
+--------------------------------------------+
```
<!---
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing,
  software distributed under the License is distributed on an
  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  KIND, either express or implied.  See the License for the
  specific language governing permissions and limitations
  under the License.
-->

<!---
This file was generated by the dev/update_function_docs.sh script.
Do not edit it manually as changes will be overwritten.
Instead, edit the AggregateUDFImpl's documentation() function to
update documentation for an individual UDF or the
dev/update_function_docs.sh file for updating surrounding text.
-->

## Aggregate Functions

Aggregate functions operate on a set of values to compute a single result.

### Filter clause

Aggregate functions support the SQL `FILTER (WHERE ...)` clause to restrict which input rows contribute to the aggregate result.

```sql
function([exprs]) FILTER (WHERE condition)
```

Example:

```sql
SELECT
  sum(salary) FILTER (WHERE salary > 0) AS sum_positive_salaries,
  count(*)    FILTER (WHERE active)     AS active_count
FROM employees;
```

Note: When no rows pass the filter, `COUNT` returns `0` while `SUM`/`AVG`/`MIN`/`MAX` return `NULL`.

### WITHIN GROUP / Ordered-set aggregates

Some aggregate functions accept the SQL `WITHIN GROUP (ORDER BY ...)` clause to specify the ordering the
aggregate relies on. In DataFusion this is opt-in: only aggregate functions whose implementation returns
`true` from `AggregateUDFImpl::supports_within_group_clause()` accept the `WITHIN GROUP` clause. Attempting to
use `WITHIN GROUP` with a regular aggregate (for example, `SELECT SUM(x) WITHIN GROUP (ORDER BY x)`) will fail
during planning with an error: "WITHIN GROUP is only supported for ordered-set aggregate functions".

Currently, the built-in aggregate functions that support `WITHIN GROUP` are:

- `percentile_cont` — exact percentile aggregate (also available as `percentile_cont(column, percentile)`)
- `approx_percentile_cont` — approximate percentile using the t-digest algorithm
- `approx_percentile_cont_with_weight` — approximate weighted percentile using the t-digest algorithm

Note: rank-like functions such as `rank()`, `dense_rank()`, and `percent_rank()` are window functions and
use the `OVER (...)` clause; they are not ordered-set aggregates that accept `WITHIN GROUP` in DataFusion.

Example (ordered-set aggregate):

```sql
percentile_cont(0.5) WITHIN GROUP (ORDER BY value)
```

Example (invalid usage — planner will error):

```sql
-- This will fail: SUM is not an ordered-set aggregate
SELECT SUM(x) WITHIN GROUP (ORDER BY x) FROM t;
```

### General Functions

- [array_agg](#array_agg)
- [avg](#avg)
- [bit_and](#bit_and)
- [bit_or](#bit_or)
- [bit_xor](#bit_xor)
- [bool_and](#bool_and)
- [bool_or](#bool_or)
- [count](#count)
- [first_value](#first_value)
- [grouping](#grouping)
- [last_value](#last_value)
- [max](#max)
- [mean](#mean)
- [median](#median)
- [min](#min)
- [percentile_cont](#percentile_cont)
- [quantile_cont](#quantile_cont)
- [string_agg](#string_agg)
- [sum](#sum)
- [var](#var)
- [var_pop](#var_pop)
- [var_population](#var_population)
- [var_samp](#var_samp)
- [var_sample](#var_sample)

##### `array_agg`

Returns an array created from the expression elements. If ordering is required, elements are inserted in the specified order.
This aggregation function can only mix DISTINCT and ORDER BY if the ordering expression is exactly the same as the argument expression.

```sql
array_agg(expression [ORDER BY expression])
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT array_agg(column_name ORDER BY other_column) FROM table_name;
+-----------------------------------------------+
| array_agg(column_name ORDER BY other_column)  |
+-----------------------------------------------+
| [element1, element2, element3]                |
+-----------------------------------------------+
> SELECT array_agg(DISTINCT column_name ORDER BY column_name) FROM table_name;
+--------------------------------------------------------+
| array_agg(DISTINCT column_name ORDER BY column_name)  |
+--------------------------------------------------------+
| [element1, element2, element3]                         |
+--------------------------------------------------------+
```

##### `avg`

Returns the average of numeric values in the specified column.

```sql
avg(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT avg(column_name) FROM table_name;
+---------------------------+
| avg(column_name)           |
+---------------------------+
| 42.75                      |
+---------------------------+
```

###### Aliases

- mean

##### `bit_and`

Computes the bitwise AND of all non-null input values.

```sql
bit_and(expression)
```

###### Arguments

- **expression**: Integer expression to operate on. Can be a constant, column, or function, and any combination of operators.

##### `bit_or`

Computes the bitwise OR of all non-null input values.

```sql
bit_or(expression)
```

###### Arguments

- **expression**: Integer expression to operate on. Can be a constant, column, or function, and any combination of operators.

##### `bit_xor`

Computes the bitwise exclusive OR of all non-null input values.

```sql
bit_xor(expression)
```

###### Arguments

- **expression**: Integer expression to operate on. Can be a constant, column, or function, and any combination of operators.

##### `bool_and`

Returns true if all non-null input values are true, otherwise false.

```sql
bool_and(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT bool_and(column_name) FROM table_name;
+----------------------------+
| bool_and(column_name)       |
+----------------------------+
| true                        |
+----------------------------+
```

##### `bool_or`

Returns true if all non-null input values are true, otherwise false.

```sql
bool_and(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT bool_and(column_name) FROM table_name;
+----------------------------+
| bool_and(column_name)       |
+----------------------------+
| true                        |
+----------------------------+
```

##### `count`

Returns the number of non-null values in the specified column. To include null values in the total count, use `count(*)`.

```sql
count(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT count(column_name) FROM table_name;
+-----------------------+
| count(column_name)     |
+-----------------------+
| 100                   |
+-----------------------+

> SELECT count(*) FROM table_name;
+------------------+
| count(*)         |
+------------------+
| 120              |
+------------------+
```

##### `first_value`

Returns the first element in an aggregation group according to the requested ordering. If no ordering is given, returns an arbitrary element from the group.

```sql
first_value(expression [ORDER BY expression])
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT first_value(column_name ORDER BY other_column) FROM table_name;
+-----------------------------------------------+
| first_value(column_name ORDER BY other_column)|
+-----------------------------------------------+
| first_element                                 |
+-----------------------------------------------+
```

##### `grouping`

Returns 1 if the data is aggregated across the specified column, or 0 if it is not aggregated in the result set.

```sql
grouping(expression)
```

###### Arguments

- **expression**: Expression to evaluate whether data is aggregated across the specified column. Can be a constant, column, or function.

###### Example

```sql
> SELECT column_name, GROUPING(column_name) AS group_column
  FROM table_name
  GROUP BY GROUPING SETS ((column_name), ());
+-------------+-------------+
| column_name | group_column |
+-------------+-------------+
| value1      | 0           |
| value2      | 0           |
| NULL        | 1           |
+-------------+-------------+
```

##### `last_value`

Returns the last element in an aggregation group according to the requested ordering. If no ordering is given, returns an arbitrary element from the group.

```sql
last_value(expression [ORDER BY expression])
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT last_value(column_name ORDER BY other_column) FROM table_name;
+-----------------------------------------------+
| last_value(column_name ORDER BY other_column) |
+-----------------------------------------------+
| last_element                                  |
+-----------------------------------------------+
```

##### `max`

Returns the maximum value in the specified column.

```sql
max(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT max(column_name) FROM table_name;
+----------------------+
| max(column_name)      |
+----------------------+
| 150                  |
+----------------------+
```

##### `mean`

_Alias of [avg](#avg)._

##### `median`

Returns the median value in the specified column.

```sql
median(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT median(column_name) FROM table_name;
+----------------------+
| median(column_name)   |
+----------------------+
| 45.5                 |
+----------------------+
```

##### `min`

Returns the minimum value in the specified column.

```sql
min(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT min(column_name) FROM table_name;
+----------------------+
| min(column_name)      |
+----------------------+
| 12                   |
+----------------------+
```

##### `percentile_cont`

Returns the exact percentile of input values, interpolating between values if needed.

```sql
percentile_cont(percentile) WITHIN GROUP (ORDER BY expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **percentile**: Percentile to compute. Must be a float value between 0 and 1 (inclusive).

###### Example

```sql
> SELECT percentile_cont(0.75) WITHIN GROUP (ORDER BY column_name) FROM table_name;
+----------------------------------------------------------+
| percentile_cont(0.75) WITHIN GROUP (ORDER BY column_name) |
+----------------------------------------------------------+
| 45.5                                                     |
+----------------------------------------------------------+
```

An alternate syntax is also supported:

```sql
> SELECT percentile_cont(column_name, 0.75) FROM table_name;
+---------------------------------------+
| percentile_cont(column_name, 0.75)    |
+---------------------------------------+
| 45.5                                  |
+---------------------------------------+
```

###### Aliases

- quantile_cont

##### `quantile_cont`

_Alias of [percentile_cont](#percentile_cont)._

##### `string_agg`

Concatenates the values of string expressions and places separator values between them. If ordering is required, strings are concatenated in the specified order. This aggregation function can only mix DISTINCT and ORDER BY if the ordering expression is exactly the same as the first argument expression.

```sql
string_agg([DISTINCT] expression, delimiter [ORDER BY expression])
```

###### Arguments

- **expression**: The string expression to concatenate. Can be a column or any valid string expression.
- **delimiter**: A literal string used as a separator between the concatenated values.

###### Example

```sql
> SELECT string_agg(name, ', ') AS names_list
  FROM employee;
+--------------------------+
| names_list               |
+--------------------------+
| Alice, Bob, Bob, Charlie |
+--------------------------+
> SELECT string_agg(name, ', ' ORDER BY name DESC) AS names_list
  FROM employee;
+--------------------------+
| names_list               |
+--------------------------+
| Charlie, Bob, Bob, Alice |
+--------------------------+
> SELECT string_agg(DISTINCT name, ', ' ORDER BY name DESC) AS names_list
  FROM employee;
+--------------------------+
| names_list               |
+--------------------------+
| Charlie, Bob, Alice |
+--------------------------+
```

##### `sum`

Returns the sum of all values in the specified column.

```sql
sum(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT sum(column_name) FROM table_name;
+-----------------------+
| sum(column_name)       |
+-----------------------+
| 12345                 |
+-----------------------+
```

##### `var`

Returns the statistical sample variance of a set of numbers.

```sql
var(expression)
```

###### Arguments

- **expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Aliases

- var_sample
- var_samp

##### `var_pop`

Returns the statistical population variance of a set of numbers.

```sql
var_pop(expression)
```

###### Arguments

- **expression**: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Aliases

- var_population

##### `var_population`

_Alias of [var_pop](#var_pop)._

##### `var_samp`

_Alias of [var](#var)._

##### `var_sample`

_Alias of [var](#var)._

### Statistical Functions

- [corr](#corr)
- [covar](#covar)
- [covar_pop](#covar_pop)
- [covar_samp](#covar_samp)
- [nth_value](#nth_value)
- [regr_avgx](#regr_avgx)
- [regr_avgy](#regr_avgy)
- [regr_count](#regr_count)
- [regr_intercept](#regr_intercept)
- [regr_r2](#regr_r2)
- [regr_slope](#regr_slope)
- [regr_sxx](#regr_sxx)
- [regr_sxy](#regr_sxy)
- [regr_syy](#regr_syy)
- [stddev](#stddev)
- [stddev_pop](#stddev_pop)
- [stddev_samp](#stddev_samp)

##### `corr`

Returns the coefficient of correlation between two numeric values.

```sql
corr(expression1, expression2)
```

###### Arguments

- **expression1**: First expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression2**: Second expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT corr(column1, column2) FROM table_name;
+--------------------------------+
| corr(column1, column2)         |
+--------------------------------+
| 0.85                           |
+--------------------------------+
```

##### `covar`

_Alias of [covar_samp](#covar_samp)._

##### `covar_pop`

Returns the sample covariance of a set of number pairs.

```sql
covar_samp(expression1, expression2)
```

###### Arguments

- **expression1**: First expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression2**: Second expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT covar_samp(column1, column2) FROM table_name;
+-----------------------------------+
| covar_samp(column1, column2)      |
+-----------------------------------+
| 8.25                              |
+-----------------------------------+
```

##### `covar_samp`

Returns the sample covariance of a set of number pairs.

```sql
covar_samp(expression1, expression2)
```

###### Arguments

- **expression1**: First expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression2**: Second expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT covar_samp(column1, column2) FROM table_name;
+-----------------------------------+
| covar_samp(column1, column2)      |
+-----------------------------------+
| 8.25                              |
+-----------------------------------+
```

###### Aliases

- covar

##### `nth_value`

Returns the nth value in a group of values.

```sql
nth_value(expression, n ORDER BY expression)
```

###### Arguments

- **expression**: The column or expression to retrieve the nth value from.
- **n**: The position (nth) of the value to retrieve, based on the ordering.

###### Example

```sql
> SELECT dept_id, salary, NTH_VALUE(salary, 2) OVER (PARTITION BY dept_id ORDER BY salary ASC) AS second_salary_by_dept
  FROM employee;
+---------+--------+-------------------------+
| dept_id | salary | second_salary_by_dept   |
+---------+--------+-------------------------+
| 1       | 30000  | NULL                    |
| 1       | 40000  | 40000                   |
| 1       | 50000  | 40000                   |
| 2       | 35000  | NULL                    |
| 2       | 45000  | 45000                   |
+---------+--------+-------------------------+
```

##### `regr_avgx`

Computes the average of the independent variable (input) expression_x for the non-null paired data points.

```sql
regr_avgx(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table daily_sales(day int, total_sales int) as values (1,100), (2,150), (3,200), (4,NULL), (5,250);
select * from daily_sales;
+-----+-------------+
| day | total_sales |
| --- | ----------- |
| 1   | 100         |
| 2   | 150         |
| 3   | 200         |
| 4   | NULL        |
| 5   | 250         |
+-----+-------------+

SELECT regr_avgx(total_sales, day) AS avg_day FROM daily_sales;
+----------+
| avg_day  |
+----------+
|   2.75   |
+----------+
```

##### `regr_avgy`

Computes the average of the dependent variable (output) expression_y for the non-null paired data points.

```sql
regr_avgy(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table daily_temperature(day int, temperature int) as values (1,30), (2,32), (3, NULL), (4,35), (5,36);
select * from daily_temperature;
+-----+-------------+
| day | temperature |
| --- | ----------- |
| 1   | 30          |
| 2   | 32          |
| 3   | NULL        |
| 4   | 35          |
| 5   | 36          |
+-----+-------------+

-- temperature as Dependent Variable(Y), day as Independent Variable(X)
SELECT regr_avgy(temperature, day) AS avg_temperature FROM daily_temperature;
+-----------------+
| avg_temperature |
+-----------------+
| 33.25           |
+-----------------+
```

##### `regr_count`

Counts the number of non-null paired data points.

```sql
regr_count(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table daily_metrics(day int, user_signups int) as values (1,100), (2,120), (3, NULL), (4,110), (5,NULL);
select * from daily_metrics;
+-----+---------------+
| day | user_signups  |
| --- | ------------- |
| 1   | 100           |
| 2   | 120           |
| 3   | NULL          |
| 4   | 110           |
| 5   | NULL          |
+-----+---------------+

SELECT regr_count(user_signups, day) AS valid_pairs FROM daily_metrics;
+-------------+
| valid_pairs |
+-------------+
| 3           |
+-------------+
```

##### `regr_intercept`

Computes the y-intercept of the linear regression line. For the equation (y = kx + b), this function returns b.

```sql
regr_intercept(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table weekly_performance(week int, productivity_score int) as values (1,60), (2,65), (3, 70), (4,75), (5,80);
select * from weekly_performance;
+------+---------------------+
| week | productivity_score  |
| ---- | ------------------- |
| 1    | 60                  |
| 2    | 65                  |
| 3    | 70                  |
| 4    | 75                  |
| 5    | 80                  |
+------+---------------------+

SELECT regr_intercept(productivity_score, week) AS intercept FROM weekly_performance;
+----------+
|intercept|
|intercept |
+----------+
|  55      |
+----------+
```

##### `regr_r2`

Computes the square of the correlation coefficient between the independent and dependent variables.

```sql
regr_r2(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table weekly_performance(day int ,user_signups int) as values (1,60), (2,65), (3, 70), (4,75), (5,80);
select * from weekly_performance;
+-----+--------------+
| day | user_signups |
+-----+--------------+
| 1   | 60           |
| 2   | 65           |
| 3   | 70           |
| 4   | 75           |
| 5   | 80           |
+-----+--------------+

SELECT regr_r2(user_signups, day) AS r_squared FROM weekly_performance;
+---------+
|r_squared|
+---------+
| 1.0     |
+---------+
```

##### `regr_slope`

Returns the slope of the linear regression line for non-null pairs in aggregate columns. Given input column Y and X: regr_slope(Y, X) returns the slope (k in Y = k\*X + b) using minimal RSS fitting.

```sql
regr_slope(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table weekly_performance(day int, user_signups int) as values (1,60), (2,65), (3, 70), (4,75), (5,80);
select * from weekly_performance;
+-----+--------------+
| day | user_signups |
+-----+--------------+
| 1   | 60           |
| 2   | 65           |
| 3   | 70           |
| 4   | 75           |
| 5   | 80           |
+-----+--------------+

SELECT regr_slope(user_signups, day) AS slope FROM weekly_performance;
+--------+
| slope  |
+--------+
| 5.0    |
+--------+
```

##### `regr_sxx`

Computes the sum of squares of the independent variable.

```sql
regr_sxx(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table study_hours(student_id int, hours int, test_score int) as values (1,2,55), (2,4,65), (3,6,75), (4,8,85), (5,10,95);
select * from study_hours;
+------------+-------+------------+
| student_id | hours | test_score |
+------------+-------+------------+
| 1          | 2     | 55         |
| 2          | 4     | 65         |
| 3          | 6     | 75         |
| 4          | 8     | 85         |
| 5          | 10    | 95         |
+------------+-------+------------+

SELECT regr_sxx(test_score, hours) AS sxx FROM study_hours;
+------+
| sxx  |
+------+
| 40.0 |
+------+
```

##### `regr_sxy`

Computes the sum of products of paired data points.

```sql
regr_sxy(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table employee_productivity(week int, productivity_score int) as values(1,60), (2,65), (3,70);
select * from employee_productivity;
+------+--------------------+
| week | productivity_score |
+------+--------------------+
| 1    | 60                 |
| 2    | 65                 |
| 3    | 70                 |
+------+--------------------+

SELECT regr_sxy(productivity_score, week) AS sum_product_deviations FROM employee_productivity;
+------------------------+
| sum_product_deviations |
+------------------------+
|       10.0             |
+------------------------+
```

##### `regr_syy`

Computes the sum of squares of the dependent variable.

```sql
regr_syy(expression_y, expression_x)
```

###### Arguments

- **expression_y**: Dependent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **expression_x**: Independent variable expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
create table employee_productivity(week int, productivity_score int) as values (1,60), (2,65), (3,70);
select * from employee_productivity;
+------+--------------------+
| week | productivity_score |
+------+--------------------+
| 1    | 60                 |
| 2    | 65                 |
| 3    | 70                 |
+------+--------------------+

SELECT regr_syy(productivity_score, week) AS sum_squares_y FROM employee_productivity;
+---------------+
| sum_squares_y |
+---------------+
|    50.0       |
+---------------+
```

##### `stddev`

Returns the standard deviation of a set of numbers.

```sql
stddev(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT stddev(column_name) FROM table_name;
+----------------------+
| stddev(column_name)   |
+----------------------+
| 12.34                |
+----------------------+
```

###### Aliases

- stddev_samp

##### `stddev_pop`

Returns the population standard deviation of a set of numbers.

```sql
stddev_pop(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT stddev_pop(column_name) FROM table_name;
+--------------------------+
| stddev_pop(column_name)   |
+--------------------------+
| 10.56                    |
+--------------------------+
```

##### `stddev_samp`

_Alias of [stddev](#stddev)._

### Approximate Functions

- [approx_distinct](#approx_distinct)
- [approx_median](#approx_median)
- [approx_percentile_cont](#approx_percentile_cont)
- [approx_percentile_cont_with_weight](#approx_percentile_cont_with_weight)

##### `approx_distinct`

Returns the approximate number of distinct input values calculated using the HyperLogLog algorithm.

```sql
approx_distinct(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT approx_distinct(column_name) FROM table_name;
+-----------------------------------+
| approx_distinct(column_name)      |
+-----------------------------------+
| 42                                |
+-----------------------------------+
```

##### `approx_median`

Returns the approximate median (50th percentile) of input values. It is an alias of `approx_percentile_cont(0.5) WITHIN GROUP (ORDER BY x)`.

```sql
approx_median(expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.

###### Example

```sql
> SELECT approx_median(column_name) FROM table_name;
+-----------------------------------+
| approx_median(column_name)        |
+-----------------------------------+
| 23.5                              |
+-----------------------------------+
```

##### `approx_percentile_cont`

Returns the approximate percentile of input values using the t-digest algorithm.

```sql
approx_percentile_cont(percentile [, centroids]) WITHIN GROUP (ORDER BY expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **percentile**: Percentile to compute. Must be a float value between 0 and 1 (inclusive).
- **centroids**: Number of centroids to use in the t-digest algorithm. _Default is 100_. A higher number results in more accurate approximation but requires more memory.

###### Example

```sql
> SELECT approx_percentile_cont(0.75) WITHIN GROUP (ORDER BY column_name) FROM table_name;
+------------------------------------------------------------------+
| approx_percentile_cont(0.75) WITHIN GROUP (ORDER BY column_name) |
+------------------------------------------------------------------+
| 65.0                                                             |
+------------------------------------------------------------------+
> SELECT approx_percentile_cont(0.75, 100) WITHIN GROUP (ORDER BY column_name) FROM table_name;
+-----------------------------------------------------------------------+
| approx_percentile_cont(0.75, 100) WITHIN GROUP (ORDER BY column_name) |
+-----------------------------------------------------------------------+
| 65.0                                                                  |
+-----------------------------------------------------------------------+
```

An alternate syntax is also supported:

```sql
> SELECT approx_percentile_cont(column_name, 0.75) FROM table_name;
+-----------------------------------------------+
| approx_percentile_cont(column_name, 0.75)     |
+-----------------------------------------------+
| 65.0                                          |
+-----------------------------------------------+

> SELECT approx_percentile_cont(column_name, 0.75, 100) FROM table_name;
+----------------------------------------------------------+
| approx_percentile_cont(column_name, 0.75, 100)           |
+----------------------------------------------------------+
| 65.0                                                     |
+----------------------------------------------------------+
```

##### `approx_percentile_cont_with_weight`

Returns the weighted approximate percentile of input values using the t-digest algorithm.

```sql
approx_percentile_cont_with_weight(weight, percentile [, centroids]) WITHIN GROUP (ORDER BY expression)
```

###### Arguments

- **expression**: The expression to operate on. Can be a constant, column, or function, and any combination of operators.
- **weight**: Expression to use as weight. Can be a constant, column, or function, and any combination of arithmetic operators.
- **percentile**: Percentile to compute. Must be a float value between 0 and 1 (inclusive).
- **centroids**: Number of centroids to use in the t-digest algorithm. _Default is 100_. A higher number results in more accurate approximation but requires more memory.

###### Example

```sql
> SELECT approx_percentile_cont_with_weight(weight_column, 0.90) WITHIN GROUP (ORDER BY column_name) FROM table_name;
+---------------------------------------------------------------------------------------------+
| approx_percentile_cont_with_weight(weight_column, 0.90) WITHIN GROUP (ORDER BY column_name) |
+---------------------------------------------------------------------------------------------+
| 78.5                                                                                        |
+---------------------------------------------------------------------------------------------+
> SELECT approx_percentile_cont_with_weight(weight_column, 0.90, 100) WITHIN GROUP (ORDER BY column_name) FROM table_name;
+--------------------------------------------------------------------------------------------------+
| approx_percentile_cont_with_weight(weight_column, 0.90, 100) WITHIN GROUP (ORDER BY column_name) |
+--------------------------------------------------------------------------------------------------+
| 78.5                                                                                             |
+--------------------------------------------------------------------------------------------------+
```

An alternative syntax is also supported:

```sql
> SELECT approx_percentile_cont_with_weight(column_name, weight_column, 0.90) FROM table_name;
+--------------------------------------------------+
| approx_percentile_cont_with_weight(column_name, weight_column, 0.90) |
+--------------------------------------------------+
| 78.5                                             |
+--------------------------------------------------+
```
<!---
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing,
  software distributed under the License is distributed on an
  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  KIND, either express or implied.  See the License for the
  specific language governing permissions and limitations
  under the License.
-->

<!---
This file was generated by the dev/update_function_docs.sh script.
Do not edit it manually as changes will be overwritten.
Instead, edit the WindowUDFImpl's documentation() function to
update documentation for an individual UDF or the
dev/update_function_docs.sh file for updating surrounding text.
-->

## Window Functions

A _window function_ performs a calculation across a set of table rows that are somehow related to the current row.
This is comparable to the type of calculation that can be done with an aggregate function.
However, window functions do not cause rows to become grouped into a single output row like non-window aggregate calls would.
Instead, the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result

Here is an example that shows how to compare each employee's salary with the average salary in his or her department:

```sql
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;

+-----------+-------+--------+-------------------+
| depname   | empno | salary | avg               |
+-----------+-------+--------+-------------------+
| personnel | 2     | 3900   | 3700.0            |
| personnel | 5     | 3500   | 3700.0            |
| develop   | 8     | 6000   | 5020.0            |
| develop   | 10    | 5200   | 5020.0            |
| develop   | 11    | 5200   | 5020.0            |
| develop   | 9     | 4500   | 5020.0            |
| develop   | 7     | 4200   | 5020.0            |
| sales     | 1     | 5000   | 4866.666666666667 |
| sales     | 4     | 4800   | 4866.666666666667 |
| sales     | 3     | 4800   | 4866.666666666667 |
+-----------+-------+--------+-------------------+
```

A window function call always contains an OVER clause directly following the window function's name and argument(s). This is what syntactically distinguishes it from a normal function or non-window aggregate. The OVER clause determines exactly how the rows of the query are split up for processing by the window function. The PARTITION BY clause within OVER divides the rows into groups, or partitions, that share the same values of the PARTITION BY expression(s). For each row, the window function is computed across the rows that fall into the same partition as the current row. The previous example showed how to count the average of a column per partition.

You can also control the order in which rows are processed by window functions using ORDER BY within OVER. (The window ORDER BY does not even have to match the order in which the rows are output.) Here is an example:

```sql
SELECT depname, empno, salary,
       rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;

+-----------+-------+--------+--------+
| depname   | empno | salary | rank   |
+-----------+-------+--------+--------+
| personnel | 2     | 3900   | 1      |
| develop   | 8     | 6000   | 1      |
| develop   | 10    | 5200   | 2      |
| develop   | 11    | 5200   | 2      |
| develop   | 9     | 4500   | 4      |
| develop   | 7     | 4200   | 5      |
| sales     | 1     | 5000   | 1      |
| sales     | 4     | 4800   | 2      |
| personnel | 5     | 3500   | 2      |
| sales     | 3     | 4800   | 2      |
+-----------+-------+--------+--------+
```

There is another important concept associated with window functions: for each row, there is a set of rows within its partition called its window frame. Some window functions act only on the rows of the window frame, rather than of the whole partition. Here is an example of using window frames in queries:

```sql
SELECT depname, empno, salary,
    avg(salary) OVER(ORDER BY salary ASC ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS avg,
    min(salary) OVER(ORDER BY empno ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_min
FROM empsalary
ORDER BY empno ASC;

+-----------+-------+--------+--------------------+---------+
| depname   | empno | salary | avg                | cum_min |
+-----------+-------+--------+--------------------+---------+
| sales     | 1     | 5000   | 5000.0             | 5000    |
| personnel | 2     | 3900   | 3866.6666666666665 | 3900    |
| sales     | 3     | 4800   | 4700.0             | 3900    |
| sales     | 4     | 4800   | 4866.666666666667  | 3900    |
| personnel | 5     | 3500   | 3700.0             | 3500    |
| develop   | 7     | 4200   | 4200.0             | 3500    |
| develop   | 8     | 6000   | 5600.0             | 3500    |
| develop   | 9     | 4500   | 4500.0             | 3500    |
| develop   | 10    | 5200   | 5133.333333333333  | 3500    |
| develop   | 11    | 5200   | 5466.666666666667  | 3500    |
+-----------+-------+--------+--------------------+---------+
```

When a query involves multiple window functions, it is possible to write out each one with a separate OVER clause, but this is duplicative and error-prone if the same windowing behavior is wanted for several functions. Instead, each windowing behavior can be named in a WINDOW clause and then referenced in OVER. For example:

```sql
SELECT sum(salary) OVER w, avg(salary) OVER w
FROM empsalary
WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);
```

### Syntax

The syntax for the OVER-clause is

```sql
function([expr])
  OVER(
    [PARTITION BY expr[, …]]
    [ORDER BY expr [ ASC | DESC ][, …]]
    [ frame_clause ]
    )
```

where **frame_clause** is one of:

```sql
  { RANGE | ROWS | GROUPS } frame_start
  { RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end
```

and **frame_start** and **frame_end** can be one of

```sql
UNBOUNDED PRECEDING
offset PRECEDING
CURRENT ROW
offset FOLLOWING
UNBOUNDED FOLLOWING
```

where **offset** is an non-negative integer.

RANGE and GROUPS modes require an ORDER BY clause (with RANGE the ORDER BY must specify exactly one column).

### Filter clause for aggregate window functions

Aggregate window functions support the SQL `FILTER (WHERE ...)` clause to include only rows that satisfy the predicate from the window frame in the aggregation.

```sql
sum(salary) FILTER (WHERE salary > 0)
  OVER (PARTITION BY depname ORDER BY salary ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
```

If no rows in the frame satisfy the filter for a given output row, `COUNT` yields `0` while `SUM`/`AVG`/`MIN`/`MAX` yield `NULL`.

### Aggregate functions

All [aggregate functions](#aggregate-functions) can be used as window functions.

### Ranking Functions

- [cume_dist](#cume_dist)
- [dense_rank](#dense_rank)
- [ntile](#ntile)
- [percent_rank](#percent_rank)
- [rank](#rank)
- [row_number](#row_number)

##### `cume_dist`

Relative rank of the current row: (number of rows preceding or peer with the current row) / (total rows).

```sql
cume_dist()
```

###### Example

```sql
-- Example usage of the cume_dist window function:
SELECT salary,
    cume_dist() OVER (ORDER BY salary) AS cume_dist
FROM employees;

+--------+-----------+
| salary | cume_dist |
+--------+-----------+
| 30000  | 0.33      |
| 50000  | 0.67      |
| 70000  | 1.00      |
+--------+-----------+
```

##### `dense_rank`

Returns the rank of the current row without gaps. This function ranks rows in a dense manner, meaning consecutive ranks are assigned even for identical values.

```sql
dense_rank()
```

###### Example

```sql
-- Example usage of the dense_rank window function:
SELECT department,
    salary,
    dense_rank() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank
FROM employees;

+-------------+--------+------------+
| department  | salary | dense_rank |
+-------------+--------+------------+
| Sales       | 70000  | 1          |
| Sales       | 50000  | 2          |
| Sales       | 50000  | 2          |
| Sales       | 30000  | 3          |
| Engineering | 90000  | 1          |
| Engineering | 80000  | 2          |
+-------------+--------+------------+
```

##### `ntile`

Integer ranging from 1 to the argument value, dividing the partition as equally as possible

```sql
ntile(expression)
```

###### Arguments

- **expression**: An integer describing the number groups the partition should be split into

###### Example

```sql
-- Example usage of the ntile window function:
SELECT employee_id,
    salary,
    ntile(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees;

+-------------+--------+----------+
| employee_id | salary | quartile |
+-------------+--------+----------+
| 1           | 90000  | 1        |
| 2           | 85000  | 1        |
| 3           | 80000  | 2        |
| 4           | 70000  | 2        |
| 5           | 60000  | 3        |
| 6           | 50000  | 3        |
| 7           | 40000  | 4        |
| 8           | 30000  | 4        |
+-------------+--------+----------+
```

##### `percent_rank`

Returns the percentage rank of the current row within its partition. The value ranges from 0 to 1 and is computed as `(rank - 1) / (total_rows - 1)`.

```sql
percent_rank()
```

###### Example

```sql
    -- Example usage of the percent_rank window function:
SELECT employee_id,
    salary,
    percent_rank() OVER (ORDER BY salary) AS percent_rank
FROM employees;

+-------------+--------+---------------+
| employee_id | salary | percent_rank  |
+-------------+--------+---------------+
| 1           | 30000  | 0.00          |
| 2           | 50000  | 0.50          |
| 3           | 70000  | 1.00          |
+-------------+--------+---------------+
```

##### `rank`

Returns the rank of the current row within its partition, allowing gaps between ranks. This function provides a ranking similar to `row_number`, but skips ranks for identical values.

```sql
rank()
```

###### Example

```sql
-- Example usage of the rank window function:
SELECT department,
    salary,
    rank() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;

+-------------+--------+------+
| department  | salary | rank |
+-------------+--------+------+
| Sales       | 70000  | 1    |
| Sales       | 50000  | 2    |
| Sales       | 50000  | 2    |
| Sales       | 30000  | 4    |
| Engineering | 90000  | 1    |
| Engineering | 80000  | 2    |
+-------------+--------+------+
```

##### `row_number`

Number of the current row within its partition, counting from 1.

```sql
row_number()
```

###### Example

```sql
-- Example usage of the row_number window function:
SELECT department,
  salary,
  row_number() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;

+-------------+--------+---------+
| department  | salary | row_num |
+-------------+--------+---------+
| Sales       | 70000  | 1       |
| Sales       | 50000  | 2       |
| Sales       | 50000  | 3       |
| Sales       | 30000  | 4       |
| Engineering | 90000  | 1       |
| Engineering | 80000  | 2       |
+-------------+--------+---------+
```

### Analytical Functions

- [first_value](#first_value)
- [lag](#lag)
- [last_value](#last_value)
- [lead](#lead)
- [nth_value](#nth_value)

##### `first_value`

Returns value evaluated at the row that is the first row of the window frame.

```sql
first_value(expression)
```

###### Arguments

- **expression**: Expression to operate on

###### Example

```sql
-- Example usage of the first_value window function:
SELECT department,
  employee_id,
  salary,
  first_value(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS top_salary
FROM employees;

+-------------+-------------+--------+------------+
| department  | employee_id | salary | top_salary |
+-------------+-------------+--------+------------+
| Sales       | 1           | 70000  | 70000      |
| Sales       | 2           | 50000  | 70000      |
| Sales       | 3           | 30000  | 70000      |
| Engineering | 4           | 90000  | 90000      |
| Engineering | 5           | 80000  | 90000      |
+-------------+-------------+--------+------------+
```

##### `lag`

Returns value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead return default (which must be of the same type as value).

```sql
lag(expression, offset, default)
```

###### Arguments

- **expression**: Expression to operate on
- **offset**: Integer. Specifies how many rows back the value of expression should be retrieved. Defaults to 1.
- **default**: The default value if the offset is not within the partition. Must be of the same type as expression.

###### Example

```sql
-- Example usage of the lag window function:
SELECT employee_id,
    salary,
    lag(salary, 1, 0) OVER (ORDER BY employee_id) AS prev_salary
FROM employees;

+-------------+--------+-------------+
| employee_id | salary | prev_salary |
+-------------+--------+-------------+
| 1           | 30000  | 0           |
| 2           | 50000  | 30000       |
| 3           | 70000  | 50000       |
| 4           | 60000  | 70000       |
+-------------+--------+-------------+
```

##### `last_value`

Returns value evaluated at the row that is the last row of the window frame.

```sql
last_value(expression)
```

###### Arguments

- **expression**: Expression to operate on

###### Example

```sql
-- SQL example of last_value:
SELECT department,
       employee_id,
       salary,
       last_value(salary) OVER (PARTITION BY department ORDER BY salary) AS running_last_salary
FROM employees;

+-------------+-------------+--------+---------------------+
| department  | employee_id | salary | running_last_salary |
+-------------+-------------+--------+---------------------+
| Sales       | 1           | 30000  | 30000               |
| Sales       | 2           | 50000  | 50000               |
| Sales       | 3           | 70000  | 70000               |
| Engineering | 4           | 40000  | 40000               |
| Engineering | 5           | 60000  | 60000               |
+-------------+-------------+--------+---------------------+
```

##### `lead`

Returns value evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead return default (which must be of the same type as value).

```sql
lead(expression, offset, default)
```

###### Arguments

- **expression**: Expression to operate on
- **offset**: Integer. Specifies how many rows forward the value of expression should be retrieved. Defaults to 1.
- **default**: The default value if the offset is not within the partition. Must be of the same type as expression.

###### Example

```sql
-- Example usage of lead window function:
SELECT
    employee_id,
    department,
    salary,
    lead(salary, 1, 0) OVER (PARTITION BY department ORDER BY salary) AS next_salary
FROM employees;

+-------------+-------------+--------+--------------+
| employee_id | department  | salary | next_salary  |
+-------------+-------------+--------+--------------+
| 1           | Sales       | 30000  | 50000        |
| 2           | Sales       | 50000  | 70000        |
| 3           | Sales       | 70000  | 0            |
| 4           | Engineering | 40000  | 60000        |
| 5           | Engineering | 60000  | 0            |
+-------------+-------------+--------+--------------+
```

##### `nth_value`

Returns the value evaluated at the nth row of the window frame (counting from 1). Returns NULL if no such row exists.

```sql
nth_value(expression, n)
```

###### Arguments

- **expression**: The column from which to retrieve the nth value.
- **n**: Integer. Specifies the row number (starting from 1) in the window frame.

###### Example

```sql
-- Sample employees table:
CREATE TABLE employees (id INT, salary INT);
INSERT INTO employees (id, salary) VALUES
(1, 30000),
(2, 40000),
(3, 50000),
(4, 60000),
(5, 70000);

-- Example usage of nth_value:
SELECT nth_value(salary, 2) OVER (
  ORDER BY salary
  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS nth_value
FROM employees;

+-----------+
| nth_value |
+-----------+
| 40000     |
| 40000     |
| 40000     |
| 40000     |
| 40000     |
+-----------+
```
<!---
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing,
  software distributed under the License is distributed on an
  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  KIND, either express or implied.  See the License for the
  specific language governing permissions and limitations
  under the License.
-->

## Special Functions

### Expansion Functions

- [unnest](#unnest)
- [unnest(struct)](#unnest-struct)

##### `unnest`

Expands an array or map into rows.

###### Arguments

- **array**: Array expression to unnest.
  Can be a constant, column, or function, and any combination of array operators.

###### Examples

```sql
> select unnest(make_array(1, 2, 3, 4, 5)) as unnested;
+----------+
| unnested |
+----------+
| 1        |
| 2        |
| 3        |
| 4        |
| 5        |
+----------+
```

```sql
> select unnest(range(0, 10)) as unnested_range;
+----------------+
| unnested_range |
+----------------+
| 0              |
| 1              |
| 2              |
| 3              |
| 4              |
| 5              |
| 6              |
| 7              |
| 8              |
| 9              |
+----------------+
```

##### `unnest (struct)`

Expand a struct fields into individual columns.
Each field of the struct will be prefixed with `__unnest_placeholder` and could be accessed via `"__unnest_placeholder(<struct>).<field>"`.

###### Arguments

- **struct**: Object expression to unnest.
  Can be a constant, column, or function, and any combination of object operators.

###### Examples

```sql
> create table foo as values ({a: 5, b: 'a string'}), ({a:6, b: 'another string'});

> create view foov as select column1 as struct_column from foo;

> select * from foov;
+---------------------------+
| struct_column             |
+---------------------------+
| {a: 5, b: a string}       |
| {a: 6, b: another string} |
+---------------------------+

> select unnest(struct_column) from foov;
+--------------------------------------------+--------------------------------------------+
| __unnest_placeholder(foov.struct_column).a | __unnest_placeholder(foov.struct_column).b |
+--------------------------------------------+--------------------------------------------+
| 5                                          | a string                                   |
| 6                                          | another string                             |
+--------------------------------------------+--------------------------------------------+
```