Plugin Loaded Functions¶
These functions are optional, opt-in functions that can be loaded as needed. For more details on loading these functions, refer to the presto-sql-helpers README.
Array Functions¶
- array_intersect(array(array(E))) -> array(E)¶
Returns an array of the elements in the intersection of all arrays in the given array, without duplicates. This function uses
IS NOT DISTINCT FROMto determine which elements are the same.SELECT array_intersect(ARRAY[ARRAY[1, 2, 3, 2, null], ARRAY[1, 2, 2, 4, null], ARRAY [1, 2, 3, 4, null]]) -- ARRAY[1, 2, null]
- array_average(array(double)) -> double()¶
Returns the average of all non-null elements of the
array. If there are no non-null elements, returnsnull.
- array_split_into_chunks(array(T), int) -> array(array(T))¶
Returns an
arrayof arrays splitting the inputarrayinto chunks of given length. The last chunk will be shorter than the chunk length if the array’s length is not an integer multiple of the chunk length. Ignores null inputs, but not elements.SELECT array_split_into_chunks(ARRAY [1, 2, 3, 4], 3); -- [[1, 2, 3], [4]] SELECT array_split_into_chunks(null, null); -- null SELECT array_split_into_chunks(array[1, 2, 3, cast(null as int)], 2); -- [[1, 2], [3, null]]
- array_frequency(array(E)) -> map(E, int)¶
Returns a map: keys are the unique elements in the
array, values are how many times the key appears. Ignores null elements. Empty array returns empty map.
- array_duplicates(array(T)) -> array(bigint/varchar)¶
Returns a set of elements that occur more than once in
array. Throws an exception if any of the elements are rows or arrays that contain nulls.SELECT array_duplicates(ARRAY[1, 2, null, 1, null, 3]) -- ARRAY[1, null] SELECT array_duplicates(ARRAY[ROW(1, null), ROW(1, null)]) -- "map key cannot be null or contain nulls"
- array_has_duplicates(array(T)) -> boolean()¶
Returns a boolean: whether
arrayhas any elements that occur more than once. Throws an exception if any of the elements are rows or arrays that contain nulls.SELECT array_has_duplicates(ARRAY[1, 2, null, 1, null, 3]) -- true SELECT array_has_duplicates(ARRAY[ROW(1, null), ROW(1, null)]) -- "map key cannot be null or contain nulls"
- array_least_frequent(array(T)) -> array(T)¶
Returns the least frequent non-null element of an array. If there are multiple elements with the same frequency, the function returns the smallest element. If the array has more than one element and any elements are
ROWSwith null fields orARRAYSwith null elements, an exception is returned.SELECT array_least_frequent(ARRAY[1, 0 , 5]) -- ARRAY[0] select array_least_frequent(ARRAY[1, null, 1]) -- ARRAY[1] select array_least_frequent(ARRAY[ROW(1,null), ROW(1, null)]) -- "map key cannot be null or contain nulls"
- array_least_frequent(array(T), n) -> array(T)¶
Returns
nleast frequent non-null elements of an array. The elements are ordered in increasing order of their frequencies. If two elements have the same frequency, smaller elements will appear first. If the array has more than one element and any elements areROWSwith null fields orARRAYSwith null elements, an exception is returned.SELECT array_least_frequent(ARRAY[3, 2, 2, 6, 6, 1, 1], 3) -- ARRAY[3, 1, 2] select array_least_frequent(ARRAY[1, null, 1], 2) -- ARRAY[1] select array_least_frequent(ARRAY[ROW(1,null), ROW(1, null)], 2) -- "map key cannot be null or contain nulls"
- array_max_by(array(T), function(T, U)) -> T()¶
Applies the provided function to each element, and returns the element that gives the maximum value.
Ucan be any orderable type.SELECT array_max_by(ARRAY ['a', 'bbb', 'cc'], x -> LENGTH(x)) -- 'bbb'
- array_min_by(array(T), function(T, U)) -> T()¶
Applies the provided function to each element, and returns the element that gives the minimum value.
Ucan be any orderable type.SELECT array_min_by(ARRAY ['a', 'bbb', 'cc'], x -> LENGTH(x)) -- 'a'
- array_sort_desc(x) -> array()¶
Returns the
arraysorted in the descending order. Elements of thearraymust be orderable. Null elements are placed at the end of the returned array.SELECT array_sort_desc(ARRAY [100, 1, 10, 50]); -- [100, 50, 10, 1] SELECT array_sort_desc(ARRAY [null, 100, null, 1, 10, 50]); -- [100, 50, 10, 1, null, null] SELECT array_sort_desc(ARRAY [ARRAY ["a", null], null, ARRAY ["a"]]); -- [["a", null], ["a"], null]
- remove_nulls(array(T)) -> array()¶
Remove all null elements in the array.
- array_top_n(array(T), int) -> array(T)¶
Returns an array of the top
nelements from a givenarray, sorted according to its natural descending order. Ifnis larger than the size of the givenarray, the returned list will be the same size as the input instead ofn.SELECT array_top_n(ARRAY [1, 100, 2, 5, 3], 3); -- [100, 5, 3] SELECT array_top_n(ARRAY [1, 100], 5); -- [100, 1] SELECT array_top_n(ARRAY ['a', 'zzz', 'zz', 'b', 'g', 'f'], 3); -- ['zzz', 'zz', 'g']
- array_top_n(array(T), int, function(T, T, int)) -> array(T)¶
Returns an array of the top
nelements from a givenarrayusing the specified comparatorfunction. The comparator will take two nullable arguments representing two nullable elements of thearray. It returns -1, 0, or 1 as the first nullable element is less than, equal to, or greater than the second nullable element. If the comparator function returns other values (includingNULL), the query will fail and raise an error. Ifnis larger than the size of the givenarray, the returned list will be the same size as the input instead ofn.SELECT array_top_n(ARRAY [100, 1, 3, -10, 6, -5], 3, (x, y) -> IF(abs(x) < abs(y), -1, IF(abs(x) = abs(y), 0, 1))); -- [100, -10, 6] SELECT array_top_n(ARRAY [CAST(ROW(1, 2) AS ROW(x INT, y INT)), CAST(ROW(0, 11) AS ROW(x INT, y INT)), CAST(ROW(5, 10) AS ROW(x INT, y INT))], 2, (a, b) -> IF(a.x*a.y < b.x*b.y, -1, IF(a.x*a.y = b.x*b.y, 0, 1))); -- [ROW(5, 10), ROW(1, 2)]
- array_transpose(array(array(T))) -> array(array(T))¶
Returns a transpose of a 2D array (matrix), where rows become columns and columns become rows. Converts
a[x][y]totranspose(a)[y][x]. All rows in the input array must have the same length, otherwise the function will fail with an error. Returns an empty array if the input is empty or if all rows are empty.SELECT array_transpose(ARRAY [ARRAY [1, 2, 3], ARRAY [4, 5, 6]]) -- [[1, 4], [2, 5], [3, 6]] SELECT array_transpose(ARRAY [ARRAY ['a', 'b'], ARRAY ['c', 'd'], ARRAY ['e', 'f']]) -- [['a', 'c', 'e'], ['b', 'd', 'f']] SELECT array_transpose(ARRAY [ARRAY [1]]) -- [[1]] SELECT array_transpose(ARRAY []) -- []
Map Functions¶
- map_normalize(x(varchar, double)) -> map(varchar, double)¶
Returns the map with the same keys but all non-null values are scaled proportionally so that the sum of values becomes 1. Map entries with null values remain unchanged.
- map_keys_by_top_n_values(x(K, V), n) -> array(K)¶
Returns top
nkeys in the mapxby sorting its values in descending order. If two or more keys have equal values, the higher key takes precedence.nmust be a non-negative integer.:SELECT map_keys_by_top_n_values(map(ARRAY['a', 'b', 'c'], ARRAY[2, 1, 3]), 2) --- ['c', 'a']
- map_key_exists(x(K, V), k) -> boolean()¶
Returns whether the given key exists in the map. Returns
trueif key is present in the input map, returnsfalseif not present.:SELECT map_key_exists(MAP(ARRAY['x','y'], ARRAY[100,200]), 'x'); -- TRUE
- map_top_n(x(K, V), n) -> map(K, V)¶
Truncates map items. Keeps only the top
nelements by value. Keys are used to break ties with the max key being chosen. Both keys and values should be orderable.nmust be a non-negative integer.SELECT map_top_n(map(ARRAY['a', 'b', 'c'], ARRAY[2, 3, 1]), 2) --- {'b' -> 3, 'a' -> 2}
- map_top_n_keys(x(K, V), n) -> array(K)¶
Returns top
nkeys in the mapxby sorting its keys in descending order.nmust be a non-negative integer.For bottom
nkeys, use the function with lambda operator to perform custom sorting.SELECT map_top_n_keys(map(ARRAY['a', 'b', 'c'], ARRAY[3, 2, 1]), 2) --- ['c', 'b']
- map_top_n_keys(x(K, V), n, function(K, K, int)) -> array(K)¶
Returns top
nkeys in the mapxby sorting its keys using the given comparatorfunction. The comparator takes two non-nullable arguments representing two keys of themap. It returns -1, 0, or 1 as the first key is less than, equal to, or greater than the second key. If the comparator function returns other values (includingNULL), the query will fail and raise an error.SELECT map_top_n_keys(map(ARRAY['a', 'b', 'c'], ARRAY[3, 2, 1]), 2, (x, y) -> IF(x < y, -1, IF(x = y, 0, 1))) --- ['c', 'b']
- map_top_n_values(x(K, V), n) -> array(V)¶
Returns top
nvalues in the mapxby sorting its values in descending order.nmust be a non-negative integer.SELECT map_top_n_values(map(ARRAY['a', 'b', 'c'], ARRAY[1, 2, 3]), 2) --- [3, 2]
- map_top_n_values(x(K, V), n, function(V, V, int)) -> array(V)¶
Returns top n values in the map
xbased on the given comparatorfunction. The comparator will take two nullable arguments representing two values of themap. It returns -1, 0, or 1 as the first value is less than, equal to, or greater than the second value. If the comparator function returns other values (includingNULL), the query will fail and raise an error.SELECT map_top_n_values(map(ARRAY['a', 'b', 'c'], ARRAY[1, 2, 3]), 2, (x, y) -> IF(x < y, -1, IF(x = y, 0, 1))) --- [3, 2]
- map_remove_null_values(x(K, V)) -> map(K, V)¶
Removes all the entries where the value is null from the map
x.
- all_keys_match(x(K,V), function(K, boolean)) -> boolean()¶
Returns whether all keys of a map match the given predicate. Returns true if all the keys match the predicate (a special case is when the map is empty); false if one or more keys don’t match; NULL if the predicate function returns NULL for one or more keys and true for all other keys.
SELECT all_keys_match(map(array['a', 'b', 'c'], array[1, 2, 3]), x -> length(x) = 1); -- true
- any_keys_match(x(K,V), function(K, boolean)) -> boolean()¶
Returns whether any keys of a map match the given predicate. Returns true if one or more keys match the predicate; false if none of the keys match (a special case is when the map is empty); NULL if the predicate function returns NULL for one or more keys and false for all other keys.
SELECT any_keys_match(map(array['a', 'b', 'c'], array[1, 2, 3]), x -> x = 'a'); -- true
- any_values_match(x(K,V), function(V, boolean)) -> boolean()¶
Returns whether any values of a map matches the given predicate. Returns true if one or more values match the predicate; false if none of the values match (a special case is when the map is empty); NULL if the predicate function returns NULL for one or more values and false for all other values.
SELECT ANY_VALUES_MATCH(map(ARRAY['a', 'b', 'c'], ARRAY[1, 2, 3]), x -> x = 1); -- true
- no_keys_match(x(K,V), function(K, boolean)) -> boolean()¶
Returns whether no keys of a map match the given predicate. Returns true if none of the keys match the predicate (a special case is when the map is empty); false if one or more keys match; NULL if the predicate function returns NULL for one or more keys and false for all other keys.
SELECT no_keys_match(map(array['a', 'b', 'c'], array[1, 2, 3]), x -> x = 'd'); -- true
- no_values_match(x(K,V), function(V, boolean)) -> boolean()¶
Returns whether no values of a map match the given predicate. Returns true if none of the values match the predicate (a special case is when the map is empty); false if one or more values match; NULL if the predicate function returns NULL for one or more values and false for all other values.
SELECT no_values_match(map(array['a', 'b', 'c'], array[1, 2, 3]), x -> x = 'd'); -- true
- map_int_keys_to_array(map(int, V)) -> array(V)¶
Returns an
arrayof values from themapwith value at indexed by the original keys frommap.SELECT MAP_INT_KEYS_TO_ARRAY(MAP(ARRAY[3, 5, 6, 9], ARRAY['a', 'b', 'c', 'd'])) -> ARRAY[null, null, 'a', null, 'b', 'c', null, null, 'd'] SELECT MAP_INT_KEYS_TO_ARRAY(MAP(ARRAY[3, 5, 6, 9], ARRAY['a', null, 'c', 'd'])) -> ARRAY[null, null, 'a', null, null, 'c', 'd']
- array_to_map_int_keys(array(v)) -> map(int, v)¶
Returns an
mapwith indices of all non-null values from thearrayas keys and element at the specified index as the value.SELECT ARRAY_TO_MAP_INT_KEYS(CAST(ARRAY[3, 5, 6, 9] AS ARRAY<INT>)) -> MAP(ARRAY[1, 2, 3,4], ARRAY[3, 5, 6, 9]) SELECT ARRAY_TO_MAP_INT_KEYS(CAST(ARRAY[3, 5, null, 6, 9] AS ARRAY<INT>)) -> MAP(ARRAY[1, 2, 4, 5], ARRAY[3, 5, 6, 9]) SELECT ARRAY_TO_MAP_INT_KEYS(CAST(ARRAY[3, 5, null, 6, 9, null, null, 1] AS ARRAY<INT>)) -> MAP(ARRAY[1, 2, 4, 5, 8], ARRAY[3, 5, 6, 9, 1])
String Functions¶
- replace_first(string, search, replace) -> varchar()¶
Replaces the first instance of
searchwithreplaceinstring.If
searchis an empty string, it insertsreplaceat the beginning of thestring.
- trail(string, N) -> varchar()¶
Returns the last N characters of the input string.
- key_sampling_percent(varchar) -> double()¶
Generates a double value between 0.0 and 1.0 based on the hash of the given
varchar. This function is useful for deterministic sampling of data.