Release 0.288
Warning
The tarball package presto-server-0.288.tar.gz does not include the Presto Console. This has been fixed in the patch version 0.288.1. #23327
Highlights
Improve handling of floating point numbers in Presto to consistently treat NaN as larger than any other number and equal to itself. Positive and negative zero are now always considered equal to each other. For more information, see RFC-0001-nan-definition.md. The new NaN behavior can be disabled by setting the configuration property use-new-nan-definition to false. This configuration property is temporary, intended to ease migration in the short term, and will be removed in a future release. #22386
Add procedure expire_snapshots to remove old snapshots in Iceberg. #22609
Add support for Iceberg REST catalog. #22417
Add support for NOT NULL column constraints in the CREATE TABLE and ALTER TABLE statements. Currently this takes effect only for the Hive connector. #22064
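The NaN and signed-zero rules in the first highlight can be illustrated with a small comparator. This is only a sketch of the semantics as stated in these notes, written in Python rather than taken from Presto's code; the function name compare_doubles and the sample values are hypothetical.

```python
import math
from functools import cmp_to_key


def compare_doubles(a: float, b: float) -> int:
    """Three-way comparison following the rules stated in the release note.

    Hypothetical helper, not Presto code: NaN equals NaN, NaN is larger
    than every other value (including +Infinity), and +0.0 equals -0.0.
    """
    a_nan, b_nan = math.isnan(a), math.isnan(b)
    if a_nan and b_nan:
        return 0   # NaN is equal to itself
    if a_nan:
        return 1   # NaN is larger than any other number
    if b_nan:
        return -1
    # Ordinary numeric comparison; 0.0 == -0.0 already holds here.
    return (a > b) - (a < b)


# Sample values chosen for illustration only.
values = [3.0, float("nan"), -0.0, float("inf"), 0.0, 1.5]
ordered = sorted(values, key=cmp_to_key(compare_doubles))
# NaN sorts last, after +Infinity.
```

Under these rules, sorting, grouping, and min/max style operations all see a single consistent ordering, which is the kind of consistency the highlight describes.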
Details
General Changes
Fix CAST of REAL values outside of BIGINT range to return an exception with an INVALID_CAST_ARGUMENT error code. Previously they would silently overflow. #22917
Fix HBO to skip tracking of stats for plan nodes affected by dynamic filter pushdown in Presto C++. #22853
Fix a bug where map_top_n() could return wrong results if there is any NaN input. #22386
Fix a bug with array_min/array_max where it would return NaN rather than null when there was both NaN and null input. #22386
Fix an error for some queries using a mix of joins and semi-joins when grouped execution is enabled. #22538
Fix array_join() to not add a trailing delimiter when the last element in the array is NULL. #22652
Fix cast of NaN and Infinity from DOUBLE or REAL to BIGINT, INTEGER, SMALLINT, and TINYINT. It will now return an exception with the INVALID_CAST_ARGUMENT error code. Previously it would return zero. #22917
Fix compilation error for queries with lambda in aggregation function. #22539
Fix incorrect behaviors when defining duplicate field names in RowType and throw exception uniformly. #22618
Fix wrong results for regr_r2(). #22611
Fix the latency regression for queries with large IN clause. #22661
Fix wrong results when queries using materialized CTEs have multiple common filters pushed into the CTE. #22700
Improve EXPLAIN ANALYZE statement to support a format argument with values of <TEXT|JSON>. #22733
Improve README.md and CONTRIBUTING.md. #22918
Improve configuring worker threads relative to core count by setting the task.max-worker-threads configuration property to <multiplier>C. For example, setting the property to 2C configures the worker thread pool to create up to twice as many threads as there are cores available on a machine. #22809
Improve logging for RowExpressionRewriteRuleSet and StatsRecordingPlanOptimizer optimizers to include more information. #22765
Improve session property property-use_broadcast_when_buildsize_small_probeside_unknown to do broadcast join when probe side size is unknown and build side estimation from HBO is small. #22681
Improve the estimation stats recorded during query optimization. #22769
Improve Presto C++ documentation. #22717
Improve error code for cast from DOUBLE or REAL to BIGINT, INTEGER, SMALLINT or TINYINT for out of range values from NUMERIC_VALUE_OUT_OF_RANGE to INVALID_CAST_ARGUMENT. #22917
Improve handling of floating point numbers in Presto to consistently treat NaN as larger than any other number and equal to itself. Positive and negative zero are now always considered equal to each other. For more information, see https://github.com/prestodb/rfcs/blob/main/RFC-0001-nan-definition.md. The new NaN behavior can be disabled by setting the configuration property use-new-nan-definition to false. This configuration property is temporary, intended to ease migration in the short term, and will be removed in a future release. #22386
Improve the performance of reading common table expressions (CTE). #22478
Improve join performance by prefiltering the build side with distinct keys from the probe side. This can be enabled with the join_prefilter_build_side session property. #22667
Add HBO for CTE materialized query. #22606
Add support for CTAS on bucketed (but not partitioned) tables for Presto C++ clusters. #22737
Add support for NOT NULL column constraints in the CREATE TABLE and ALTER TABLE statements. Currently this takes effect only for the Hive connector. #22064
Add Presto C++ Properties Reference documentation. #22885
Add PR number to the release note entry examples in pull_request_template.md. #22665
Add http-server.authentication.allow-forwarded-https configuration property to recognize the X-Forwarded-Proto header. #22492
Add node-scheduler.max-preferred-nodes configuration property to allow changing the number of preferred nodes when soft affinity scheduling is enabled. #22562
Add documentation for noisy_approx_set_sfm_from_index_and_zeros(). #22799
Add documentation for noisy aggregate functions at Noisy Aggregate Functions, including noisy_approx_distinct_sfm() and noisy_approx_set_sfm(). #22715
Add support for memoizing in resource group state info endpoint. This can be enabled by setting cluster-resource-group-state-info-expiration-duration to a non-zero duration. #22764
Add support for non-default keystore and truststore types in the Presto CLI and JDBC. #22556
Add support for querying system.runtime.tasks table in Presto C++ clusters. #21416
Add two system configuration properties to specify the reserved query memory capacity on Presto C++ clusters: query-reserved-memory-gb is the total amount of memory in GB reserved for the queries on a worker node. memory-pool-reserved-capacity is the amount of memory in bytes reserved for each query. #22593
Replace the Presto native stats definition and reporting for the memory allocator, in-memory cache, and SSD cache metrics in the Presto repo with those in the Velox repo. The metric names change from presto_cpp.<metrics_name> to velox.<metrics_name>. #22751
Remove deprecated feature and configuration property deprecated.group-by-uses-equal, which allowed GROUP BY to use equality rather than distinct semantics. #22888
Upgrade CI pipeline to build and publish Presto C++ worker docker image. #22806
Upgrade Alluxio to 313. #22958
Upgrade io.jsonwebtoken artifacts to 0.11.5. #22762
Upgrade fasterxml.jackson artifacts to 2.11. #22417
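The <multiplier>C form of task.max-worker-threads described in the list above can be illustrated with a small resolver. This is a sketch under stated assumptions, not Presto's parsing code: the function name resolve_worker_threads is hypothetical, and the acceptance of a decimal multiplier, the truncation to an integer, and the minimum of one thread are assumptions made for the example.

```python
import re


def resolve_worker_threads(value: str, available_cores: int) -> int:
    """Resolve a thread-count setting to a concrete number of threads.

    Hypothetical helper, not Presto code. A value such as "2C" is read
    as a multiplier of the core count; any other value is read as an
    absolute thread count.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)C", value.strip())
    if match:
        multiplier = float(match.group(1))
        # Assumption: truncate to an integer and never go below one thread.
        return max(1, int(multiplier * available_cores))
    return int(value)


# "2C" on a machine with 8 cores allows up to 16 worker threads.
threads_relative = resolve_worker_threads("2C", available_cores=8)
# A plain number is taken as is, regardless of the core count.
threads_absolute = resolve_worker_threads("100", available_cores=8)
```

Expressing the pool size relative to the core count lets one configuration file be shared across machines with different hardware.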
Hive Connector Changes
Fix hash calculation for Timestamp column to be Hive compatible when writing to a table bucketed by Timestamp. #22980
Improve affinity scheduling granularity from a file to a section of a file by adding a hive.affinity-scheduling-file-section-size configuration property and affinity_scheduling_file_section_size session property. The default file section size is 256MB. #22563
Add AWS Security Mapping to allow flexible mapping of Presto Users to AWS Credentials or IAM Roles for different AWS Services. #21622
Add config property hive.legacy-timestamp-bucketing and session property hive.legacy_timestamp_bucketing to use the original hash function for Timestamp column, which is not Hive compatible. #22980
Add support for NOT NULL column constraints in the CREATE TABLE and ALTER TABLE statements for the Hive connector. #22064
Iceberg Connector Changes
Improve the selection of partition specs that must be checked to determine whether the partition supports metadata deletion or thorough predicate pushdown. #22753
Improve time travel TIMESTAMP (SYSTEM_TIME) syntax to include timestamp-with-time-zone data type. #22851
Improve time travel VERSION (SYSTEM_VERSION) syntax to include snapshot id using BIGINT data type. #22851
Add procedure expire_snapshots to remove old snapshots in Iceberg. #22609
Add support for Iceberg REST catalog. #22417
Add time travel BEFORE syntax for Iceberg tables to return historical data. #22851
Add support for metadata delete with predicate on non-identity partition columns when they align with partitioning boundaries. #22554
Remove timestamp with time zone in CREATE, ALTER, and INSERT statements. #22926
Add configuration of Iceberg split manager threads using the iceberg.split-manager-threads configuration property. #22754
Verifier Changes
Add support for function call substitution based on the specified substitution pattern passed by the parameter --function-substitutes. #22783
SPI Changes
Add runtime stats as parameter to ConnectorPageSourceProvider. #22960
Credits
8dukongjian, Abhisek Saikia, Ajay Gupte, Amit Dutta, Andrii Rosa, Beinan Wang, Christian Zentgraf, Deepak Majeti, Denodo Research Labs, Elliotte Rusty Harold, Emanuel F, Fazal Majid, Feilong Liu, Ge Gao, Jalpreet Singh Nanda (:imjalpreet), Jialiang Tan, Jimmy Lu, Jonathan Hehir, Karteekmurthys, Ke, Kevin Wilfong, Konjac Huang, Linsong Wang, Michael Shang, Neerad Somanchi, Nidhin Varghese, Nikhil Collooru, Pranjal Shankhdhar, Rebecca Schlussel, Reetika Agrawal, Rohit Jain, Sean Yeh, Sergey Pershin, Sergii Druzkin, Sreeni Viswanadha, Steve Burnett, Swapnil Tailor, Tishyaa Chaudhry, Vivek, Vivian Hsu, Wills Feng, Yedidya Feldblum, Yihao Zhou, Yihong Wang, Ying Su, Zac Blanco, Zac Wen, abhinavmuk04, aditi-pandit, deepthydavis, jackychen718, jaystarshot, kiersten-stokes, wangd, wypb, xiaoxmeng, ymmarissa