Release 0.292¶
Highlights¶
Improve error handling of INTERVAL DAY, INTERVAL HOUR, and INTERVAL SECOND operators when experiencing overflows. #24353
Improve Presto router UI. #24411
Upgrade bootstrap to version 5. #24167
Add Java and Native Arrow Flight connector. #24427
Add a MySQL-compatible function bit_length that returns the count of bits for the given string. #24531
Add support to build Presto with JDK 17. #24677
Add the ability to canonicalize JSON output through session property canonicalized_json_extract. #24614
Add support for native ORC reader. #23037
Improve task.max-drivers-per-task by setting the default value to use thread concurrency of the host. #24642
Fix a security bug when check_access_control_for_utlized_columns is true for queries that use a WITH clause. Previously we would sometimes not check permissions for certain columns that were used in the query. Now we will always check permissions for all columns used in the query. There are some corner cases for CTEs with the same name where we may check more columns than are used or fall back to checking all columns referenced in the query. #24647
Fix Parquet read failing for nested Decimal types. #24440
Add manifest file caching for deployments which use the Hive metastore. #24481
Add table property write.data.path to specify independent data write paths for Iceberg tables. #24397
Add support for Iceberg table sort orders. Tables can be created with a list of sorted_by columns, which is used to order files written to the table. #21977
Add support for UPDATE SQL statements. #24281
Add configuration property tpcds.use-varchar-type to allow toggling of char columns to varchar columns. #24406
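The overflow handling mentioned above for INTERVAL DAY, INTERVAL HOUR, and INTERVAL SECOND operators can be pictured with a small sketch. It assumes an interval is held as a signed 64-bit count of milliseconds; that representation, the exception class, and the helper names are illustrative assumptions, not Presto's actual implementation.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
MILLIS_PER_DAY = 24 * 60 * 60 * 1000


class IntervalOverflowError(ArithmeticError):
    """Hypothetical error for an interval that does not fit in 64 bits."""


def checked_int64(value: int) -> int:
    # Python integers are unbounded, so the 64-bit range is enforced explicitly.
    if not INT64_MIN <= value <= INT64_MAX:
        raise IntervalOverflowError(f"interval out of range: {value} ms")
    return value


def interval_days_to_millis(days: int) -> int:
    # Hypothetical helper: report the overflow instead of silently wrapping.
    return checked_int64(days * MILLIS_PER_DAY)
```

The point of the sketch is only that an out-of-range result surfaces as an explicit error rather than as a wrapped, incorrect value.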
Details¶
General Changes¶
Fix Hive UUID type parsing. #24538
Fix addition, subtraction, multiplication and division of INTERVAL YEAR MONTH values. #24617
Fix index error when a map column is passed into an unnest function by using the column analyzer to correctly map key and value output fields back to correct input expression. #24789
Fix silently returning incorrect results when trying to construct a TimestampWithTimeZone from a value that has a unix timestamp that is too large/small. #24674
Fix a potential block by making the number of task event loops configurable via a configuration file. #24565
Improve analysis of utilized columns in a query by exploring view definitions and checking the utilized columns of the underlying tables. #24638
Improve error handling of INTERVAL DAY, INTERVAL HOUR, and INTERVAL SECOND operators when experiencing overflows. #24353
Improve scheduling by using long instead of DataSize for critical path. #24582
Improve scheduling by using long instead of DateTime for critical path. #24673
Improve Presto router UI. #24411
Improve how multiple operator stats are merged together. #24414
Improve metrics creation by refactoring local variables to a dedicated class. #24414
Improve efficiency of coordinator when running a large number of tasks, controlled by task.enable-event-loop. #24668
Add Troubleshooting topic to the Presto documentation. #24601
Add Arrow Flight connector. #24427
Add a MySQL-compatible function bit_length that returns the count of bits for the given string. #24531
Add configuration property exclude-invalid-worker-session-properties. #23968
Add documentation for file-based Hive metastore to Deploying Presto. #24620
Add documentation for the Arrow Flight Connector. #24427
Add pagesink for DELETES to support future use. #24528
Add serialization for new types. #24528
Add support to build Presto with JDK 17. #24677
Add a new optimizer rule to add exchanges below a combination of partial aggregation + GroupId. Enabled with the boolean session property enable_forced_exchange_below_group_id. #24047
Add module presto-native-tests to run end-to-end tests with Presto native workers. #24234
Add map of node ID to plan node to QueryCompletedEvent in the event listener interface. #24590
Add support for multiple query event listeners. #24456
Add spark.dynamic-presto-memory-pool-tuning-enabled configuration property to dynamically configure available Spark executor memory based on available container memory. #24714
Add the ability to canonicalize JSON output through session property canonicalized_json_extract. #24614
Add the ability for a file-based Hive metastore to use HDFS/S3 location as warehouse dir. #24660
Remove org.apache.logging.log4j:log4j-api from root POM. #24605
Remove org.apache.logging.log4j:log4j-core from root POM. #24605
Upgrade bootstrap to version 5. #24167
Upgrade jQuery to version 3.7.1. #24167
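As a rough illustration of what the bit_length function listed above returns, the sketch below mirrors MySQL's BIT_LENGTH semantics: the length of the string in bytes multiplied by 8. The UTF-8 encoding is an assumption made for this example, and the Python function is only a stand-in for the SQL function.

```python
def bit_length(value: str) -> int:
    # Assumption: the string is measured in UTF-8 bytes, 8 bits per byte.
    return len(value.encode("utf-8")) * 8
```

Under that assumption a four-character ASCII string yields 32, while a character that needs two bytes in UTF-8 contributes 16 bits.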
Prestissimo (Native Execution) Changes¶
Add a native type manager. #24179
Add support for Apache Arrow Flight connectors. #24504
Add Presto native shared arbitrator configuration properties:
- shared-arbitrator.global-arbitration-abort-time-ratio
- shared-arbitrator.global-arbitration-memory-reclaim-pct
- shared-arbitrator.global-arbitration-without-spill
- shared-arbitrator.memory-pool-abort-capacity-limit
- shared-arbitrator.memory-pool-min-reclaim-bytes
- shared-arbitrator.memory-reclaim-threads-hw-multiplier
Add a type parameter for ConnectorDeleteTableHandle implementations to ConnectorProtocolTemplate, along with support for (de)serialization of connector-specific types. Existing native connector implementations defining ConnectorProtocolTemplate specializations must update their definitions to supply their specific type or use NotImplemented. #24721
Add exchange.http-client.request-data-sizes-max-wait-sec to native system configs. #24774
Add spill-enabled, join-spill-enabled, aggregation-spill-enabled, and order-by-spill-enabled to native system configs. #24726
Add new error code name MEMORY_ARBITRATION_FAILURE under error code INSUFFICIENT_RESOURCE. #24773
Add a native function namespace manager. #23358
Add support for ORC reader. #23037
Add node pool type specification when reporting to the coordinator from a C++ worker. #24569
Improve task.max-drivers-per-task by setting the default value to use thread concurrency of the host. #24642
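The new default for task.max-drivers-per-task can be thought of as "use the configured value if there is one, otherwise ask the host how many hardware threads it has". The sketch below shows that fallback only; the function name is hypothetical, and `os.cpu_count()` stands in for whatever thread-concurrency query the native worker actually performs.

```python
import os
from typing import Optional


def default_max_drivers_per_task(configured: Optional[int] = None) -> int:
    # An explicitly configured value always takes precedence.
    if configured is not None:
        return configured
    # Assumption: fall back to the host's logical CPU count, or 1 if unknown.
    return os.cpu_count() or 1
```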
Security Changes¶
Fix a security bug when check_access_control_for_utlized_columns is true for queries that use a WITH clause. Previously we would sometimes not check permissions for certain columns that were used in the query. Now we will always check permissions for all columns used in the query. There are some corner cases for CTEs with the same name where we may check more columns than are used or fall back to checking all columns referenced in the query. #24647
Remove reload4j dependency in response to WS-2022-0467. #24606
Replace deprecated dagre-d3 with dagre-d3-es in response to a high severity vulnerability WS-2022-0322. #24167
Upgrade libthrift to 0.14.1 in response to CVE-2020-13949. #24462
Upgrade netty dependencies to version 4.1.115.Final in response to CVE-2024-47535. #24586
Upgrade prismJs to 1.30.0 in response to CVE-2024-53382. #24765
Upgrade the errorprone dependency from version 2.28.0 to 2.36.0. #24475
Upgrade the io.grpc library from version 1.68.0 to 1.70.0 in response to CVE-2024-7254, CVE-2020-8908. #24475
Upgrade org.apache.logging.log4j:log4j-api from 2.17.1 to 2.24.3 in response to CVE-2024-47554. #24507
Upgrade org.apache.logging.log4j:log4j-core from 2.17.1 to 2.24.3 in response to CVE-2024-47554. #24507
Upgrade commons-text to 1.13.0 in response to CVE-2024-47554. #24467
Upgrade okhttp to 4.12.0 in response to CVE-2023-3635. #24473
Upgrade okio to 3.6.0 in response to CVE-2023-3635. #24473
Upgrade org.apache.calcite to 1.38.0 in response to CVE-2023-2976. #24706
Upgrade org.apache.ratis to 3.1.3 in response to CVE-2020-15250. #24496
Upgrade aws-java-sdk version to 1.12.782 in response to CVE-2024-21634. #24606
Upgrade json-smart version to 2.5.2 in response to CVE-2024-57699. #24631
Upgrade the accumulo version to 1.10.1 in response to CVE-2020-17533. #24438
Upgrade the hive-dwrf version to 0.8.7 which involved upgrading snappy version to 0.5 in response to CVE-2024-36124. #24461
Elasticsearch Connector Changes¶
Improve cryptographic protocol in response to the finding "Weak SSL/TLS protocols should not be used". #24474
Hive Connector Changes¶
Fix Parquet read failing for nested Decimal types. #24440
Fix getting views for Hive metastore 2.3+. #24466
Add session property hive.stats_based_filter_reorder_disabled for disabling reader stats based filter reordering. #24630
Replace return type of beginDelete. #24528
Rename session property hive.stats_based_filter_reorder_disabled to hive.native_stats_based_filter_reorder_disabled. #24637
Update native HiveConnectorProtocol to supply NotImplemented for ConnectorDeleteTableHandle type. #24721
Iceberg Connector Changes¶
Fix IcebergTableHandle implementation to work with new types used in begin/finishDelete. #24528
Fix bug with missing statistics when the statistics file cache has a partial miss. #24480
Fix Iceberg date column filtering. #24583
Add read.split.target-size table property. #24417
Add target_split_size_bytes session property. #24417
Add a dedicated subclass of FileHiveMetastore for the Iceberg connector to capture and isolate the differences in behavior. #24573
Add connector configuration property iceberg.catalog.hadoop.warehouse.datadir for Hadoop catalog to specify the root data write path for its newly created tables. #24397
Add logic to Iceberg type converter for timestamp with timezone. #23534
Add manifest file caching for deployments which use the Hive metastore. #24481
Add support for the hive.affinity-scheduling-file-section-size configuration property and affinity_scheduling_file_section_size session property. #24598
Add support for renaming tables in the Iceberg connector when configured with the HIVE file catalog. #24312
Add table property write.data.path to specify independent data write paths for Iceberg tables. #24397
Add support for Iceberg table sort orders. Tables can be created with a list of sorted_by columns, which is used to order files written to the table. #21977
Add support for UPDATE SQL statements. #24281
Deprecate some table property names in favor of property names from the Iceberg library. See Iceberg Connector. #24581
Improve Iceberg queries by enabling manifest file caching by default. #24481
Update native IcebergConnectorProtocol to supply NotImplemented for ConnectorDeleteTableHandle type. #24721
Kudu Connector Changes¶
Replace return type of beginDelete. #24528
TPC-DS Connector Changes¶
Add configuration property tpcds.use-varchar-type to allow toggling of char columns to varchar columns. #24406
SPI Changes¶
Fix query failures by setting REMOTE_BUFFER_CLOSE_FAILED as a retriable error. #24808
Add ConnectorSession as an argument to PlanChecker.validate and PlanChecker.validateFragment. #24557
Add DeleteTableHandle support for the ConnectorTableHandles changes in Metadata. #24528
Add CoordinatorPlugin#getExpressionOptimizerFactories to customize expression evaluation in the Presto coordinator. #24144
Add a separate ConnectorDeleteTableHandle interface for ConnectorMetadata.beginDelete and ConnectorMetadata.finishDelete, replacing the previous usage of ConnectorTableHandle. #24528
Add IndexSourceNode to the SPI. #24678
Update beginDelete to return new types, and finishDelete to accept new types in ConnectorMetadata. #24528
Credits¶
Abe Varghese, Amit Dutta, Anant Aneja, Andrii Rosa, Arjun Gupta, Artem Selishchev, Bryan Cutler, Chandrashekhar Kumar Singh, Christian Zentgraf, Deepak Majeti, Denodo Research Labs, Dilli-Babu-Godari, Elbin Pallimalil, Eric Liu, Gary Helmling, Ge Gao, HeidiHan0000, Jalpreet Singh Nanda, Jialiang Tan, Jiaqi Zhang, Joe Giardino, Ke, Kevin Tang, Kevin Wilfong, Krishna Pai, Li Zhou, Mahadevuni Naveen Kumar, Mariam Almesfer, Matt Karrmann, Minhan Cao, Natasha Sehgal, Nicholas Ormrod, Nidhin Varghese, Nikhil Collooru, Nivin C S, Patrick Sullivan, Pradeep Vaka, Pramod Satya, Prashant Sharma, Pratik Joseph Dabre, Rebecca Schlussel, Reetika Agrawal, Richard Barnes, Sagar Sumit, Sayari Mukherjee, Sergey Pershin, Shahad, Shahim Sharafudeen, Shakyan Kushwaha, Shang Ma, Shelton Cai, Steve Burnett, Swapnil, Timothy Meehan, Xiao Du, Xiaoxuan Meng, Yihong Wang, Ying, Yuanda (Yenda) Li, Zac Blanco, Zac Wen, aditi-pandit, ajay-kharat, auden-woolfson, dnskr, inf, jay.narale, librian415, namya28, shenh062326, sumi, vhsu14, wangd, wypb