Release 0.294

Highlights

  • Improve query resource usage by enabling subfield pushdown for map_filter() when selected keys are constants. #25451

  • Improve query resource usage by enabling subfield pushdown for map_subset() when the input array is a constant array. #25394

  • Improve the efficiency of queries that involve the serialization operator by processing data in large groups instead of one by one. #25569

  • Improve efficiency of queries with distinct aggregation and semi joins. #25238

  • Add changes to populate data source metadata to support combined lineage tracking. #25127

  • Add mixed case support for schema and table names. #24551

  • Add case-sensitive support for column names. It can be enabled for JDBC-based connectors by setting case-sensitive-name-matching=true at the catalog level. #24983

  • Update the presto-plan-checker-router-plugin to use EXPLAIN (TYPE VALIDATE) in place of EXPLAIN (TYPE DISTRIBUTED), enabling faster routing of queries to either native or Java clusters. #25545

Details

General Changes

  • Fix filter pushdown to enable subfield pushdown for maps which are accessed with negative keys. #25445

  • Fix error classification for unsupported array comparison with null elements, classifying it as a user error. #25187

  • Fix UPDATE statements involving multiple identical target column values. #25599

  • Fix inconsistent ordering with offset and limit. #25216

  • Fix precision loss in parse_duration function for large millisecond values. #25538

  • Fix randomize null join optimizer to keep HBO information for join input. #25466

  • Improve and optimize Docker image layers. #25487

  • Improve efficiency of inserts on ORC files. #24913

  • Improve query resource usage by enabling subfield pushdown for map_filter() when selected keys are constants. #25451

  • Improve query resource usage by enabling subfield pushdown for map_subset() when the input array is a constant array. #25394

  • Improve semi join performance for large filtering tables. #25236

  • Improve efficiency of queries with distinct aggregation and semi joins. #25238

  • Improve performance of min_by/max_by aggregations. #25190

  • Add dot_product(array(real), array(real)) -> real() to calculate the sum of the element-wise products of two identically sized vectors represented as arrays. This function supports both array(real) and array(double) input types. For more information, refer to the Dot Product definition. #25508

  • Add broadcast_semi_join_for_delete session property to disable the ReplicateSemiJoinInDelete optimizer. #25256

  • Add history_based_optimizer_estimate_size_using_variables session property to have HBO estimate plan node output size using individual variables. #25400

  • Add changes to populate data source metadata to support combined lineage tracking. #25127

  • Add mixed case support for schema and table names. #24551

  • Add the native_query_memory_reclaimer_priority session property, which controls which queries are killed first when a worker is running low on memory. A higher value means a lower priority, consistent with the Velox memory reclaimer’s convention. See Presto C++ Session Properties. #25325

  • Add an xxhash64 overload that takes a seed argument. #25521

  • Add the l2_squared(array(real), array(real)) -> real() function to Java workers. #25409

  • Update QueryPlanner to only include the optional $row_id column in DELETE query output variables when it is actually used by the connector. #25284

  • Update the default value of check_access_control_on_utilized_columns_only session property to true. The false value makes the access check apply to all columns. See check_access_control_on_utilized_columns_only. #25469
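
The two vector functions added in this release, dot_product and l2_squared, can be illustrated with a minimal Python sketch. This is not the Presto implementation: the Python function names only mirror the SQL names, the reading of l2_squared as the squared Euclidean distance and the error raised for differently sized inputs are assumptions of the sketch, and Python floats are 64-bit whereas Presto's real type is 32-bit.

```python
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of the element-wise products of two identically sized vectors."""
    if len(a) != len(b):
        # Assumption: mismatched sizes are rejected; Presto's exact error is not shown here.
        raise ValueError("vectors must be identically sized")
    return sum(x * y for x, y in zip(a, b))


def l2_squared(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two identically sized vectors (assumed meaning)."""
    if len(a) != len(b):
        raise ValueError("vectors must be identically sized")
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    print(dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 1*4 + 2*5 + 3*6 = 32.0
    print(l2_squared([1.0, 2.0], [4.0, 6.0]))             # 3**2 + 4**2 = 25.0
```

In SQL, the corresponding calls would take array(real) arguments, or array(double) in the case of dot_product.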

Prestissimo (Native Execution) Changes

  • Fix Native Plan Checker for CTAS and Insert queries. #25115

  • Fix native session property manager reading plugin configs from file. #25553

  • Fix PrestoExchangeSource 400 Bad Request by adding the “Host” header. #25272

  • Improve memory usage in the PartitionAndSerialize operator and lower memory usage when serializing a sort key. #25393

  • Improve the efficiency of queries that involve the serialization operator by processing data in large groups instead of one by one. #25569

  • Add geometry type to the list of supported types in NativeTypeManager. #25560

  • Update stats API and Presto UI to report number of drivers and splits separately. #24671

Router Changes

  • Add the Presto Plan Checker Router Scheduler Plugin. #25035

  • Replace the parameters in router schedulers to use RouterRequestInfo to get the URL destination. #25244

  • Update the presto-plan-checker-router-plugin to use EXPLAIN (TYPE VALIDATE) in place of EXPLAIN (TYPE DISTRIBUTED), enabling faster routing of queries to either native or Java clusters. #25545

  • Update router UI to eliminate vulnerabilities. #25206

Security Changes

JDBC Driver Changes

  • Fix issue introduced in #25127 by introducing TableLocationProvider interface to decouple table location logic from JDBC configuration. #25582

  • Improve type mapping API to add WriteMapping functionality. #25437

  • Add the case-sensitive-name-matching catalog property to the JDBC connector to support mixed-case names. #24551

  • Add case-sensitive support for column names. It can be enabled for JDBC-based connectors by setting case-sensitive-name-matching=true at the catalog level. #24983

Arrow Flight Connector Changes

Delta Lake Connector Changes

  • Improve mapping of TIMESTAMP column type by changing it from Presto TIMESTAMP type to TIMESTAMP_WITH_TIME_ZONE. #24418

  • Add support for the TIMESTAMP_NTZ column type, mapped to the Presto TIMESTAMP type. Set legacy_timestamp to false to match the Delta type specification. When it is set to false, TIMESTAMP values are not adjusted based on the local time zone. #24418

Hive Connector Changes

  • Fix an issue while accessing symlink tables. #25307

  • Fix incorrectly ignoring computed table statistics in ANALYZE. #24973

  • Improve split generation and read throughput for symlink tables. #25277

  • Add support for symlink files in Quick Stats. #25250

  • Update default value of hive.copy-on-first-write-configuration-enabled to false. #25420

Iceberg Connector Changes

  • Fix error querying $data_sequence_number metadata column for table with equality deletes. #25293

  • Fix the Remove Orphan Files procedure after deletion operations. #25220

  • Add iceberg.delete-as-join-rewrite-max-delete-columns configuration property and delete_as_join_rewrite_max_delete_columns session property to control when equality delete as join optimization is applied. The optimization is now only applied when the number of equality delete columns is less than or equal to this threshold (default: 400). Set to 0 to disable the optimization. See Iceberg Connector. #25462

  • Add support for $delete_file_path metadata column. #25280

  • Add support for $deleted metadata column. #25280

  • Add support for renaming views in the Iceberg connector when configured with REST and NESSIE. #25202

  • Deprecate iceberg.delete-as-join-rewrite-enabled configuration property and delete_as_join_rewrite_enabled session property. Use iceberg.delete-as-join-rewrite-max-delete-columns instead. #25462
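
The threshold rule described for iceberg.delete-as-join-rewrite-max-delete-columns can be sketched as a small decision function. This is only an illustration of the rule as stated in these notes, not the connector's code: the function and parameter names are hypothetical, and only the default of 400, the "less than or equal" comparison, and the meaning of 0 come from the release note.

```python
# Default threshold as stated in the release note.
DEFAULT_MAX_DELETE_COLUMNS = 400


def should_rewrite_delete_as_join(
    num_equality_delete_columns: int,
    max_delete_columns: int = DEFAULT_MAX_DELETE_COLUMNS,
) -> bool:
    """Hypothetical helper: decide whether the equality-delete-as-join
    optimization applies for a table with the given number of equality
    delete columns."""
    if max_delete_columns == 0:
        # A threshold of 0 disables the optimization entirely.
        return False
    # Applied only when the column count does not exceed the threshold.
    return num_equality_delete_columns <= max_delete_columns


if __name__ == "__main__":
    print(should_rewrite_delete_as_join(3))       # within the default threshold
    print(should_rewrite_delete_as_join(401))     # above the default threshold
    print(should_rewrite_delete_as_join(3, 0))    # optimization disabled
```

Under this reading, setting the property to 0 takes over the role of the deprecated iceberg.delete-as-join-rewrite-enabled=false setting.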

MySQL Connector Changes

  • Add support for mixed-case names in MySQL. It can be enabled by setting case-sensitive-name-matching=true in the catalog configuration. #24551

Redshift Connector Changes

  • Fix Redshift VARBYTE column handling for JDBC driver version 2.1.0.32+ by mapping jdbcType=1111 and jdbcTypeName="binary varying" to Presto’s VARBINARY type. #25488

  • Fix Redshift connector runtime failure due to a missing dependency on com.amazonaws.util.StringUtils. Add aws-java-sdk-core as a runtime dependency to support Redshift JDBC driver (v2.1.0.32) which relies on this class for metadata operations. #25265

SPI Changes

  • Add a function to SPI Constraint class to return the input arguments for the predicate. #25248

  • Add support for UnnestNode in connector optimizers. #25317

Documentation Changes

Credits

Amit Dutta, Anant Aneja, Andrew Xie, Andrii Rosa, Auden Woolfson, Beinan, Chandra Vankayalapati, Chandrashekhar Kumar Singh, Chen Yang, Christian Zentgraf, Deepak Majeti, Denodo Research Labs, Elbin Pallimalil, Emily (Xuetong) Sun, Facebook Community Bot, Feilong Liu, Gary Helmling, Hazmi, HeidiHan0000, Henry Edwin Dikeman, Jalpreet Singh Nanda (:imjalpreet), Joe Abraham, Ke Wang, Ke Wang, Kevin Tang, Li Zhou, Mahadevuni Naveen Kumar, Natasha Sehgal, Nidhin Varghese, Nikhil Collooru, Nishitha-Bhaskaran, Ping Liu, Pradeep Vaka, Pramod Satya, Pratik Joseph Dabre, Raaghav Ravishankar, Rebecca Schlussel, Reetika Agrawal, Sebastiano Peluso, Sergey Pershin, Sergii Druzkin, Shahim Sharafudeen, Shakyan Kushwaha, Shang Ma, Shelton Cai, Shrinidhi Joshi, Soumya Duriseti, Sreeni Viswanadha, Steve Burnett, Thanzeel Hassan, Tim Meehan, Vincent Crabtree, Wei He, XiaoDu, Xiaoxuan, Yihong Wang, Ying, Zac Blanco, Zac Wen, Zhichen Xu, Zhiying Liang, Zoltan Arnold Nagy, aditi-pandit, ajay kharat, duhow, github username, jay.narale, lingbin, martinsander00, mohsaka, namya28, pratyakshsharma, vhsu14, wangd