Release 0.287
Highlights
Details
General Changes
Fix a bug in CTE reference node creation, where different CTEs may be incorrectly considered as the same CTE. #22515
Fix an issue with heuristic CTE materialization strategy where incorrect CTEs were materialized. #22433
Fix plan canonicalization by canonicalizing plan node ids. #22033
Fix bug with spilling in TopNRowNumber. #22281
Fix writing of large varchar values to throw a user error when a value exceeds internal limits. #22063
Fix queries that filter with LIKE '%...%' over char columns. #22076
Fix the regr_count, regr_avgx, regr_avgy, regr_syy, regr_sxx, and regr_sxy functions to return null, not 0, when the input data is null. #22112
Fix precision loss in the timestamp yielded by the from_unixtime(double) function. #21899
Fix CAST(str as INTEGER), CAST(str as BIGINT), CAST(str as SMALLINT), CAST(str as TINYINT) to allow leading and trailing spaces in the string. #22284
Improve latency of materialized CTEs by scheduling multiple dependent subgraphs independently. #22205
Improve propagation of logical properties by enabling it by default. #22266
Improve accuracy and performance of HyperLogLog functions. #21943
Improve repeat function to create RunLengthEncodedBlock to improve performance. #21984
Add a heuristic CTE materialization strategy, which automatically materializes expensive CTEs. This is configurable by setting cte_materialization_strategy to HEURISTIC or HEURISTIC_COMPLEX_QUERIES_ONLY (default NONE). #21720
Add a session property track_history_based_plan_statistics_from_complete_stages_in_failed_query to enable tracking HBO statistics from complete stages in failed queries. #20947
Add session property history_optimization_plan_canonicalize_strategy to specify the plan canonicalization strategies to use for HBO. #21832
Add worker type and query ID information in HBO stats. #22234
Add log of stats equivalent plan and canonicalized plan for HBO. This feature is controlled by session property log_query_plans_used_in_history_based_optimizer. #22306
Add limit to the amount of data written during CTE Materialization. This is configurable by the session property query_max_written_intermediate_bytes (default is 2TB). #22017
Add a new plan canonicalization strategy ignore_scan_constants which canonicalizes predicates for both partitioned and non-partitioned columns in scan node. #21832
Add an optimizer rule to get rid of map cast in map access functions when possible. #22059
Add histogram column statistic to Presto for the optimizer. Connectors can now implement support for them. #21236
Add Quick stats, a mechanism to build stats from metadata for tables and partitions that are missing stats. #21436
Add DDL support for Table constraints (primary key and unique constraints). #20384
Add optimization for query plans which contain RowNumber and TopNRowNumber nodes with empty input. #21914
Add support for Apache DataSketches KLL sketch with the sketch_kll and related family of functions. #21568
Add support for map_key_exists builtin SQL UDF. #21966
Add configuration property legacy_json_cast whose default value is true. See Legacy Compatible Properties. #21869
Add support for tracking of the input data size when there is a fragment result cache hit. This can be enabled by setting the configuration property fragment-result-cache.input-data-stats-enabled=true. #22145
Add JSON as a supported output format in the Presto CLI. #22181
Add documentation for supported data Type mapping in the Iceberg connector. #22093
Add usage documentation for Command Line Interface. #22265
Add usage documentation for Presto Console. #22349
Improve map_normalize builtin SQL UDF to avoid repeated reduce computation. #22211
Remove native_execution_enabled, native_execution_executable_path and native_execution_program_arguments session properties. Corresponding configuration properties are still available. #22183
Remove the configuration property use-legacy-scheduler and the corresponding session property use_legacy_scheduler. The property previously defaulted to true, and the new scheduler, which was intended to replace it eventually, was never productionized and is no longer needed. The configuration property max-stage-retries and the session property max_stage_retries have also been removed. #21952
Upgrade Alluxio to 312. #22452
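Several of the general changes above can be tried directly from a SQL session. The statements below are a minimal sketch that assumes a running 0.287 cluster; the literal values and the sample map are illustrative only and do not come from the release itself.

```sql
-- Opt in to heuristic CTE materialization for the current session
-- (the release notes give NONE as the default).
SET SESSION cte_materialization_strategy = 'HEURISTIC';

-- Casting a string to an integer type now tolerates leading and
-- trailing spaces. The literal ' 42 ' is an illustrative value.
SELECT CAST(' 42 ' AS INTEGER);

-- map_key_exists checks whether a key is present in a map.
-- The map contents here are made up for the example.
SELECT map_key_exists(MAP(ARRAY['a', 'b'], ARRAY[1, 2]), 'a');
```

Session properties set this way last only for the current session, so this is a low-risk way to evaluate the new strategy before changing cluster-wide configuration.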
Security Changes
Remove logback 1.2.3. #21819
Add session property default-view-security-mode to choose the default security mode for view creation. #21956
Verifier Changes
Add support for extended bucket verification of INSERT and CTAS queries. This can be enabled by the configuration property extended-verification to verify each bucket’s data checksum if the inserted table is bucketed. #22001
Add support for extended partition verification of INSERT and CTAS queries. This can be enabled by the configuration property extended-verification to verify each partition’s data checksum if the inserted table is partitioned. #21983
SPI Changes
Add replaceColumn method to com.facebook.common.Page. #22493
Remove SPI method ConnectorMetadata.getTableLayouts() as deprecated. Add ConnectorMetadata.getTableLayoutForConstraint() as replacement. #21933
Move SortNode to SPI module to be utilized in connector. #22497
Hive Connector Changes
Fix a potential wrong results bug when footer stats are marked unreliable and partial aggregation pushdown is enabled. Such queries will now fail with an error. #22011
Improve the hive.orc.use-column-names configuration setting to no longer fail on reading ORC files without column names, but fall back to using Hive’s schema. This change improves compatibility with legacy ORC files. #21391
Add session property hive.dynamic_split_sizes_enabled to use dynamic split sizes based on data selected by query. #22051
Add support for Filelist caching for symlink tables. #19145
Add $row_id as a new hidden column. #22008
Add system procedure system.invalidate_directory_list_cache() to invalidate directory list cache in Hive Catalog. #19821
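The Hive additions above might be exercised as in the following sketch. It assumes the Hive catalog is registered under the name hive; the schema and table names are hypothetical placeholders rather than anything defined by this release.

```sql
-- Invalidate the directory list cache for the Hive catalog.
-- "hive" is an assumed catalog name; substitute your own.
CALL hive.system.invalidate_directory_list_cache();

-- Enable dynamic split sizes for the current session.
SET SESSION hive.dynamic_split_sizes_enabled = true;

-- Read the new hidden column next to the regular columns.
-- example_schema.example_table is a hypothetical table.
SELECT "$row_id", *
FROM hive.example_schema.example_table
LIMIT 10;
```

Hidden columns are not returned by SELECT *, which is why $row_id is named explicitly and double-quoted.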
Iceberg Connector Changes
Upgrade Iceberg from 1.4.3 to 1.5.0. #21961
Fix identity and truncate transforms on DecimalType columns. #21958
Fix a bug where CAST from non-legacy timestamp to date rounded toward the future when the timestamp is earlier than 1970-01-01 00:00:00.000. #21959
Add support for setting the commit.retry.num-retries table property at table creation, which makes the number of retry attempts for concurrent upserts configurable. #21250
Add year/month/day/hour transforms on both legacy and non-legacy TimestampType columns. #21959
Fix error encountered when attempting to execute an INSERT INTO statement where column names contain white spaces. #21827
Add support for row-level deletes on Iceberg V2 tables. The delete mode can be changed from merge-on-read to copy-on-write by setting table property delete_mode. #21571
Add support for Iceberg V1 tables in Prestissimo. #21584
Add support to read Iceberg V2 tables with Position Deletes in Prestissimo. #21980
Add support for Iceberg concurrent insertions. #21250
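As a rough illustration of the row-level delete support listed above, the sketch below creates a table with the delete_mode table property and then deletes by a non-partition predicate. The catalog, schema, table, and column names are hypothetical, and the sketch assumes the catalog creates V2 tables; check the Iceberg connector documentation for the exact set of table properties in your version.

```sql
-- Hypothetical catalog, schema, and table names.
-- delete_mode switches row-level deletes from merge-on-read
-- to copy-on-write, as described in the release notes.
CREATE TABLE iceberg.example_schema.events (
    id BIGINT,
    payload VARCHAR
)
WITH (
    delete_mode = 'copy-on-write'
);

-- A row-level delete that filters on a regular (non-partition) column.
DELETE FROM iceberg.example_schema.events
WHERE id = 1;
```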
MySQL Connector Changes
Add support for timestamp column type. #21937
Credits
8dukongjian, Ajay George, Amit Dutta, Anant Aneja, Andrii Rosa, Athmaja N, Avinash Jain, Bikramjeet Vig, Christian Zentgraf, Deepa George, Deepak Majeti, Eduard Tudenhoefner, Elliotte Rusty Harold, Emanuel F, Fazal Majid, Jalpreet Singh Nanda (:imjalpreet), Jialiang Tan, Jimmy Lu, Jonathan Hehir, Karteekmurthys, Ke, Kevin Wilfong, Konjac Huang, Lyublena Antova, Masha Basmanova, Mohan Dhar, Nikhil Collooru, Pranjal Shankhdhar, Pratik Joseph Dabre, Rebecca Schlussel, Reetika Agrawal, Rohit Jain, Sanika Babtiwale, Sergey Pershin, Sergii Druzkin, Sreeni Viswanadha, Steve Burnett, Sudheesh, Swapnil Tailor, Tai Le Manh, Timothy Meehan, Todd Gao, Vivek, Will, Yihong Wang, Ying, Zac Blanco, Zac Wen, Zhenxiao Luo, aditi-pandit, dnskr, feilong-liu, hainenber, ico01, jaystarshot, kedia,Akanksha, kiersten-stokes, polaris6, pratyakshsharma, s-akhtar-baig, sabbasani, wangd, wypb, xiaodou, xiaoxmeng