Release 0.217
General Changes
Fix poor query planning performance for queries submitted when the cluster does not have the minimum required number of workers available.
Fix a bug that caused a union of a table with a VALUES statement to execute on a single machine, which may result in out-of-memory errors.
Improve performance of some queries that use window functions by eliminating redundant shuffles.
Add grouped execution support for window functions.
Add support for the BOOLEAN type for EXPLAIN IO statements.
Add support for choosing distribution type for semi joins based on estimated cost when the session property join_distribution_type='AUTOMATIC' is set.
Add support for collecting table statistics on demand with the ANALYZE statement.
Add a config option (query.stage-count-warning-threshold) to specify a per-query threshold for the number of stages. When this threshold is exceeded, a TOO_MANY_STAGES warning is raised.
Add per-task peak user memory usage to query statistics.
Add CLI support for showing the amount of data spilled during query execution.
Add ST_Points() function.
Remove the system memory pool and related configuration properties (resources.reserved-system-memory, deprecated.legacy-system-pool-enabled) entirely. The system memory pool was deprecated in 0.201 and has been unused by default since that release. All memory allocations will now be served from the general/user memory pool.
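The stage-count threshold in the list above amounts to a simple per-query comparison. The following is a minimal sketch of that idea, not Presto's implementation: the helper name `stage_count_warning`, the message text, the strict "greater than" comparison, and the example threshold of 50 are all assumptions made for illustration.

```python
from typing import Optional


# Hypothetical helper for illustration only; not part of Presto.
def stage_count_warning(stage_count: int, threshold: int) -> Optional[str]:
    """Return a warning string when a query's stage count exceeds the threshold.

    Assumes "exceeded" means strictly greater than the threshold.
    """
    if stage_count > threshold:
        return (
            f"TOO_MANY_STAGES: query has {stage_count} stages, "
            f"threshold is {threshold}"
        )
    return None


if __name__ == "__main__":
    # 50 is an arbitrary example value, not a documented default.
    for count in (12, 50, 73):
        print(count, stage_count_warning(count, threshold=50))
```

Under these assumptions, only the query with 73 stages produces a warning; a query that sits exactly at the threshold does not.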
Web UI
Improve live plan to show data transfer statistics as edge labels.
Improve live plan to contain more information about each node. For example, table scan nodes now show the name of the tables that are being scanned.
Add the ability to zoom when viewing plans for completed queries.
Add support for showing the amount of data spilled during query execution.
Hive Connector Changes
Fix an issue where a partially successful rollback of a write could cause data loss and corrupt the metastore. The hive.skip-target-cleanup-on-rollback configuration property can be used to skip deleting target directories when partition creation is rolled back.
Fix an issue where creating a table on S3 could fail for S3 prefixes without any associated objects (e.g., empty S3 directories).
Add support for the ANALYZE statement in the Hive connector. The partitions to collect statistics for can be specified as a list using the WITH properties of the ANALYZE statement.
Add configuration property hive.temporary-staging-directory-enabled and session property temporary_staging_directory_enabled to control whether a temporary staging directory should be used for write operations.
Add configuration property hive.temporary-staging-directory-path and session property temporary_staging_directory_path to control the location of the temporary staging directory that is used for write operations. The ${USER} placeholder can be used to select a different location for each user (e.g., /tmp/${USER}).
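As an illustration of the partition list accepted by ANALYZE in the Hive connector, the sketch below assembles the statement text in Python. The helper names and the table `hive.default.sales` are hypothetical. The `partitions` property is assumed to take an array of arrays, with one inner array of partition key values per partition, in the order the partition columns are declared.

```python
from typing import List, Optional


def quote_literal(value: str) -> str:
    """Render a Python string as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


# Hypothetical helper for illustration only; not part of Presto or its clients.
def build_analyze_statement(
    table: str, partitions: Optional[List[List[str]]] = None
) -> str:
    """Build an ANALYZE statement, optionally restricted to some partitions.

    Each entry in `partitions` lists the partition key values for one
    partition, in the order the partition columns are declared on the table.
    """
    statement = f"ANALYZE {table}"
    if partitions:
        rendered = ", ".join(
            "ARRAY[" + ", ".join(quote_literal(v) for v in values) + "]"
            for values in partitions
        )
        statement += f" WITH (partitions = ARRAY[{rendered}])"
    return statement


if __name__ == "__main__":
    # Whole table, then two specific partitions of a hypothetical table.
    print(build_analyze_statement("hive.default.sales"))
    print(build_analyze_statement(
        "hive.default.sales",
        [["1992-01-01"], ["1992-01-02"]],
    ))
```

Omitting the partition list produces a statement that analyzes the whole table, while passing a list narrows statistics collection to the named partitions.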
JDBC Connector Changes
Add support for defining procedures.
Add support for providing table statistics.
SPI Changes
Add new SPIs for the ANALYZE statement. By default, running the ANALYZE statement for a connector that does not implement these SPIs results in an error of type USER_ERROR with error code NOT_SUPPORTED.