Behavioral Changes in Hive

Behavioral changes denote a marked change in behavior from the previously released version to this version of Apache Hive.

Behavioral Changes in Cloudera Runtime 7.3.1.500 SP3

Summary:
Hive Metastore now uses dynamic leader election and automated housekeeping.
Previous behavior:
The Hive Metastore (HMS) previously used a static, host-based method for leader election, as the metastore.housekeeping.leader.election property defaulted to host. Additionally, critical housekeeping and compaction processes were turned off by default.
New behavior:
The HMS now uses a dynamic, lock-based method for leader election, as the metastore.housekeeping.leader.election property now defaults to lock. In addition, housekeeping and compaction processes are now enabled by default, so the Metastore automatically manages these critical operations on a single node for the entire warehouse.
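The idea behind lock-based election can be sketched in a few lines. This is only an illustrative model, not the HMS implementation: the MetastoreInstance class, the instance names, and the in-process threading.Lock standing in for the shared lock are all assumptions made for the example.

```python
import threading


class MetastoreInstance:
    """Hypothetical stand-in for one HMS instance competing for leadership."""

    def __init__(self, name, shared_lock):
        self.name = name
        self._lock = shared_lock  # models a lock visible to every instance
        self.is_leader = False

    def try_become_leader(self):
        # Non-blocking attempt: only one instance can hold the lock at a time.
        if not self.is_leader:
            self.is_leader = self._lock.acquire(blocking=False)
        return self.is_leader

    def resign(self):
        # Releasing the lock lets another instance take over.
        if self.is_leader:
            self._lock.release()
            self.is_leader = False


shared_lock = threading.Lock()
hms_a = MetastoreInstance("hms-a", shared_lock)
hms_b = MetastoreInstance("hms-b", shared_lock)

hms_a.try_become_leader()  # hms-a acquires the lock and becomes leader
hms_b.try_become_leader()  # hms-b fails because hms-a still holds the lock
leaders = [m.name for m in (hms_a, hms_b) if m.is_leader]
```

In this model the leader is whichever instance holds the lock, so leadership can move to another instance when the holder releases it, rather than being tied to a fixed host.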
Summary:
Increased batch sizes for COMPUTE STATS
Previous behavior:
The COMPUTE STATS query previously failed on tables containing more than 5000 columns. This issue was specific to wide tables and could not be resolved by dropping and rerunning the query.
New behavior:
To resolve this, batch retrieval and insertion of object metadata is now enabled by default: the default value of the hive.metastore.direct.sql.batch.size property is changed from 0 to 1000, and the default value of the metastore.rawstore.batch.size property is changed from -1 to 500. After this change, COMPUTE STATS queries run successfully on tables with more than 5000 columns.
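As a rough illustration of what a batch size means here, the following sketch splits the metadata of a wide table into fixed-size batches. It is not Hive code: the in_batches helper, the column names, and the figure of 5200 columns are hypothetical, and only the batch size of 1000 comes from the text above.

```python
def in_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive in this sketch")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical wide table with 5200 columns.
columns = [f"col_{i}" for i in range(5200)]

# With a batch size of 1000, the work is split into several smaller requests
# instead of one request covering every column at once.
batches = list(in_batches(columns, 1000))
batch_lengths = [len(batch) for batch in batches]
```

Each request then touches at most 1000 objects, which is the general reason batching helps with very wide tables.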
Summary:
Removal of the engine.hive.enabled property
Previous behavior:
The engine.hive.enabled property was set to "true" to enable the creation of Hive-enabled Iceberg tables.
New behavior:
The engine.hive.enabled property is removed because Hive is already supported and SQL engines are not required to explicitly specify this property.
Summary:
Change in the way dates are parsed from strings: trailing invalid characters are now ignored
Previous behavior:
Prior to this release, SQL functions or date operations involving invalid dates returned "null".
New behavior:
Now, a valid date is extracted and returned from a string value if the string starts with a valid date. This change partially restores the behavior that existed before HIVE-20007 and makes the handling of trailing invalid characters more consistent.
The following table illustrates the behavior changes before and after the fix:
String value        | Behavior (before HIVE-20007) | Previous behavior (after HIVE-20007) | Current behavior (after HIVE-27586)
2023-08-03_16:02:00 | 2023-08-03                   | null                                 | 2023-08-03
2023-08-03-16:02:00 | 2023-08-03                   | null                                 | 2023-08-03
2023-08-0316:02:00  | 2024-06-11                   | null                                 | 2023-08-03
03-08-2023          | 0009-02-12                   | null                                 | 0003-08-20
2023-08-03 GARBAGE  | 2023-08-03                   | 2023-08-03                           | 2023-08-03
2023-08-03TGARBAGE  | 2023-08-03                   | 2023-08-03                           | 2023-08-03
2023-08-03_GARBAGE  | 2023-08-03                   | null                                 | 2023-08-03

This change affects various Hive SQL functions and operators that accept dates from string values, such as CAST (V AS DATE), CAST (V AS TIMESTAMP), TO_DATE, DATE_ADD, DATE_DIFF, WEEKOFYEAR, DAYOFWEEK, and TRUNC.
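The prefix-based parsing described in this entry can be modeled with a short sketch. This is not Hive's parser: the parse_date_prefix function and the assumed prefix shape (1 to 4 digits for the year, 1 to 2 digits each for month and day, separated by hyphens) are illustrative choices that happen to reproduce the current behavior listed for the sample strings in this entry.

```python
import re
from datetime import date

# Assumed prefix shape for this sketch: year-month-day, digits only.
_DATE_PREFIX = re.compile(r"\s*(\d{1,4})-(\d{1,2})-(\d{1,2})")


def parse_date_prefix(text):
    """Return the date found at the start of text as YYYY-MM-DD, or None."""
    match = _DATE_PREFIX.match(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day).isoformat()
    except ValueError:  # for example month 13 or day 32
        return None


samples = [
    "2023-08-03_16:02:00",
    "2023-08-03-16:02:00",
    "2023-08-0316:02:00",
    "03-08-2023",
    "2023-08-03 GARBAGE",
    "2023-08-03TGARBAGE",
    "2023-08-03_GARBAGE",
]
results = {sample: parse_date_prefix(sample) for sample in samples}
```

Note how 03-08-2023 is read as year 3, month 8, day 20: the sketch consumes at most two digits for the day and ignores everything after it, which is why a day-first string yields a valid but unexpected date instead of null.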

Summary:
Change in the way date and timestamp values are parsed.
Previous behavior:
Some Hive date and timestamp functions, such as unix_timestamp(), from_unixtime(), date_format(), and cast() used the SimpleDateFormat class for printing and parsing date and timestamp objects.
New behavior:
A new configurable hive.datetime.formatter property is introduced that enables you to choose between SimpleDateFormat and DateTimeFormatter for the unix_timestamp, from_unixtime, and date_format SQL functions. The default value is set to 'DATETIME', which selects DateTimeFormatter.
Summary:
Change in value for the hive.map.groupby.sorted property
Previous behavior:
The value of the hive.map.groupby.sorted property was set to 'true'.
New behavior:
The value of the hive.map.groupby.sorted property is changed to 'false' to disable this optimization. The change was introduced to address data correctness issues observed in query results on tables with CLUSTER BY and SORT BY.

Apache Jira: HIVE-27876

Summary:
Disabling join disjunctive predicate pushdown
Previous behavior:
With hive.optimize.join.disjunctive.transitive.predicates.pushdown enabled by default, queries with disjunctive predicates could cause HiveServer2 to crash or run out of memory during compilation.

Apache Jira: HIVE-28310

New behavior:
The hive.optimize.join.disjunctive.transitive.predicates.pushdown setting is now disabled by default, which improves HiveServer2 stability and prevents these crashes and out-of-memory errors. In rare cases, queries with joins and unions become slightly less efficient, but the difference should not be noticeable to end users.
Summary:
Hive CBO fallback strategy configuration
Previous behavior:
The hive.cbo.fallback.strategy property was set to CONSERVATIVE by default. In case of an error during the cost-based optimizer phase, Hive would fall back to the legacy optimizer, potentially reducing optimization efficiency and masking serious or unrecoverable errors.

Apache Jira: HIVE-27831

New behavior:
The default value for hive.cbo.fallback.strategy is now set to NEVER. Hive no longer falls back to the legacy optimizer and cost-based optimizer errors are fatal. Hidden compilation errors will now show up immediately and additional actions are required to compile and execute the query successfully.
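The difference between the two defaults can be sketched as control flow. This is a simplified model built only from the description above, not Hive's compiler: CboError, compile_query, and the two planner callables are hypothetical, and the real CONSERVATIVE strategy may apply additional conditions before falling back.

```python
class CboError(Exception):
    """Hypothetical error raised by the cost-based optimizer step."""


def compile_query(query, strategy, cbo_plan, legacy_plan):
    """Plan a query, falling back to the legacy planner only if allowed."""
    if strategy not in ("NEVER", "CONSERVATIVE"):
        raise ValueError(f"strategy not modeled in this sketch: {strategy}")
    try:
        return cbo_plan(query)
    except CboError:
        if strategy == "NEVER":
            raise  # new default: the optimizer error surfaces immediately
        return legacy_plan(query)  # old default: the error is masked


def failing_cbo(query):
    # Simulates a query that the cost-based optimizer cannot handle.
    raise CboError("cost-based optimization failed")


def legacy(query):
    return f"legacy plan for {query}"


masked = compile_query("q1", "CONSERVATIVE", failing_cbo, legacy)
```

With "NEVER", the same call raises CboError instead of returning a legacy plan, which corresponds to the hidden compilation errors that now show up immediately.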

Behavioral Changes in Cloudera Runtime 7.3.1.400 SP2

There are no behavioral changes in this release.

Behavioral Changes in Cloudera Runtime 7.3.1.300 SP1 CHF 1

There are no behavioral changes in this release.

Behavioral Changes in Cloudera Runtime 7.3.1.200 SP1

There are no behavioral changes in this release.

Behavioral Changes in Cloudera Runtime 7.3.1.100 CHF 1

There are no behavioral changes in this release.

Behavioral Changes in Cloudera Runtime 7.3.1

Summary:
Change in the way compaction initiator and cleaner threads are handled
Previous behavior:
The compaction initiator and cleaner threads were both enabled or disabled by setting the hive.compactor.initiator.on property to 'true' or 'false'.
New behavior:
A new property hive.compactor.cleaner.on is introduced that allows you to selectively enable or disable the cleaner thread.

This property is not listed and is set to 'false' by default. Add the property to Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml in Cloudera Manager to have the same out-of-the-box experience as in the previous version.

Also, ensure that you set the property to 'true' for the compactor to run on the HMS instance.
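As a convenience, the property entry for the safety valve can be generated rather than typed by hand. The sketch below assumes the standard Hadoop-style property layout for hive-site.xml; the hive_site_property helper is hypothetical and simply renders one entry as a string that you could paste into the snippet field.

```python
import xml.etree.ElementTree as ET


def hive_site_property(name, value):
    """Render one hive-site.xml <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# Enable the cleaner thread, matching the behavior of the previous version.
snippet = hive_site_property("hive.compactor.cleaner.on", "true")
print(snippet)
```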