Behavioral changes denote a marked change in behavior from the previously released
version to this version of Apache Hive.
Behavioral Changes in Cloudera Runtime 7.3.1.500 SP3
- Summary:
- Hive Metastore now uses dynamic leader election and automated
housekeeping.
- Previous behavior:
- The Hive Metastore (HMS) previously used a static, host-based method
for leader election, as the metastore.housekeeping.leader.election
property defaulted to host. Additionally, critical housekeeping and compaction processes were
turned off by default.
- New behavior:
- The HMS now uses a dynamic, lock-based method for leader election, as
the metastore.housekeeping.leader.election property now defaults to lock.
In addition, housekeeping and compaction processes are now enabled by default, so that
the Metastore automatically manages these critical operations on a single node across the
warehouse.
- Summary:
- Increased batch sizes for
COMPUTE STATS
- Previous behavior:
- The
COMPUTE STATS query previously failed on tables
containing more than 5000 columns. This issue was specific to wide tables and could not be
resolved by dropping and rerunning the query.
- New behavior:
- To resolve this, batch retrieval and insertion of object metadata is
now enabled by default: the default value of the
hive.metastore.direct.sql.batch.size property is changed from 0 to 1000,
and the default value of the metastore.rawstore.batch.size property is
changed from -1 to 500. After this change,
COMPUTE STATS queries run
successfully on tables with more than 5000 columns.
- Summary:
- Removal of the
engine.hive.enabled property
- Previous behavior:
- The
engine.hive.enabled property was set to "true"
to enable the creation of Hive-enabled Iceberg tables.
- New behavior:
- The
engine.hive.enabled property is removed because
Hive is already supported and SQL engines are not required to explicitly specify this
property.
- Summary:
- Change in the way dates are parsed from string by ignoring trailing
invalid characters
- Previous behavior:
- Prior to this release, SQL functions or date operations involving
invalid dates returned "null".
- New behavior:
- Now, a valid date is extracted and returned from a string value if
there is a valid date prefix in the string. This change partially restores the behavior
introduced in HIVE-20007 and makes the handling of trailing invalid characters more
consistent.
The following table illustrates the behavior changes before and after the
fix:
| String value | Behavior (before HIVE-20007) | Previous behavior (after HIVE-20007) | Current behavior (after HIVE-27586) |
| --- | --- | --- | --- |
| 2023-08-03_16:02:00 | 2023-08-03 | null | 2023-08-03 |
| 2023-08-03-16:02:00 | 2023-08-03 | null | 2023-08-03 |
| 2023-08-0316:02:00 | 2024-06-11 | null | 2023-08-03 |
| 03-08-2023 | 0009-02-12 | null | 0003-08-20 |
| 2023-08-03 GARBAGE | 2023-08-03 | 2023-08-03 | 2023-08-03 |
| 2023-08-03TGARBAGE | 2023-08-03 | 2023-08-03 | 2023-08-03 |
| 2023-08-03_GARBAGE | 2023-08-03 | null | 2023-08-03 |
This change affects various Hive SQL functions and operators that accept
dates from string values, such as CAST (V AS DATE), CAST (V AS TIMESTAMP), TO_DATE, DATE_ADD,
DATE_DIFF, WEEKOFYEAR, DAYOFWEEK, and TRUNC.
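As a hedged illustration of the table above, the following HiveQL sketch casts a few of the listed string values to DATE. The results noted in the comments are copied from the "Current behavior" column of the table, not from a test run.

```sql
-- Trailing characters after a valid date prefix are ignored.
SELECT CAST('2023-08-03_16:02:00' AS DATE);  -- table lists 2023-08-03 (previously null)
SELECT CAST('2023-08-03 GARBAGE' AS DATE);   -- table lists 2023-08-03 (unchanged)

-- A day-first string now yields a date, but likely not the intended one.
SELECT CAST('03-08-2023' AS DATE);           -- table lists 0003-08-20 (previously null)
```

Queries that relied on a null result to detect malformed dates may need an explicit format check, because strings such as 03-08-2023 now produce a non-null value.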
- Summary:
- Change in the way date and timestamp values are parsed.
- Previous behavior:
- Some Hive date and timestamp functions, such as
unix_timestamp(), from_unixtime(),
date_format(), and cast() used the
SimpleDateFormat class for printing and parsing date and timestamp
objects.
- New behavior:
- A new configurable
hive.datetime.formatter property
is introduced that enables you to choose between SimpleDateFormat and
DateTimeFormatter for the unix_timestamp,
from_unixtime, and date_format SQL functions. The default
value is set to 'DATETIME'.
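For workloads that depend on the older SimpleDateFormat semantics, the formatter can presumably be switched per session. This is a minimal sketch, assuming that the property accepts the value SIMPLE as the alternative to the DATETIME default (as in upstream Apache Hive) and that your deployment permits setting it at session level. The timestamp literal and patterns are placeholders.

```sql
-- Assumption: SIMPLE selects SimpleDateFormat; DATETIME (the default) selects DateTimeFormatter.
SET hive.datetime.formatter=SIMPLE;

-- Functions named in the text above; compare the output under both settings.
SELECT date_format('2023-08-03 16:02:00', 'yyyy-MM-dd HH:mm');
SELECT unix_timestamp('2023-08-03 16:02:00', 'yyyy-MM-dd HH:mm:ss');

-- Return to the new default.
SET hive.datetime.formatter=DATETIME;
```

Comparing results under both settings before migrating is a reasonable way to find patterns that the two formatter classes interpret differently.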
- Summary:
- Change in value for the
hive.map.groupby.sorted property
- Previous behavior:
- The value of the
hive.map.groupby.sorted property was set to 'true'.
- New behavior:
- The value of the
hive.map.groupby.sorted property is changed to 'false' to disable the optimization. The change was introduced to address data correctness issues observed in query results on tables with CLUSTER BY and SORT BY.
Apache Jira: HIVE-27876
- Summary:
- Disabling join disjunctive predicate pushdown
- Previous behavior:
- With
hive.optimize.join.disjunctive.transitive.predicates.pushdown enabled by default, queries with disjunctive predicates could cause HiveServer2 to crash or run out of memory during compilation.
Apache Jira: HIVE-28310
- New behavior:
- The
hive.optimize.join.disjunctive.transitive.predicates.pushdown setting is now disabled by default, which improves HiveServer2 stability and prevents these crashes and out-of-memory errors. In rare cases, queries with joins and unions might become slightly less efficient, but the difference should not be noticeable to end users.
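If a specific query is known to benefit from the pushdown, one option is to inspect and override the setting for a single session rather than cluster-wide. This is a sketch, assuming that session-level overrides of this property are permitted in your deployment:

```sql
-- Print the effective value for the current session.
SET hive.optimize.join.disjunctive.transitive.predicates.pushdown;

-- Re-enable for this session only. This reintroduces the compile-time
-- memory risk described under "Previous behavior".
SET hive.optimize.join.disjunctive.transitive.predicates.pushdown=true;
```

Keeping the override scoped to a session limits the exposure of HiveServer2 to the out-of-memory condition that motivated the new default.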
- Summary:
- Hive CBO fallback strategy configuration
- Previous behavior:
- The
hive.cbo.fallback.strategy property was set to CONSERVATIVE by default. In case of an error during the cost-based optimizer phase, Hive would fall back to the legacy optimizer, potentially reducing optimization efficiency and masking serious or unrecoverable errors.
Apache Jira: HIVE-27831
- New behavior:
- The default value for
hive.cbo.fallback.strategy is now set to NEVER. Hive no longer falls back to the legacy optimizer, and cost-based optimizer errors are fatal. Previously hidden compilation errors now surface immediately, and additional action is required to compile and execute the affected query successfully.
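The text does not specify the additional action required. One possible temporary workaround, shown here as an assumption rather than a recommended procedure, is to restore the previous fallback strategy for a single session while the underlying query problem is investigated:

```sql
-- Assumption: the property can be overridden at session level.
-- CONSERVATIVE is the previous default named above.
SET hive.cbo.fallback.strategy=CONSERVATIVE;

-- Rerun the query that failed during cost-based optimization here.

-- Restore the new default so that other errors are not masked.
SET hive.cbo.fallback.strategy=NEVER;
```

Because the fallback can mask serious errors, rewriting the failing query is likely the better long-term fix.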
Behavioral Changes in Cloudera Runtime 7.3.1.400 SP2
There are no behavioral changes in this release.
Behavioral Changes in Cloudera Runtime 7.3.1.300 SP1 CHF 1
There are no behavioral changes in this release.
Behavioral Changes in Cloudera Runtime 7.3.1.200 SP1
There are no behavioral changes in this release.
Behavioral Changes in Cloudera Runtime 7.3.1.100 CHF 1
There are no behavioral changes in this release.
Behavioral Changes in Cloudera Runtime 7.3.1
- Summary:
- Change in the way compaction initiator and cleaner threads are
handled
- Previous behavior:
- The compaction initiator and cleaner threads were both enabled or
disabled by setting the
hive.compactor.initiator.on property to 'true' or
'false'.
- New behavior:
- A new property
hive.compactor.cleaner.on is
introduced that allows you to selectively enable or disable the cleaner thread. This
property is not listed and is set to 'false' by default. To get the same
out-of-the-box experience as in the previous version, add the property to Hive Metastore
Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml in Cloudera Manager.
Also, ensure that you set the
property to 'true' for the compactor to run on the HMS instance.