Known Issues in Apache Hive
This topic describes known issues and workarounds for using Hive in this release of Cloudera Runtime.
- OPSAPS-54299: Installing Hive on Tez and HMS in the incorrect order causes HiveServer failure
- Install Hive on Tez and HMS in the correct order; otherwise, HiveServer fails. Add any additional HiveServer roles to the Hive on Tez service, not to the Hive service; otherwise, HiveServer fails.
- CDPD-23041: DROP TABLE on a table having an index does not work
- If you migrate a Hive table that has an index to CDP, DROP TABLE does not drop the table. Hive no longer supports indexes (HIVE-18448). A foreign key constraint on the indexed table prevents dropping the table. Attempting to drop such a table results in the following error:
java.sql.BatchUpdateException: Cannot delete or update a parent row: a foreign key constraint fails ("hive"."IDXS", CONSTRAINT "IDXS_FK1" FOREIGN KEY ("ORIG_TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
- CDPD-12301: Spark Hive Streaming using HWC fails
- Spark Hive Streaming using HWC can fail with a java.lang.NoSuchMethodException.
- CDPD-6030: External table with custom TDE zone cannot be accessed
- Workaround: Use the default location for external tables, and use TDE for the default location.
- CDPD-5962: A schema evolution that replaces a date type with a string type fails
- Workaround: None
Technical Service Bulletins
- TSB 2021-480/1: Hive produces incorrect query results when skipping a header in a binary file
- In CDP, setting the table property skip.header.line.count to a value greater than 0 in a table stored in a binary format, such as Parquet, can cause incorrect query results. The skip header property is intended for use with Text files and is typically used with CSV files. The issue is not present when you run the query on a Text file that sets the skip header property to 1 or greater.
- Upstream JIRA
- Apache Jira: HIVE-24827
- Knowledge article
- For the latest update on this issue, see the corresponding Knowledge article: TSB-2021 480.1: Hive produces incorrect query results when skipping a header in a binary file
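The condition described in TSB 2021-480/1 can be expressed as a small pre-flight check. The following Python sketch is illustrative only: the function name, the shape of the inputs, and the list of binary formats are assumptions (the bulletin names only Parquet), and the sketch does not query Hive itself.

```python
# Assumption: formats treated as "binary" in this sketch. The bulletin names only
# Parquet; ORC and AVRO are added here for illustration.
BINARY_FORMATS = {"PARQUET", "ORC", "AVRO"}


def header_skip_at_risk(storage_format, tbl_properties):
    """Return True when skip.header.line.count > 0 on a binary-format table.

    Hypothetical helper: storage_format is a format name such as "PARQUET" and
    tbl_properties is a dict of table properties that the caller has already
    collected, for example from SHOW TBLPROPERTIES output.
    """
    raw = tbl_properties.get("skip.header.line.count", "0")
    try:
        count = int(raw)
    except (TypeError, ValueError):
        return False  # unparsable values are treated as "not set" in this sketch
    return count > 0 and storage_format.upper() in BINARY_FORMATS
```

A caller might run this over a list of tables before trusting query results, and review any table for which it returns True.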
- TSB 2021-480/2: Hive ignores the property to skip a header or footer in a compressed file
- In CDP, setting the table properties skip.header.line.count and skip.footer.line.count to a value greater than 0 in a table stored in a compressed format, such as bzip2, can cause incorrect results from SELECT * or SELECT COUNT(*) queries.
- Upstream JIRA
- Apache Jira: HIVE-24224
- Knowledge article
- For the latest update on this issue, see the corresponding Knowledge article: TSB-2021 480.2: Hive ignores the property to skip a header or footer in a compressed file
- TSB 2021-482: Race condition in subdirectory delete/rename causes Hive jobs to fail
- Multiple threads try to perform a rename operation on S3. One of the threads fails to perform the rename, causing an error. Hive logs report "HiveException: Error moving ..." and contain an error line starting with "Exception when loading partition", with all paths listed with s3a:// prefixes.
- Knowledge article
- For the latest update on this issue, see the corresponding Knowledge article: TSB 2021 - 482: Race condition in subdirectory delete/rename causes Hive jobs to fail
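The log symptoms described for TSB 2021-482 can be searched for mechanically. The following Python sketch scans log lines for the two messages quoted in the bulletin and records whether each matching line mentions an s3a:// path. The function name and return shape are hypothetical, and a match is only a hint, not proof of the race condition.

```python
# Markers quoted in the bulletin text above.
MOVE_ERROR = "HiveException: Error moving"
PARTITION_ERROR = "Exception when loading partition"


def find_rename_race_symptoms(log_lines):
    """Return (line_number, marker, mentions_s3a) for each matching log line.

    Hypothetical helper: the name and the tuple layout are not part of Hive.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for marker in (MOVE_ERROR, PARTITION_ERROR):
            if marker in line:
                hits.append((number, marker, "s3a://" in line))
                break  # report each line once
    return hits
```

A caller might pass in the lines of a HiveServer log file and treat any hit whose third field is True as a candidate for this bulletin.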
- TSB 2021-501: JOIN queries return wrong result for join keys with large size in Hive
- JOIN queries return wrong results when performing joins on large keys (larger than 255 bytes). This happens when the fast hash table join algorithm is enabled, which is the default.
- Knowledge article
- For the latest update on this issue, see the corresponding Knowledge article: TSB 2021-501: JOIN queries return wrong result for join keys with large size in Hive
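To get a rough sense of whether sampled join keys cross the 255-byte threshold named in TSB 2021-501, a size check along the following lines may help. This Python sketch measures the UTF-8 encoded length of each key, which is an assumption: Hive's internal key serialization, especially for multi-column keys, may produce different sizes. The function name is hypothetical.

```python
KEY_LIMIT_BYTES = 255  # threshold taken from the bulletin text


def oversized_join_keys(keys, encoding="utf-8"):
    """Return the keys whose encoded size exceeds KEY_LIMIT_BYTES.

    Assumption: size is approximated by the encoded length of the key's string
    form; bytes values are measured as-is.
    """
    oversized = []
    for key in keys:
        data = key if isinstance(key, bytes) else str(key).encode(encoding)
        if len(data) > KEY_LIMIT_BYTES:
            oversized.append(key)
    return oversized
```

Note that multi-byte characters count toward the limit by their encoded size, so a key with fewer than 255 characters can still exceed 255 bytes.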
- TSB 2021-518: Incorrect results returned when joining two tables with different bucketing versions
- Incorrect results are returned when joining two tables with different bucketing versions while both of the following Hive configurations are in effect: set hive.auto.convert.join = false and set mapreduce.job.reduces = any custom value.
- Knowledge article
- For the latest update on this issue, see the corresponding Knowledge article: TSB 2021-518: Incorrect results returned when joining two tables with different bucketing versions
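The combination of conditions in TSB 2021-518 can be written as a single predicate. The following Python sketch makes several assumptions that are not stated in the bulletin: that each table's bucketing version is available under a table property named bucketing_version, that session_conf holds only the settings explicitly overridden in the session, and that hive.auto.convert.join is true when not overridden. The function name is hypothetical.

```python
def bucketing_join_at_risk(left_props, right_props, session_conf):
    """Return True when all three conditions from the bulletin hold.

    Hypothetical helper. left_props and right_props are dicts of table
    properties; session_conf is a dict of explicit session overrides.
    """
    left = left_props.get("bucketing_version")    # assumed property name
    right = right_props.get("bucketing_version")  # assumed property name
    # Tables with no recorded version are treated as unknown and not flagged.
    versions_differ = (
        left is not None and right is not None and str(left) != str(right)
    )
    auto_convert_off = (
        str(session_conf.get("hive.auto.convert.join", "true")).lower() == "false"
    )
    # Any explicit override counts as a "custom value" in this sketch.
    reducers_custom = "mapreduce.job.reduces" in session_conf
    return versions_differ and auto_convert_off and reducers_custom
```

The predicate deliberately returns False when a bucketing version is missing, so a False result does not rule out the issue for tables whose version could not be determined.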
- TSB 2022-526: A Hive query may produce wrong results for some vectorized built-in functions with compound expression in PARTITION BY or ORDER BY clause
- Vectorized functions with PARTITION BY and/or ORDER BY clauses may be broken when the partition or order by expression is compound (example: cast string to integer) and not just a simple column reference. The query may fail or output wrong results, depending on the compound expression. For example:
- Cast integer to string results in query failure with a NullPointerException
- Cast string to integer outputs wrong results
- Knowledge article
- For the latest update on this issue, see the corresponding Knowledge article: TSB 2022-526: A Hive query may produce wrong results for some vectorized built-in functions with compound expression in PARTITION BY or ORDER BY clause
- TSB 2023-627: IN/OR predicate on binary column returns wrong result
- An IN or an OR predicate involving a binary datatype column may produce wrong results. The OR predicate is converted to an IN because of the setting hive.optimize.point.lookup, which is true by default. Only binary data types are affected by this issue. See https://issues.apache.org/jira/browse/HIVE-26235 for example queries which may be affected.
- Upstream JIRA
- Apache Jira: HIVE-26235
- Knowledge article
- For the latest update on this issue, see the corresponding Knowledge article: TSB 2023-627: IN/OR predicate on binary column returns wrong result