Known Issues in Apache Impala
Learn about the known issues in Impala, their impact on functionality, and the available workarounds.
Known Issues identified in Cloudera Runtime 7.3.1.500 SP3:
- DWX-20490: Impala queries fail with "Caught exception The read operation timed out, type=<class 'socket.timeout'> in ExecuteStatement"
- 7.3.1.500
- DWX-20491: Impala queries fail with EOFException: End of file reached before reading fully
- 7.3.1.500
- CDPD-90807: Thrift protocol limitation during Impala zero downtime upgrade (ZDU)
- 7.3.1.500
Known Issues identified in Cloudera Runtime 7.3.1.400 SP2:
There are no new known issues identified for Impala in this release.
Known Issues identified in Cloudera Runtime 7.3.1.300 SP1 CHF1:
There are no new known issues identified for Impala in this release.
Known Issues identified in Cloudera Runtime 7.3.1.200 SP1:
- CDPD-80166: Ignore CREATE_TABLE events for inaccessible databases to prevent event processor error
- 7.3.1.200
Known Issues identified in Cloudera Runtime 7.3.1.100 CHF1:
There are no new known issues identified for Impala in this release.
Known Issues identified in Cloudera Runtime 7.3.1:
- IMPALA-532: Impala should tolerate bad locale settings
- 7.3.1 and its higher versions
- IMPALA-691: Process mem limit does not account for the JVM's memory usage
- 7.3.1 and its higher versions
- IMPALA-635: Avro Scanner fails to parse some schemas
- 7.3.1 and its higher versions
- IMPALA-1024: Impala BE cannot parse Avro schema that contains a trailing semi-colon
- 7.3.1 and its higher versions
- IMPALA-1652: Incorrect results with basic predicate on CHAR typed column
- 7.3.1 and its higher versions
- IMPALA-1821: Casting scenarios with invalid/inconsistent results
- 7.3.1 and its higher versions
- IMPALA-2005: A failed CTAS does not drop the table if the insert fails
- 7.3.1 and its higher versions
- IMPALA-3509: Breakpad minidumps can be very large when the thread count is high
- 7.3.1 and its higher versions
- IMPALA-4978: Impala requires FQDN from hostname command on Kerberized clusters
- 7.3.1 and its higher versions
- IMPALA-6671: Metadata operations block read-only operations on unrelated tables
- 7.3.1 and its higher versions
- IMPALA-7072: Impala does not support Heimdal Kerberos
- None
- CDPD-28139: Set spark.hadoop.hive.stats.autogather to false by default
- As an Impala user, if you submit a query against a table containing data ingested using Spark and you are concerned about the quality of the query plan, run COMPUTE STATS against such a table after every ETL operation, because the numRows value created by Spark could be incorrect. COMPUTE STATS also produces other statistics, such as the Number of Distinct Values (NDV) and the NULL count, which are needed for good selectivity estimates.
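As a minimal sketch (the table name web_sales is hypothetical), the statistics can be refreshed and then inspected after a Spark ETL run:

```sql
-- Recompute table and column statistics after Spark has written data;
-- this replaces any incorrect numRows value left by the ingestion job.
COMPUTE STATS web_sales;

-- Verify the refreshed statistics that the planner will use.
SHOW TABLE STATS web_sales;
SHOW COLUMN STATS web_sales;
```

For very large tables where a full scan is too expensive, COMPUTE INCREMENTAL STATS on the affected partitions is a lighter-weight alternative.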
- IMPALA-2422: % escaping does not work correctly when occurs at the end in a LIKE clause
- 7.3.1 and its higher versions
- IMPALA-2603: Crash: impala::Coordinator::ValidateCollectionSlots
- A query could encounter a serious error if it includes multiple nested levels of INNER JOIN clauses involving subqueries.
- IMPALA-3094: Incorrect result due to constant evaluation in query with outer join
- 7.3.1 and its higher versions
- CDPD-60862: Rolling restart fails during ZDU when DDL operations are in progress
- During a Zero Downtime Upgrade (ZDU), the rolling restart of services that support Data Definition Language (DDL) statements might fail if DDL operations are in progress during the upgrade. Therefore, ensure that you do not run DDL statements during ZDU.
The following services support DDL statements:
- Impala
- Hive – using HiveQL
- Spark – using SparkSQL
- HBase
- Phoenix
- Kafka
Data Manipulation Language (DML) statements are not impacted and can be used during ZDU. Following the successful upgrade, you can resume running DDL statements.
- CDPD-59625: Impala shell in RHEL 9 with Python 2 as default does not work
- 7.1.9, 7.3.1 and its higher versions
- CDPD-42958: After upgrading from CDH 6.x to 7.1.9, under certain conditions you cannot insert data into a table
- 7.1.8, 7.1.9, 7.3.1 and its higher versions
- Impala cannot update table if the 'external.table.purge' property is not set to true
- Impala cannot update a table using DDL statements if the 'external.table.purge' property is set to FALSE. ALTER TABLE statements return success with no changes to the table.
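As a hedged example (the table name events is hypothetical, and since Impala's own ALTER TABLE may be silently ignored under this issue, the statement may need to be run from Hive), the property can be enabled so that subsequent DDL statements take effect:

```sql
-- Allow DDL operations to modify the table's data and metadata.
ALTER TABLE events SET TBLPROPERTIES ('external.table.purge'='true');

-- Confirm the property now appears under Table Parameters.
DESCRIBE FORMATTED events;
```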
- Impala's known limitation when querying compacted tables
- When the compaction process deletes the files for a table from the underlying HDFS location, the Impala service does not detect the changes because compaction does not allocate new write IDs. When the same table is queried from Impala, it throws a 'File does not exist' exception that looks something like this:
Query Status: Disk I/O error on <node>:22000: Failed to open HDFS file hdfs://nameservice1/warehouse/tablespace/managed/hive/<database>/<table>/xxxxx Error(2): No such file or directory Root cause: RemoteException: File does not exist: /warehouse/tablespace/managed/hive/<database>/<table>/xxxx
- Impala API calls via Knox require configuration if the Knox customized Kerberos principal name is a default service user name
- To access Impala API calls via Knox, if the Knox customized Kerberos principal name is a default service user name, configure "authorized_proxy_user_config" by clicking Clusters > Impala > Configuration. Include the Knox customized Kerberos principal name in the comma-separated list of values as <knox_custom_kerberos_principal_name>=*, where <knox_custom_kerberos_principal_name> is the value of the Kerberos Principal in the Knox service. Select Clusters > Knox > Configuration and search for Kerberos Principal to display this value.
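As an illustration (the principal name knox is an assumption; substitute the actual Kerberos Principal value from the Knox service, and hue=* represents any existing entries that should be preserved), the resulting flag value might look like:

```
authorized_proxy_user_config=hue=*;knox=*
```

Entries for different proxy users are separated by semicolons, and the * wildcard allows that principal to delegate requests on behalf of any user.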
- CDPD-28431: Intermittent errors might occur when the Impala UI is accessed from multiple Knox nodes.
- 7.1.7
- CDPD-21828: Multiple permission assignment through grant is not working
- 7.1.7
- Problem configuring masking on tables using Ranger
- The following Knowledge Base article describes the behavior when you configure masking on tables using Ranger. This configuration works for Hive but breaks queries in some scenarios for Impala.
- IMPALA-11871: INSERT statement does not respect Ranger policies for HDFS
- 7.3.1, 7.3.1.300 and its higher version
- OPSAPS-46641: A single parameter exists in Cloudera Manager for specifying the Impala Daemon Load Balancer. Because BDR and Hue need to use different ports when connecting to the load balancer, it is not possible to configure the load balancer value so that BDR and Hue will work correctly in the same cluster.
- The workaround is to use the load balancer configuration either without a port specification or with the Beeswax port; this configures BDR. To configure Hue, use the "Hue Server Advanced Configuration Snippet (Safety Valve) for impalad_flags" to specify the load balancer address with the HiveServer2 port.
