Known issues in Impala Virtual Warehouses on public clouds

Learn about the known issues related to Impala Virtual Warehouse in Cloudera Data Warehouse service on public clouds, the impact or changes to the functionality, and the workaround.

IMPALA-11039 Impala Virtual Warehouse instability
Problem: Executors can crash during the Parquet scans
Workaround: Disable Parquet late materialization in CDW version 2021.0.6-b96. To the default_query_options startup option, add the following property setting:
PARQUET_LATE_MATERIALIZATION_THRESHOLD=-1
Upgrade to the next CDW version when available.
CDPD-38204 Top-N query failure
Problem: A Top-N operator crashes in the Impala Virtual Warehouse
Workaround: Set the analytic rank pushdown threshold to zero.To the default_query_options startup option, add the following property:
ANALYTIC_RANK_PUSHDOWN_THRESHOLD=0 
Upgrade to the next CDW version when available.
IMPALA-11045 Impala Virtual Warehouses might produce an error when querying transactional (ACID) table even after you enabled the automatic metadata refresh (version DWX 1.1.2-b2008)
Problem: Impala doesn't open a transaction for select queries, so you might get a FileNotFound error after compaction even though you refreshed the metadata automatically.
Workaround: Run the INVALIDATE METADATA statement on the transactional (ACID) table to refresh the metadata. This fixes the problem until the next compaction occurs. For information about running this statement, see INVALIDATE METADATA statement.
Impala Virtual Warehouses might produce an error when querying transactional (ACID) tables (DWX 1.1.2-b1949 or earlier)
Problem: If you are querying transactional (ACID) tables with an Impala Virtual Warehouse and compaction is run on the compacting Hive Virtual Warehouse, the query might fail. The compacting process deletes files and the Impala Virtual Warehouse might not be aware of the deletion. Then when the Impala Virtual Warehouse attempts to read the deleted file, an error can occur. This situation occurs randomly.
Workaround: Run the INVALIDATE METADATA statement on the transactional (ACID) table to refresh the metadata. This fixes the problem until the next compaction occurs. For information about running this statement, see INVALIDATE METADATA statement.
Do not use the start/stop icons in Impala Virtual Warehouses version 7.2.2.0-106 or earlier
Problem: If you use the stop/start icons in Impala Virtual Warehouses version 7.2.2.0-106 or earlier, it might render the Virtual Warehouse unusable and make it necessary for you to re-create it.
Workaround: Do not use the stop/start icons in these older Virtual Warehouses. Instead, these older versions automatically suspend and resume the Impala executors depending on the absence or presence of queries, making manual start or stop unnecessary.
DWX-6674: Hue connection fails on cloned Impala Virtual Warehouses after upgrading
Problem: If you clone an Impala Virtual Warehouse from a recently upgraded Impala Virtual Warehouse, and then try to connect to Hue, the connection fails.
Workaround: Create a new Impala Virtual Warehouse and do not clone from a recently upgraded warehouse. Then the connection to Hue from the new Impala Virtual Warehouse succeeds.
DWX-5841: Virtual Warehouse endpoints are now restricted to TLS 1.2
Problem: TLS 1.0 and 1.1 are no longer considered secure, so now Virtual Warehouse endpoints must be secured with TLS 1.2 or later, and then the environment that the Virtual Warehouse uses must be reactivated in CDW. This includes both Hive and Impala Virtual Warehouses. To reactivate the environment in the CDW UI:
  1. Deactivate the environment. See Deactivating AWS environments or Deactivating Azure environments.
  2. Activate the environment. See Activating AWS environments or Activating Azure environments
Workaround: If environment reactivation is not possible, you can perform manual steps using the kubectl command line tool to pick up the TLS 1.2 endpoint change. Open a terminal window on a system where the kubectl command line tool is installed, log in, and run the following commands:
kubectl edit svc nginx-service -n <cluster-name>

# Add the following under the metadata.annotations field
service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: "ELBSecurityPolicy-TLS-1-2-2017-01"
      
# Save and quit the editor, and then run the following command to check your changes.
      
kubectl get svc nginx-service -n <cluster-name> -o yaml
      
# Make sure that the annotation you added is present.
DWX-5276: Upgrading an older version of an Impala Virtual Warehouse can result in error state
Problem: If you upgrade an older version of an Impala Virtual Warehouse (DWX 1.1.1.1-4) to the latest version, the Virtual Warehouse can get into an Updating or Error state.
Workaround: none
DWX-3914: Collect Diagnostic Bundle option does not work on older environments
The Collect Diagnostic Bundle menu option in Impala Virtual Warehouses does not work for older environments:


Data caching:
This feature is limited to 200 GB per executor, multiplied by the total number of executors.
Sessions with Impala continue to run for 15 minutes after the connection is disconnected.
When a connection to Impala is disconnected, the session continues to run for 15 minutes in case so the user or client can reconnect to the same session again by presenting the session_token. After 15 minutes, the client must re-authenticate to Impala to establish a new connection.