Known issues and limitations in Cloudera Data Warehouse Public Cloud

This section lists known issues and limitations that you might run into while using the Cloudera Data Warehouse (CDW) service in CDP Public Cloud.

Known Issues in AWS environments for Cloudera Data Warehouse service on public clouds

DWX-4577: For AWS environments only, if the region is "us-east-1" endpoints might return a 503 error
Problem: If the DHCP option set defines the domain name as domain-name=us-east-1.compute.internal and the endpoints (Grafana dashboards, DAS, Hue, JDBC, and so on) are returning 503 errors, then it is possible that the node name and the hostname do not match.
Workaround:

Use the kubectl command line tool to get the node name and hostname being used. See Install and Set Up kubectl in the Kubernetes documentation.

Using kubectl, run the following command:

kubectl get nodes -L "kubernetes.io/hostname"

This command returns the node name and host name. For example:

NAME                             STATUS   ROLES             AGE   VERSION               HOSTNAME
ip-192-168-118-73.ec2.internal   Ready    shared-services   66d   v1.15.10-eks-bac369   ip-192-168-118-73.us-east-1.compute.internal

The above example shows that the node name domain is set to ec2.internal while the hostname domain is set to us-east-1.compute.internal. This mismatch can cause problems with starting the kube-proxy, which runs within the Kubernetes cluster and is responsible for forwarding traffic to the Ingress controller. To resolve this issue, use kubectl to change the externalTrafficPolicy of the nginx Ingress controller from Local to Cluster:

kubectl edit svc nginx-service -n cluster
      ..
      spec:
         clusterIP: 10.x.x.x
         externalTrafficPolicy: Local
      ..

Replace Local with Cluster, save it, and then quit the editor.

After making this change, re-open the endpoints to confirm that the issue is resolved.
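
If you prefer a non-interactive change, a kubectl patch along the following lines should also work (a sketch, assuming the same nginx-service name and cluster namespace shown above):

kubectl patch svc nginx-service -n cluster \
  -p '{"spec":{"externalTrafficPolicy":"Cluster"}}'

# Confirm that the policy is now Cluster:
kubectl get svc nginx-service -n cluster \
  -o jsonpath='{.spec.externalTrafficPolicy}'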

DWX-2049: For the private networking feature on AWS environments, only the default DHCP option set created when the VPC is created in AWS is supported
Problem: When the VPC is created in AWS, a default DHCP option set is created. For this default DHCP option set the domain name option is set to <REGION>.compute.internal, where <REGION> is the AWS region where the VPC was created. The Data Warehouse service sets up an nginx ingress controller (the LoadBalancer service) with externalTrafficPolicy set to Local (externalTrafficPolicy=Local) for better performance because it means there is one less network hop.
Workaround: For this feature to work correctly, the domain name in the DHCP option set must not be changed; only the default DHCP option set is supported. Changing the domain name to a custom domain, or having multiple domain names, prevents the kube-proxy from starting correctly. If the Kubernetes network proxy (kube-proxy) does not start correctly, the Amazon ELB (load balancer) has no healthy targets, which causes workload endpoints, such as Data Analytics Studio (DAS), JDBC, or Hue, to return 503 errors. This is a known issue in Kubernetes that has not yet been fixed.
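
To check which DHCP option set your VPC uses and what its domain name is, you can query the AWS CLI (a sketch; the VPC ID and DHCP option set ID are placeholders you must supply):

# Look up the DHCP option set attached to the VPC:
aws ec2 describe-vpcs --vpc-ids <your-vpc-id> \
  --query 'Vpcs[0].DhcpOptionsId' --output text

# Inspect its options; domain-name should be <REGION>.compute.internal:
aws ec2 describe-dhcp-options --dhcp-options-ids <your-dhcp-options-id>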

Known Issues in Azure environments for Cloudera Data Warehouse service on public clouds

DWX-4356: For Azure environments only, enabling Log Analytics with Azure Monitor causes CDW to stall in a pending state
Problem: If the Log Analytics agent (OMS-Agent-for-Linux) is enabled with Azure Monitor, it causes scheduling issues with CDW that in turn cause the Virtual Warehouse to stall in a pending state. For more information about Log Analytics in Azure, see the Azure documentation for Azure Monitor and the description of OMS-Agent-for-Linux in GitHub.
Workaround: Do not enable the Log Analytics agent. If it has been enabled by accident, turn off monitoring, which should remove the agent. For information about how to turn off monitoring, see How to stop monitoring your Azure Kubernetes Service (AKS) with Azure Monitor for containers in the Azure documentation.
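
For reference, the monitoring add-on can be disabled from the Azure CLI along these lines (a sketch; the cluster and resource group names are placeholders):

az aks disable-addons --addons monitoring \
  --name <your-aks-cluster> --resource-group <your-resource-group>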
DWX-3420: For Azure environments, the Show Kubeconfig option does not work
Problem: The option in the more menu used to display the kubeconfig file for Azure environments is grayed out and does not work.

Workaround: Use the following Azure CLI command:
az aks get-credentials -n <your-environment-ID>-dwx-rg-mc -g <your-environment-ID>-dwx-rg

Then use the Kubernetes kubectl config view command to view the kubeconfig file for your AKS cluster.

For example, if your environment ID is env-hsdcq9, the following command fetches the cluster credentials:

az aks get-credentials -n env-hsdcq9-dwx-rg-mc -g env-hsdcq9-dwx-rg

For more information, see Get and verify the configuration information in the Microsoft Azure documentation.
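
After fetching the credentials, a quick way to confirm that kubectl can reach the AKS cluster (a sketch using standard kubectl commands):

# View the merged kubeconfig entry created by az aks get-credentials:
kubectl config view

# Confirm connectivity to the cluster:
kubectl get nodes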

Data Analytics Studio (DAS) in Cloudera Data Warehouse service on public clouds

DWX-4020: Add column functionality via the upload table option does not work.
Problem: You may not be able to add or delete columns or change the table schema after creating a new table using the upload table feature.
Workaround: N/A
DWX-929: DAS UI displays the internal JDBC URL.
Problem: DAS displays the internal JDBC URL on its About page instead of the correct JDBC URL to use to connect to the data warehouse.
Workaround: To copy the correct JDBC URL to use to connect to the data warehouse, go to the Data Warehouse service Overview page, locate the Virtual Warehouse, and then click Copy JDBC URL.
DWX-2592: DAS cannot parse certain characters in strings and comments.
Problem: DAS cannot parse semicolons (;) and double hyphens (--) in strings and comments. For example, if you have a semicolon in a query such as the following, the query might fail:

SELECT * FROM properties WHERE prop_value = "name1;name2";
           
Queries with double hyphens (--) might also fail. For example:

SELECT * FROM test WHERE option = '--name';
             
Workaround: If a semicolon is present in a comment, remove the semicolon or remove the comment entirely before running the query. For example:

SELECT * FROM test; -- SELECT * FROM test;
             
Should be changed to:

SELECT * FROM test; /* comment; comment */
             
In the same manner, remove any double hyphens before running queries to avoid failures in DAS.
Older versions of the Google Chrome browser might cause issues.
Problem: You might experience problems while using faceted search in older versions of the Google Chrome browser.
Workaround: Use the latest version (71.x or later) of Google Chrome.
BUG-94611: Visual Explain for the same query shows different graphs.
Problem: Visual Explain for the same query shows different graphs on the Compose page and the Query Details page.
Workaround: N/A

Database Catalog on public clouds

There are no known issues.

Hive Virtual Warehouses on public clouds

Result caching:
This feature is limited to 10 GB.
Data caching:
This feature is limited to 200 GB per compute node, multiplied by the total number of compute nodes.
DWX-3443: ANALYZE TABLE…COMPUTE STATISTICS fails with NullPointerException on Virtual Warehouse version 7.1.1.0-236
Problem: The ANALYZE TABLE…COMPUTE STATISTICS statement is run to gather statistics on a table for writing to the metastore. For example:
ANALYZE TABLE <table_name> PARTITION(<partition_name>) COMPUTE STATISTICS;
However, if you run this statement against a table in a Hive Virtual Warehouse version 7.1.1.0-236, a NullPointerException (NPE) might be returned.
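
For illustration, such a statement might be submitted through Beeline as follows (the table name, partition specification, and JDBC URL are placeholders, not values from the product documentation):

beeline -u "<your-jdbc-url>" \
  -e "ANALYZE TABLE sales PARTITION(year=2020) COMPUTE STATISTICS;"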

To determine the version of the Virtual Warehouse:

  1. In the Data Warehouse service UI, select Virtual Warehouses in the left navigation menu.
  2. On the Virtual Warehouses page, locate the Virtual Warehouse that is returning the error, and click on its Name.
  3. On the details page for the Virtual Warehouse, the version is listed at the top.

Workaround: Upgrade to a later version of Cloudera Runtime for the Virtual Warehouse:

  1. In the Data Warehouse service UI, select Overview in the left navigation menu.
  2. In the Overview page, click More… in the Environments column to expand it, and search for the environment used by the Virtual Warehouse that is returning the error.

  3. After you locate the environment, click the delete icon in the upper right corner of the environment tile.

    Clicking this icon launches the Action dialog box, but it does not delete the environment.

  4. In the Action dialog box, click OK.

    Clicking OK in the Action dialog box de-activates the environment.

  5. After the environment has been de-activated, an activation icon appears on the tile. Click the activation icon to re-activate the environment.

    When you re-activate the environment, it automatically refreshes the Cloudera Runtime version for the Virtual Warehouse and you should no longer get the NPE error.

DWX-2690: Older versions of Beeline return SSLPeerUnverifiedException when submitting a query

Problem: When submitting queries to Virtual Warehouses that use Hive, older Beeline clients return an SSLPeerUnverifiedException error:

javax.net.ssl.SSLPeerUnverifiedException: Host name 'ec2-18-219-32-183.us-east-2.compute.amazonaws.com' does not
match the certificate subject provided by the peer (CN=*.env-c25dsw.dwx.cloudera.site) (state=08S01,code=0)

Workaround: Only use Beeline clients from CDP Runtime version 7.0.1.0 or later.
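
With a supported Beeline version, the connection should succeed. A sketch of such a connection follows, with placeholder values; the real URL comes from the Copy JDBC URL option on the Virtual Warehouse, and the transportMode, httpPath, and ssl parameters shown here are typical for HiveServer2 over HTTPS rather than confirmed values:

beeline -u "jdbc:hive2://<your-warehouse-host>:443/default;transportMode=http;httpPath=cliservice;ssl=true" \
  -n <your-username> -p <your-password>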

DWX-1952: Cloned Hive Virtual Warehouses do not have query executors or query coordinators
Problem: When you clone an existing Hive Virtual Warehouse, it is created with only HiveServer and Data Analytics Studio (DAS) application container groups (Kubernetes pods). This means that the cloned Virtual Warehouse cannot execute queries.
Workaround:

To manually add query executors and query coordinators to the cloned Hive Virtual Warehouse:

  1. Click the options menu on the cloned Virtual Warehouse, and then select Edit.

  2. In the Virtual Warehouse edit page, change a value, such as the AutoSuspend Timeout setting, and then click Apply.

    This causes the Data Warehouse service to create query executors and query coordinators so you can execute queries on the cloned Virtual Warehouse; a way to verify this is sketched below.
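
If you have kubectl access to the underlying cluster, you can also verify that the new pods came up (a sketch; the namespace is a placeholder, and the exact pod names are not guaranteed):

# List pods in the Virtual Warehouse's namespace and look for query
# coordinator and query executor pods alongside HiveServer and DAS:
kubectl get pods -n <virtual-warehouse-namespace>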

Impala Virtual Warehouses on public clouds

DWX-3914: Collect Diagnostic Bundle option does not work on older environments
Problem: The Collect Diagnostic Bundle menu option in Impala Virtual Warehouses does not work for older environments.

Data caching:
This feature is limited to 200 GB per compute node, multiplied by the total number of compute nodes.
Sessions with Impala continue to run for 15 minutes after the connection is disconnected.
When a connection to Impala is disconnected, the session continues to run for 15 minutes so that the user or client can reconnect to the same session by presenting the session_token. After 15 minutes, the client must re-authenticate to Impala to establish a new connection.