Impala queries fail

Condition

Impala queries that run with high concurrency on Embedded Container Service (ECS) fail with the following errors: "Invalid or unknown query handle" and "Invalid session id".

Cause

Impala queries might fail because a single ECS server cannot handle the load. To resolve this issue, enable ECS High Availability and increase the number of ECS server replicas. You do this by promoting ECS agents to ECS servers. You must promote only one ECS agent at a time. The following procedure uses an example in which you first promote the ECS agent on agent1.example.com and then promote the ECS agent on agent2.example.com.
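Before promoting an agent, it can help to see how many nodes already act as ECS servers. The following is a minimal sketch, not part of the documented procedure. It assumes that server nodes list control-plane in the ROLES column (third column) of the default kubectl get nodes output and that agents show <none> there. The function name count_servers is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the nodes whose ROLES column mentions "control-plane".
# Reads the default table output of `kubectl get nodes` on stdin.
# Assumption: columns are NAME STATUS ROLES AGE VERSION, so ROLES is field 3.
count_servers() {
    # NR > 1 skips the header row; "n + 0" prints 0 when nothing matched.
    awk 'NR > 1 && $3 ~ /control-plane/ { n++ } END { print n + 0 }'
}
```

On the ECS server host, the function could be fed by the same kubectl invocation used in the procedure: sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig=/etc/rancher/rke2/rke2.yaml get nodes | count_servers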

Solution

  1. Prepare the agent node for promotion by running the following commands on the command line of your ECS server host.
    sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig=/etc/rancher/rke2/rke2.yaml get nodes
    sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig=/etc/rancher/rke2/rke2.yaml drain agent1.example.com --ignore-daemonsets --delete-emptydir-data
  2. In Cloudera Manager, navigate to ECS Cluster > ECS. Stop the ECS Agent running on agent1, and then delete it, by selecting the corresponding options from the Actions for Selected drop-down menu.
  3. In Cloudera Manager, navigate to ECS Cluster > ECS and click Add Role Instances.
  4. In the Add Role Instances to ECS pop-up, add the available host agent1 as an ECS server. Click OK.
  5. Click Continue.
  6. Start the new ECS server from the ECS Instances view. For example, start ECSServer on agent1.
  7. On the command line, uncordon the node by running the following command:
    sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig=/etc/rancher/rke2/rke2.yaml uncordon agent1.example.com
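  8. Confirm the node's status from the web UI or on the command line by running the following command:
    sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig=/etc/rancher/rke2/rke2.yaml get nodes
  9. Repeat steps 1 through 8 for the next agent, for example agent2.example.com. Promote only one ECS agent at a time.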
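The status check on the command line can also be scripted. The following is a minimal sketch, assuming the default table output of kubectl get nodes, where STATUS is the second column and a cordoned node reports Ready,SchedulingDisabled. The function name node_ready is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named node's STATUS is exactly "Ready".
# Reads the default table output of `kubectl get nodes` on stdin.
# A node that is still cordoned shows "Ready,SchedulingDisabled" and fails the check,
# as does a node that is missing from the output.
node_ready() {
    awk -v node="$1" '
        NR > 1 && $1 == node { found = 1; ok = ($2 == "Ready") }
        END { if (found && ok) exit 0; exit 1 }
    '
}
```

For example, after uncordoning agent1.example.com: sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig=/etc/rancher/rke2/rke2.yaml get nodes | node_ready agent1.example.com && echo "agent1 is ready". Waiting for this check to pass before promoting the next agent is one way to keep to the one-agent-at-a-time rule.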