Troubleshooting classic cluster registration errors (CCMv2)

If registration of a classic cluster fails, use the following information to identify the cause of the error and fix it before you resume the registration process.

Issues during registration in CDP

Error or issue details: Alert: Registration is pending for a cluster with the same details.

Resolution:
  • A cluster with the same IP Address and Data Center cannot be registered again.

  • Check if you have registered this cluster already. To check this, navigate to the Classic Clusters page and search for your cluster.

  • Check if the IP address is correct.

  • Provide a different Data Center name.

Cluster side issues

Error or issue details: Connection refused, even though systemctl status jumpgate-agent.service shows that the jumpgate agent is running.

Resolution: Make sure that you copied the correct setup files for the cluster. Also check that the port number in CCM_TUNNEL_SERVICE_PORT in cluster_connectivity.conf is the Cloudera Manager port number for a CDH or CDP Private Cloud Base cluster, or the Knox port number for an HDP cluster.

Error or issue details: Test connection failure

Resolution:

Try the following:

  • Check if the jumpgate agent is running:

    systemctl status jumpgate-agent.service

  • If the jumpgate agent is not running, start the agent by running the install script.

  • If the agent is active, check if the port number entered during registration is correct.

  • If the port number is incorrect, delete the registration attempt from the UI, remove all the setup-related files from the cluster node and re-register the cluster with the correct information.

  • If the port number is correct, check if outbound connection to the CDP Control Plane is allowed.
    • Run cat cluster_connectivity.conf and note the value of RELAY_SERVER.
    • Check if the RELAY_SERVER is reachable from the node:
      • On the node, run nslookup <RELAY_SERVER> to check that the RELAY_SERVER host name resolves from the node.

        Or run the following curl command as a connectivity check: curl https://<RELAY_SERVER>:443

        Or run the following telnet command: telnet <RELAY_SERVER> 443
      • If this fails, traffic to the Cloudera network is blocked on your VPC.
      • Check the outbound rules on your VPC to make sure that traffic to the Cloudera network is allowed.
  • If the agent is active, check if there are any Ranger policies set up to deny access to the cluster. If such policies exist on the cluster, modify or set up policies to allow access to the cluster's cdp_default topology.
  • If the tunnel is active, check if the port number entered during registration is correct.
  • If the port number is incorrect, delete the registration attempt from the UI, remove all the setup-related files from the cluster node, and re-register the cluster with the correct information.
  • If the port number is correct, check if outbound connection to the CDP Control Plane (host/port = CCM_HOST/CCM_SSH_PORT) is allowed. Run the following from the directory that contains the SSH setup files:
    • Run cat reverse_tunnel.conf or cat cluster_connectivity.conf and note the following properties: CCM_HOST, CCM_SSH_PORT, CCM_TUNNEL_INITIATOR_ID.
    • Check if the NLB is reachable from the node:
      • On the node, run nslookup <CCM_HOST> to check that the NLB host name resolves from the node.
      • If this fails, traffic to the Cloudera network is blocked on your VPC.
      • Check the outbound rules on your VPC to make sure that traffic to the Cloudera network is allowed.
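
The RELAY_SERVER checks above can be sketched as a small shell helper. This is a minimal sketch, not part of the Cloudera setup files: it assumes that cluster_connectivity.conf stores properties as KEY=VALUE lines (the file format is not described here), and the function names conf_value and check_relay are hypothetical.

```shell
#!/bin/sh
# Print the value of a property from a KEY=VALUE style file.
# ASSUMPTION: the conf file uses KEY=VALUE lines, optionally quoted.
conf_value() {
    # $1 = config file, $2 = property name
    sed -n "s/^[[:space:]]*${2}[[:space:]]*=[[:space:]]*//p" "$1" \
        | head -n 1 | tr -d "\"'"
}

# Check DNS resolution and HTTPS connectivity to RELAY_SERVER on port 443.
# Returns 0 if reachable, 1 if a check fails, 2 if RELAY_SERVER is missing.
check_relay() {
    # $1 = path to cluster_connectivity.conf
    relay=$(conf_value "$1" RELAY_SERVER)
    if [ -z "$relay" ]; then
        echo "RELAY_SERVER not found in $1" >&2
        return 2
    fi
    if ! nslookup "$relay" >/dev/null 2>&1; then
        echo "DNS lookup failed for $relay" >&2
        return 1
    fi
    # curl exits 0 for any HTTP response, including error statuses,
    # so success here only shows that the HTTPS request completed.
    if ! curl --silent --output /dev/null --max-time 10 "https://$relay:443"; then
        echo "curl could not complete an HTTPS request to $relay:443" >&2
        return 1
    fi
    echo "$relay is reachable on port 443"
}
```

For example, run check_relay ./cluster_connectivity.conf on the node, from the directory that contains the setup files. A failure at either step points to the outbound rules on your VPC.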

CCM issues

Classic cluster registration uses CCM, which is enabled by installing the CCM client and secrets on the Cloudera Manager (CM) node.

Error or issue details:

"Unable to identify a backendId for this request. This usually happens if either the backendId is invalid or the agent has not yet connected to the relay server"

OR

"No route found for backend with id <backend-id>. This usually happens if either the backendId is invalid or the agent has not yet connected to the relay server."

Resolution: The CCMv2 Network Load Balancers (NLBs) might be unreachable. To verify, run cat cluster_connectivity.conf, note the value of the RELAY_SERVER field, and make sure that the RELAY_SERVER is reachable on port 443 from the legacy CDH/HDP master nodes.
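
To repeat the port 443 check from several master nodes, a loop over ssh may help. This is a minimal sketch, assuming that you can ssh to each node without a password prompt and that curl is installed there. The function name check_relay_from_nodes and the SSH_CMD override are hypothetical, and the node names in the usage line are placeholders.

```shell
#!/bin/sh
# Run the HTTPS connectivity check on each node and print one line per node.
# SSH_CMD is a hypothetical override (defaults to ssh) that allows a dry run.
# Returns 0 if every node reached the relay server, 1 otherwise.
check_relay_from_nodes() {
    # $1 = RELAY_SERVER value, remaining arguments = node host names
    relay=$1
    shift
    failed=0
    for node in "$@"; do
        # stdin is redirected so ssh cannot consume the caller's input
        if ${SSH_CMD:-ssh} "$node" \
            "curl --silent --output /dev/null --max-time 10 https://$relay:443" \
            </dev/null; then
            echo "$node: reachable"
        else
            echo "$node: NOT reachable"
            failed=1
        fi
    done
    return "$failed"
}
```

Usage: check_relay_from_nodes <RELAY_SERVER> master1.example.com master2.example.com. A node reported as NOT reachable might also indicate an ssh failure, so confirm on that node directly before changing network rules.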

Other issues

Error or issue details: If the Cluster name and Display name are different, a few details are missing from the cluster details page.

Resolution: The Cluster name and Display name must be the same.

If registering your cluster for use with Data Catalog, also check the following:

  • Check that Proxy to Cloudera Manager through Apache Knox is enabled.

  • Make sure that you created the cdp_default topology on the Knox host with the required services and LDAP configuration. This is a prerequisite.