Ports for autoscaling nodes during HBase replication

To support node autoscaling during HBase replication, ensure that ports 2181 and 16020 are open on the destination worker nodes.
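
As a quick way to confirm that the two ports are reachable, a minimal Python sketch such as the following can be run from a source node. It only attempts a TCP connection to each port; the function name and the host name in the usage note are hypothetical and not part of the product.

```python
import socket

# Ports named in this topic: 2181 (ZooKeeper) and 16020 (HBase).
REQUIRED_PORTS = (2181, 16020)


def check_ports(host, ports=REQUIRED_PORTS, timeout=3.0):
    """Return a dict mapping each port to True if a TCP connection succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

For example, check_ports("worker-0.example.internal") returns a dictionary keyed by port number, where the host name is a placeholder for one of your destination worker nodes. A False value means only that the connection attempt failed; it does not say whether a security group rule, routing, or the service itself is the cause.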

To open the required ZooKeeper and HBase ports for autoscaled nodes, use one of the following methods:

  • Choose a security group for your environment and open the ports manually. In this method, you use the security groups that are automatically created for the environment. By default, these security groups have no rules for the ZooKeeper and HBase ports, so you must open the required ports manually after you create a Data Hub.

    After you open the ports, the required security groups are assigned to nodes as they are autoscaled. You perform this step only once, when you create a Data Hub.

  • Predefine a security group with the required ports open, and assign it to the new Data Hub environment. In this method, you create a security group for a VPC with inbound rules that open the required ports, including the ZooKeeper and HBase ports. When you create an environment, you assign the predefined security group to it. If required, you can assign different security groups to the gateway node and the other nodes.

    This method allows you to reuse the predefined security groups in other new Data Hubs. Reuse does not introduce security issues, because nodes in the same security group cannot access each other by default. If required, you can add a separate rule to enforce this restriction explicitly. Sharing one security group for inbound and outbound network access rules is as strict as using a separate security group for each environment, and you do not need to add the extra rules for the ZooKeeper and HBase ports each time you create an environment.
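
The inbound rules for a predefined security group can be sketched as plain data. The example below assumes AWS, which this topic does not specify, and builds a list in the shape that the boto3 EC2 call authorize_security_group_ingress accepts for its IpPermissions parameter. The function name, the rule descriptions, and the source CIDR in the usage note are illustrative assumptions.

```python
def build_ingress_rules(source_cidr, ports=(2181, 16020)):
    """Build one TCP inbound rule per port, restricted to source_cidr."""
    # Descriptions are illustrative labels only.
    descriptions = {2181: "ZooKeeper", 16020: "HBase"}
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [
                {
                    "CidrIp": source_cidr,
                    "Description": descriptions.get(port, "HBase replication"),
                }
            ],
        }
        for port in ports
    ]
```

With boto3 installed and credentials configured, the result of build_ingress_rules("10.10.0.0/16") could be passed as IpPermissions to authorize_security_group_ingress, together with the GroupId of your predefined security group. The CIDR shown is a placeholder for the private range of the source cluster.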

The following use cases illustrate situations in which you might need to autoscale nodes during HBase replication:
  • You replicate HBase data to another CDP account or region in the same cloud provider. In this use case, ensure that the VPC/VNET peering is complete before you open the ports, so that the connection is established over private networks.
  • You replicate HBase data to COD or Data Hub using a direct connection. In this use case, ensure that public IPs and ZooKeeper ports are not open to the internet.
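
One way to check that inbound rules for these ports are not open to the internet is to audit their source CIDRs. The sketch below is a minimal example under the assumption that only RFC 1918 private IPv4 ranges count as private; the function name and the sample CIDRs are hypothetical.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
PRIVATE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def exposed_sources(cidrs):
    """Return the source CIDRs that are not fully inside a private IPv4 range."""
    exposed = []
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr, strict=False)
        # Assumption of this sketch: IPv6 sources are always flagged for review.
        is_private = net.version == 4 and any(
            net.subnet_of(private) for private in PRIVATE_RANGES
        )
        if not is_private:
            exposed.append(cidr)
    return exposed
```

For example, exposed_sources(["10.10.0.0/16", "0.0.0.0/0"]) flags only "0.0.0.0/0". Any CIDR that the function returns is a candidate for tightening before you rely on the rule for replication traffic.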