Host Selection for Hive/Impala Replication

If your cluster has Hive clients installed on hosts with limited resources, Hive/Impala replication may run its commands on those hosts, which can degrade replication performance.

To improve performance, you can specify the hosts (a "whitelist") to use during replication so that the lower-resource hosts are not used:
  1. In Cloudera Manager, navigate to the Clusters > HDFS Service > Configuration page.
  2. Locate the Hive Replication Environment Advanced Configuration Snippet (Safety Valve) property.
  3. Add the HOST_WHITELIST property and set it to a comma-separated list of the hostnames to use for Hive/Impala replication. For example:
    HOST_WHITELIST=host-1.mycompany.com,host-2.mycompany.com
  4. Enter a Reason for change, and then click Save Changes to commit the changes.
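
If the list of hosts is long or generated by a script, it can help to assemble the HOST_WHITELIST line programmatically before pasting it into the safety valve. The following is a minimal sketch, not part of Cloudera Manager: the function names build_host_whitelist and is_valid_hostname are hypothetical, and the hostname check, whitespace trimming, and duplicate removal are assumptions made for illustration rather than documented requirements of the property.

```python
import re

# Loose per-label hostname check (letters, digits, hyphens; no leading or
# trailing hyphen). This rule is an assumption for illustration only.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_valid_hostname(name):
    """Return True if every dot-separated label of name looks well formed."""
    if not name or len(name) > 253:
        return False
    return all(_LABEL.match(label) for label in name.split("."))


def build_host_whitelist(hosts):
    """Build a HOST_WHITELIST=... line from an iterable of hostnames.

    Surrounding whitespace is trimmed and duplicates are dropped while the
    original order is kept. Raises ValueError on an empty or malformed entry.
    """
    cleaned = []
    for host in hosts:
        host = host.strip()
        if not is_valid_hostname(host):
            raise ValueError(f"invalid hostname: {host!r}")
        if host not in cleaned:
            cleaned.append(host)
    if not cleaned:
        raise ValueError("at least one hostname is required")
    return "HOST_WHITELIST=" + ",".join(cleaned)


if __name__ == "__main__":
    # Hostnames taken from the example in step 3.
    print(build_host_whitelist(["host-1.mycompany.com", "host-2.mycompany.com"]))
```

The printed line can then be pasted into the Hive Replication Environment Advanced Configuration Snippet (Safety Valve) property described in step 2.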