Specifying hosts to improve Hive replication policy performance

When your cluster has Hive clients installed on hosts with limited resources and the Hive/Impala replication policies use these hosts to run commands for the replication, the replication job performance might degrade. To improve the replication job performance, you can specify the hosts to use during replication so that the lower-resource hosts are not used.

  1. Go to the Cloudera Manager > Clusters > Hive service > Configuration tab.
  2. Locate the Hive Replication Environment Advanced Configuration Snippet (Safety Valve) property.
  3. Add the HOST_WHITELIST property and enter a comma-separated list of hostnames to use for Hive/Impala replication policies.
    For example, HOST_WHITELIST=host-1.mycompany.com,host-2.mycompany.com.
  4. Click Save Changes.