Limiting replication hosts

If your cluster has clients installed on hosts with limited resources, HDFS replication may use these hosts to run replication commands, which can degrade performance. To avoid this, you can restrict HDFS replication to run only on selected DataNodes by specifying a "whitelist" of DataNode hosts.

  1. Click Clusters > HDFS service > Configuration.
  2. Type HDFS Replication in the search box.
  3. Locate the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) property.
  4. Add the HOST_WHITELIST property. Enter a comma-separated list of DataNode hostnames to use for HDFS replication. For example:
    HOST_WHITELIST=host-1.mycompany.com,host-2.mycompany.com
  5. Click Save Changes to commit the changes.
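
If the list of DataNodes is long, the value for step 4 can be generated rather than typed by hand. The following is a minimal sketch, not part of Cloudera Manager: the function name build_host_whitelist is hypothetical, and it assumes the value should contain no spaces or duplicate entries, as in the example in step 4.

```python
def build_host_whitelist(hostnames):
    """Return a HOST_WHITELIST=... line for the given DataNode hostnames.

    Hypothetical helper: it only formats the string to paste into the
    safety valve field and does not contact Cloudera Manager.
    """
    cleaned = []
    for name in hostnames:
        name = name.strip()
        if not name:
            continue  # skip blank entries
        # Assumption: a hostname never contains commas or whitespace.
        if "," in name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid hostname: {name!r}")
        if name not in cleaned:  # drop duplicates, keep original order
            cleaned.append(name)
    if not cleaned:
        raise ValueError("at least one DataNode hostname is required")
    return "HOST_WHITELIST=" + ",".join(cleaned)


if __name__ == "__main__":
    print(build_host_whitelist(["host-1.mycompany.com", "host-2.mycompany.com"]))
```

Paste the printed line into the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) property as described in step 4.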