Replicating directories with thousands of files and subdirectories

Before you replicate directories that contain thousands of files and subdirectories, increase the heap size in the hadoop-env.sh file.

  1. Go to the HDFS service page on the destination Cloudera Manager instance.
  2. Click the Configuration tab.
  3. Expand SCOPE and select the HDFS service name (Service-Wide) option.
  4. Expand CATEGORY and select Advanced.
  5. Locate the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh property.
  6. To increase the heap size, add the key-value pair HADOOP_CLIENT_OPTS=-Xmx<memory_value>. For example, HADOOP_CLIENT_OPTS=-Xmx1g sets the heap size to 1 GB. Adjust this value according to the number of files and directories being replicated.
  7. Enter a Reason for change, and then click Save Changes to commit the changes.
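
The key-value pair from step 6 can be sketched as a small shell helper that checks the memory value before you paste the result into the safety valve. This is only an illustration: the function name heap_opts_entry is hypothetical and not part of Cloudera Manager, and the check is deliberately narrow, accepting only digits optionally followed by k, m, or g.

```shell
# Hypothetical helper (not part of Cloudera Manager or Hadoop): prints the
# safety-valve entry for a given heap size such as 512m, 1g, or 4g.
heap_opts_entry() {
  memory_value="$1"
  # Drop one trailing unit suffix (k, m, or g in either case), if present.
  digits="${memory_value%[kKmMgG]}"
  # What remains must be a non-empty string of digits.
  case "$digits" in
    ''|*[!0-9]*)
      echo "invalid heap size: $memory_value" >&2
      return 1
      ;;
  esac
  printf 'HADOOP_CLIENT_OPTS=-Xmx%s\n' "$memory_value"
}

heap_opts_entry 1g   # prints HADOOP_CLIENT_OPTS=-Xmx1g
```

The printed line is what goes into the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh property. The helper does not suggest a heap size; choosing one still depends on the number of files and directories being replicated.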