Configuring heap size to replicate large directories using replication policies

Before you replicate the data in directories that has thousands of files and subdirectories, increase the heap size in the hadoop-env.sh file.

  1. Go to the destination Cloudera Manager > HDFS service > Configuration tab.
  2. Locate the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh property.
  3. Enter the HADOOP_CLIENT_OPTS=-Xmx[***required_heap_size***] key-value pair.
    For example, if you enter HADOOP_CLIENT_OPTS=-Xmx1g, the heap size is set to 1 GB. Adjust the heap size depending on the number of files and directories being replicated.
  4. Click Save Changes.
  5. Restart the HDFS service.