Configuring HDFS properties to optimize log collection

CDP uses “out_webhdfs” Fluentd output plugin to write records into HDFS, in the form of log files, which are then used by different Data Services to generate diagnostic bundles. Over time, these log files can grow in size. To optimize the size of logs that are captured and stored on HDFS, you must update certain HDFS configurations in the hdfs-site.xml file using Cloudera Manager.

  1. Log in to Cloudera Manager as an Administrator.
  2. Go to Clusters > HDFS service > Configuration.
  3. Select the Enable WebHDFS (dfs_webhdfs_enabled) option.
  4. Add the following lines in the HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml field by clicking View as XML to enable append operations:
    <property>
      <name>dfs.support.append</name>
      <value>true</value>
    </property>
    
    <property>
      <name>dfs.support.broken.append</name>
      <value>true</value>
    </property>
  5. Click Save Changes.
  6. Restart the HDFS service.
  7. Restart your CDP cluster.