2.1. File Locations

  • Configuration files: These files are used to configure a Hadoop cluster.

    1. core-site.xml:

      All Hadoop services and clients use this file to locate the NameNode, so it must be copied to each node that either runs a Hadoop service or acts as a client.  The Secondary NameNode uses this file to determine where to store the fsimage and edits log locally (fs.checkpoint.dir) and where the NameNode is located (fs.default.name).  Use the core-site.xml file to isolate communication issues with the NameNode host machine.
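
      For reference, a minimal core-site.xml fragment containing these two properties might look like the following sketch; the host name, port, and local path are placeholders, not values taken from this document:

        <configuration>
          <property>
            <name>fs.default.name</name>
            <value>hdfs://namenodehost.example.com:8020</value>
          </property>
          <property>
            <name>fs.checkpoint.dir</name>
            <value>c:\hdp\data\checkpoint</value>
          </property>
        </configuration>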

    2. hdfs-site.xml:

      HDFS services use this file. Some important properties in this file are listed below:

      • HTTP addresses for the NameNode and Secondary NameNode web interfaces (dfs.http.address and dfs.secondary.http.address)

      • Default block replication factor (dfs.replication)

      • DataNode block storage location (dfs.data.dir)

      • NameNode metadata storage location (dfs.name.dir)

      Use the hdfs-site.xml file to isolate NameNode startup issues. Typically, NameNode startup issues occur when the NameNode fails to load and merge the fsimage and edits log. Ensure that the values for all of the above properties point to valid locations.
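
      A hypothetical hdfs-site.xml fragment with these properties (all values are placeholders to adapt to your cluster):

        <configuration>
          <property>
            <name>dfs.http.address</name>
            <value>namenodehost.example.com:50070</value>
          </property>
          <property>
            <name>dfs.replication</name>
            <value>3</value>
          </property>
          <property>
            <name>dfs.data.dir</name>
            <value>c:\hdp\data\hdfs\dn</value>
          </property>
          <property>
            <name>dfs.name.dir</name>
            <value>c:\hdp\data\hdfs\nn</value>
          </property>
        </configuration>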

    3. datanode.xml:

      DataNode services use the datanode.xml file to specify the maximum and minimum heap size for the DataNode service. To troubleshoot DataNode memory issues, change the -Xmx value to adjust the maximum heap size, then restart the DataNode service on the affected host machine.

    4. namenode.xml:

      NameNode services use the namenode.xml file to specify the maximum and minimum heap size for the NameNode service. To troubleshoot NameNode memory issues, change the -Xmx value to adjust the maximum heap size, then restart the NameNode service on the affected host machine.

    5. secondarynamenode.xml:

      Secondary NameNode services use the secondarynamenode.xml file to specify the maximum and minimum heap size for the Secondary NameNode service. To troubleshoot Secondary NameNode memory issues, change the -Xmx value to adjust the maximum heap size, then restart the Secondary NameNode service on the affected host machine.
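
      The three service definition files (datanode.xml, namenode.xml, and secondarynamenode.xml) share the same structure. The exact layout varies by HDP version, but as a rough, hypothetical sketch, the heap flags appear among the JVM arguments of the service definition:

        <service>
          <id>namenode</id>
          <name>namenode</name>
          <executable>C:\java\jdk\bin\java.exe</executable>
          <!-- -Xms sets the minimum heap and -Xmx the maximum heap;
               raise -Xmx here, then restart the service -->
          <arguments>-Xms512m -Xmx4096m org.apache.hadoop.hdfs.server.namenode.NameNode</arguments>
        </service>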

    6. hadoop-policy.xml:

      Use the hadoop-policy.xml file to configure service-level authorization (ACLs) within Hadoop. The NameNode reads this file. Use it to troubleshoot permission-related issues on the NameNode.
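
      As an example, service-level authorization properties in hadoop-policy.xml follow the pattern below. The property name security.client.protocol.acl is standard; the user and group values shown are placeholders (the default value * grants access to everyone):

        <configuration>
          <property>
            <name>security.client.protocol.acl</name>
            <!-- comma-separated users, a space, then comma-separated groups -->
            <value>hdfs,alice hadoop</value>
          </property>
        </configuration>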

    7. log4j.properties:

      Use the log4j.properties file to modify the purging intervals of the HDFS logs. This file defines logging for all Hadoop services, including the appenders and layouts used for logging.  See the log4j documentation for more details.
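
      As an illustration, the stock Hadoop log4j.properties defines a daily rolling file appender and its layout along these lines (exact contents vary by distribution):

        log4j.rootLogger=INFO,DRFA
        log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
        log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
        log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
        log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
        log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n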

  • Log Files: Each HDFS service writes a set of log files. By default, these are stored in C:\hadoop\logs\hadoop and C:\hdp\hadoop-1.1.0-SNAPSHOT\bin.

    • HDFS .out files: The log files with the .out extension for HDFS services are located in C:\hdp\hadoop-1.1.0-SNAPSHOT\bin and have the following naming convention:

      • datanode.out.log

      • namenode.out.log

      • secondarynamenode.out.log

      These files are created and written to when HDFS services are bootstrapped. Use these files to isolate launch issues with DataNode, NameNode, or Secondary NameNode services.

    • HDFS .wrapper files: The log files with the .wrapper extension are located in C:\hdp\hadoop-1.1.0-SNAPSHOT\bin and have the following file names:

      • datanode.wrapper.log

      • namenode.wrapper.log

      • secondarynamenode.wrapper.log

      These files contain the startup command string used to launch the service, and they also record the process ID on service startup.

    • HDFS .log and .err files:

      The following files are located in C:\hdp\hadoop-1.1.0-SNAPSHOT\bin:

      • datanode.err.log

      • namenode.err.log

      • secondarynamenode.err.log

      The following files are located in C:\hadoop\logs\hadoop:

      • hadoop-datanode-$Hostname.log

      • hadoop-namenode-$Hostname.log

      • hadoop-secondarynamenode-$Hostname.log

      These files contain log messages for the running Java service. If an error occurs while the service is running, the stack trace is logged in these files.

      $Hostname is the host where the service is running. For example, on a node whose hostname is namenodehost.example.com, the file would be saved as hadoop-namenode-namenodehost.example.com.log.

      Note

      By default, these log files are rotated daily. Use the C:\hdp\hadoop-1.1.0-SNAPSHOT\conf\log4j.properties file to change the log rotation interval.
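
      For example, assuming the default DRFA appender is in use, changing its DatePattern controls how often the log rolls; the following hypothetical setting rotates the log hourly instead of daily:

        # roll the log every hour instead of every day
        log4j.appender.DRFA.DatePattern=.yyyy-MM-dd-HH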

    • HDFS .<date> files:

      The log files with the .<date> extension for HDFS services have the following format:

      • hadoop-namenode-$Hostname.log.<date>

      • hadoop-datanode-$Hostname.log.<date>

      • hadoop-secondarynamenode-$Hostname.log.<date>   

      When a .log file is rotated, the current date is appended to the file name. For example: hadoop-datanode-hdp121.localdomain.com.log.2013-02-08.

      Use these files to compare the current state of your cluster with its past state and to identify recurring patterns when troubleshooting.

