Command Line Upgrade

Configure YARN and MapReduce

After you upgrade Hadoop, complete the following steps to update your configuration files.

Note

The su commands in this section use keywords to represent the Service user. For example, "hdfs" is used to represent the HDFS Service user. If you are using another name for your Service users, you need to substitute your Service user name in each of the su commands.

Important

If you have a secure server, you need Kerberos credentials for hdfs user access.

  1. Ensure that all HDFS directories configured in the yarn-site.xml configuration file (for example, yarn.timeline-service.entity-group-fs-store.active-dir and yarn.timeline-service.entity-group-fs-store.done-dir) exist in HDFS.
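
One way to perform this check is sketched below. It assumes the hdfs command is on the PATH and that the calling user can read the paths (on a secure cluster, after obtaining Kerberos credentials). The function name check_hdfs_dirs is illustrative and not part of HDP.

```shell
# Report every HDFS path in the argument list that is not an existing directory.
# Returns 0 if all directories exist, 1 otherwise.
check_hdfs_dirs() {
  missing=0
  for dir in "$@"; do
    # "hdfs dfs -test -d" exits 0 only if the path exists and is a directory.
    if ! hdfs dfs -test -d "$dir"; then
      echo "missing: $dir"
      missing=1
    fi
  done
  return "$missing"
}
```

Call it with the values you find in yarn-site.xml, for example check_hdfs_dirs /ats/active /ats/done, where both paths are placeholders for your configured directories.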

  2. Upload the MapReduce tarball to HDFS. As the HDFS user, for example 'hdfs':

    su - hdfs -c "hdfs dfs -mkdir -p /hdp/apps/2.5.3.0-<$version>/mapreduce/"

    su - hdfs -c "hdfs dfs -put /usr/hdp/2.5.3.0-<$version>/hadoop/mapreduce.tar.gz /hdp/apps/2.5.3.0-<$version>/mapreduce/"

    su - hdfs -c "hdfs dfs -chown -R hdfs:hadoop /hdp"

    su - hdfs -c "hdfs dfs -chmod -R 555 /hdp/apps/2.5.3.0-<$version>/mapreduce"

    su - hdfs -c "hdfs dfs -chmod -R 444 /hdp/apps/2.5.3.0-<$version>/mapreduce/mapreduce.tar.gz"
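
After the upload, you may want to confirm that the tarball landed in the expected location. The following is a minimal sketch, assuming the hdfs command is available to the current user. The helper names mr_framework_dir and verify_mr_tarball are hypothetical, and the version string you pass must be your full HDP version including the build number.

```shell
# Build the HDFS target directory for the MapReduce tarball from a full
# HDP version string (the build suffix is specific to your installation).
mr_framework_dir() {
  echo "/hdp/apps/$1/mapreduce"
}

# List the uploaded tarball so that its owner, group, and mode can be inspected.
verify_mr_tarball() {
  hdfs dfs -ls "$(mr_framework_dir "$1")/mapreduce.tar.gz"
}
```

For example, verify_mr_tarball 2.5.3.0-37 lists the tarball for a build number of 37, which is only a stand-in here. If the chown and chmod commands above succeeded, the listing should show mode -r--r--r-- with owner hdfs and group hadoop.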

  3. Make the following changes to /etc/hadoop/conf/mapred-site.xml:

    • Add:

      <property>
       <name>mapreduce.application.framework.path</name> 
       <value>/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework</value>
      </property>
                        
      <property>
       <name>yarn.app.mapreduce.am.admin-command-opts</name> 
       <value>-Dhdp.version=${hdp.version}</value>
      </property>                  

      Note

      You do not need to modify ${hdp.version}.

    • Modify the following existing properties to include ${hdp.version}:

      <property>
       <name>mapreduce.admin.user.env</name>
       <value>LD_LIBRARY_PATH=/usr/hdp/${hdp.version}/hadoop/lib/native:/usr/hdp/${hdp.version}/hadoop/lib/native/Linux-amd64-64</value>
      </property>
       
      <property>
       <name>mapreduce.admin.map.child.java.opts</name>
       <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
       <final>true</final>
      </property>
       
      <property>
       <name>mapreduce.admin.reduce.child.java.opts</name>
       <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
       <final>true</final>
      </property>
       
      <property>
       <name>mapreduce.application.classpath</name> 
       <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*,
         $PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*,
         $PWD/mr-framework/hadoop/share/hadoop/common/*,
         $PWD/mr-framework/hadoop/share/hadoop/common/lib/*,
         $PWD/mr-framework/hadoop/share/hadoop/yarn/*,
         $PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*,
         $PWD/mr-framework/hadoop/share/hadoop/hdfs/*,
         $PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*,
        /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar,
        /etc/hadoop/conf/secure</value>
      </property>

      Note

      You do not need to modify ${hdp.version}.

    • Remove the following properties from /etc/hadoop/conf/mapred-site.xml: mapreduce.task.tmp.dir, mapreduce.job.speculative.slownodethreshold (deprecated), and mapreduce.job.speculative.speculativecap (deprecated).
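
To confirm that none of the removed properties remain, a simple text search is usually enough. The sketch below assumes each property name appears as <name>...</name> on a single line with no extra whitespace; the function name find_removed_props is illustrative.

```shell
# Print each property from the removal list that is still present in the
# given mapred-site.xml. Returns 0 if none are found, 1 otherwise.
find_removed_props() {
  file=$1
  found=0
  for prop in \
    mapreduce.task.tmp.dir \
    mapreduce.job.speculative.slownodethreshold \
    mapreduce.job.speculative.speculativecap
  do
    # -F matches the name as a literal string; -q suppresses grep output.
    if grep -qF "<name>$prop</name>" "$file"; then
      echo "still present: $prop"
      found=1
    fi
  done
  return "$found"
}
```

Run it as find_removed_props /etc/hadoop/conf/mapred-site.xml and remove any property it reports.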

  4. Add the following property to /etc/hadoop/conf/yarn-site.xml:

    <property>
     <name>yarn.application.classpath</name> 
     <value>$HADOOP_CONF_DIR,/usr/hdp/${hdp.version}/hadoop-client/*,
      /usr/hdp/${hdp.version}/hadoop-client/lib/*,
      /usr/hdp/${hdp.version}/hadoop-hdfs-client/*,
      /usr/hdp/${hdp.version}/hadoop-hdfs-client/lib/*,
      /usr/hdp/${hdp.version}/hadoop-yarn-client/*,
      /usr/hdp/${hdp.version}/hadoop-yarn-client/lib/*</value>
    </property>
                  
  5. On secure clusters only, add the following properties to /etc/hadoop/conf/yarn-site.xml:

    <property>
     <name>yarn.timeline-service.recovery.enabled</name>
     <value>true</value>
    </property>

    <property>
     <name>yarn.timeline-service.state-store-class</name>
     <value>org.apache.hadoop.yarn.server.timeline.recovery.LeveldbTimelineStateStore</value>
    </property>

    <property>
     <name>yarn.timeline-service.leveldb-state-store.path</name>
     <value><the same as the default of yarn.timeline-service.leveldb-timeline-store.path></value>
    </property>

  6. Modify the following property in /etc/hadoop/conf/yarn-site.xml:

    <property>
     <name>mapreduce.application.classpath</name>
     <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*,
       $PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*,
       $PWD/mr-framework/hadoop/share/hadoop/common/*,
       $PWD/mr-framework/hadoop/share/hadoop/common/lib/*,
       $PWD/mr-framework/hadoop/share/hadoop/yarn/*,
       $PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*,
       $PWD/mr-framework/hadoop/share/hadoop/hdfs/*,
       $PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*,
       $PWD/mr-framework/hadoop/share/hadoop/tools/lib/*,
      /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar,
      /etc/hadoop/conf/secure</value>
    </property>
                  
  7. Make the following change to /etc/hadoop/conf/yarn-env.sh:

    Change

    export HADOOP_YARN_HOME=/usr/lib/hadoop-yarn

    to

    export HADOOP_YARN_HOME=/usr/hdp/current/hadoop-yarn-nodemanager/

  8. Make the following change to /etc/hadoop/conf/yarn-env.sh:

    Change

    export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec

    to

    export HADOOP_LIBEXEC_DIR=/usr/hdp/current/hadoop-client/libexec/
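
The two yarn-env.sh changes can also be applied with sed rather than by hand. This is a sketch only: it assumes the old lines appear exactly as shown in the two yarn-env.sh steps above, with no trailing whitespace, and the function name update_yarn_env is hypothetical. It writes the result to standard output and leaves the input file untouched.

```shell
# Rewrite the HADOOP_YARN_HOME and HADOOP_LIBEXEC_DIR exports of a
# yarn-env.sh file and print the result; the input file is not modified.
# Only lines that exactly match the old values are changed.
update_yarn_env() {
  sed \
    -e 's|^export HADOOP_YARN_HOME=/usr/lib/hadoop-yarn$|export HADOOP_YARN_HOME=/usr/hdp/current/hadoop-yarn-nodemanager/|' \
    -e 's|^export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec$|export HADOOP_LIBEXEC_DIR=/usr/hdp/current/hadoop-client/libexec/|' \
    "$1"
}
```

A cautious way to use it is to redirect the output to a new file, for example update_yarn_env /etc/hadoop/conf/yarn-env.sh > /tmp/yarn-env.sh.new, compare the two files with diff, and copy the new file into place only after reviewing the differences.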

  9. For secure clusters, you must create and configure the container-executor.cfg configuration file:

    • Create the container-executor.cfg file in /etc/hadoop/conf/.

    • Insert the following properties:

      yarn.nodemanager.linux-container-executor.group=hadoop 
      banned.users=hdfs,yarn,mapred 
      min.user.id=1000

      • yarn.nodemanager.linux-container-executor.group - The Unix group of the NodeManager. This must match the value of yarn.nodemanager.linux-container-executor.group in yarn-site.xml.

      • banned.users - Comma-separated list of users who cannot run container-executor.

      • min.user.id - Minimum allowed user ID. This prevents system users from running container-executor.

      • allowed.system.users - Comma-separated list of allowed system users. This property is not set in the example above.

    • Set the permissions on /etc/hadoop/conf/container-executor.cfg so that only root can read it:

      chown root:hadoop /etc/hadoop/conf/container-executor.cfg
      chmod 400 /etc/hadoop/conf/container-executor.cfg

    • Set the container-executor program so that only root or hadoop group users can run it. In these commands, replace ${hdp.version} with your installed HDP version, because the shell does not expand it:

      chown root:hadoop /usr/hdp/${hdp.version}/hadoop-yarn-server-nodemanager/bin/container-executor

      chmod 6050 /usr/hdp/${hdp.version}/hadoop-yarn-server-nodemanager/bin/container-executor
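
The ownership and mode settings in this step can be verified afterwards with a small check such as the one sketched here. It relies on GNU stat (the -c option), which is an assumption about the host, and the function name check_perms is illustrative. Mode 6050 sets the setuid and setgid bits, grants no permissions to the owner or to others, and gives the group read and execute access.

```shell
# Compare a file's octal mode and owner:group with expected values.
# Arguments: path, expected mode (for example 400), expected owner:group.
# Requires GNU stat; returns 0 on match, 1 on mismatch, 2 if stat fails.
check_perms() {
  actual=$(stat -c '%a %U:%G' "$1") || return 2
  if [ "$actual" = "$2 $3" ]; then
    echo "ok: $1"
  else
    echo "mismatch: $1 is '$actual', expected '$2 $3'"
    return 1
  fi
}
```

For the configuration file, run check_perms /etc/hadoop/conf/container-executor.cfg 400 root:hadoop. For the container-executor binary, pass its full path with 6050 and root:hadoop.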