Command Line Upgrade

Configure YARN and MapReduce

After you upgrade Hadoop, complete the following steps to update your YARN and MapReduce configuration.

Note

The su commands in this section use keywords to represent the Service user. For example, "hdfs" is used to represent the HDFS Service user. If you are using another name for your Service users, you need to substitute your Service user name in each of the su commands.

Important

In secure mode, you must have Kerberos credentials for the hdfs user.

  1. Upload the MapReduce tarball to HDFS. As the HDFS user, for example 'hdfs':

    su - hdfs -c "hdfs dfs -mkdir -p /hdp/apps/2.5.3.0-<$version>/mapreduce/"

    su - hdfs -c "hdfs dfs -put /usr/hdp/2.5.3.0-<$version>/hadoop/mapreduce.tar.gz /hdp/apps/2.5.3.0-<$version>/mapreduce/"

    su - hdfs -c "hdfs dfs -chown -R hdfs:hadoop /hdp"

    su - hdfs -c "hdfs dfs -chmod -R 555 /hdp/apps/2.5.3.0-<$version>/mapreduce"

    su - hdfs -c "hdfs dfs -chmod -R 444 /hdp/apps/2.5.3.0-<$version>/mapreduce/mapreduce.tar.gz"
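
One way to confirm the upload is to list the tarball afterwards. The following is a minimal sketch and not part of the documented procedure: the function names mr_framework_dir and verify_mr_tarball are hypothetical, the Service user is assumed to be "hdfs", and the build string is passed in as an argument because its final component is not fixed here.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not part of HDP.

# Print the HDFS directory that holds the MapReduce tarball for one build.
# $1 is the full build string, i.e. the value you use for 2.5.3.0-<$version>.
mr_framework_dir() {
  printf '/hdp/apps/%s/mapreduce\n' "$1"
}

# List the uploaded tarball as the HDFS Service user (assumed to be "hdfs").
# Needs a host with the Hadoop client installed.
verify_mr_tarball() {
  local dir
  dir="$(mr_framework_dir "$1")"
  su - hdfs -c "hdfs dfs -ls ${dir}/mapreduce.tar.gz"
}
```

Calling verify_mr_tarball with your build string should list mapreduce.tar.gz owned by hdfs:hadoop with mode -r--r--r--, matching the chown and chmod commands above.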

  2. Make sure that the following properties are in /etc/hadoop/conf/mapred-site.xml:

    • Add the following properties if they are not already present:

      <property>
       <name>mapreduce.application.framework.path</name> 
       <value>/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework</value>
      </property>
                        
      <property>
       <name>yarn.app.mapreduce.am.admin-command-opts</name>
       <value>-Dhdp.version=${hdp.version}</value>
      </property>

      Note

      You do not need to modify ${hdp.version}.

    • Modify the following existing properties to include ${hdp.version}:

      <property>
       <name>mapreduce.admin.user.env</name>
       <value>LD_LIBRARY_PATH=/usr/hdp/${hdp.version}/hadoop/lib/native:/usr/hdp/${hdp.version}/hadoop/lib/native/Linux-amd64-64</value>
      </property>
       
      <property>
       <name>mapreduce.admin.map.child.java.opts</name>
       <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
       <final>true</final>
      </property>
       
      <property>
       <name>mapreduce.admin.reduce.child.java.opts</name>
       <value>-server -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}</value>
       <final>true</final>
      </property>
       
      <property>
       <name>mapreduce.application.classpath</name> 
       <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*,
         $PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*,
         $PWD/mr-framework/hadoop/share/hadoop/common/*,
         $PWD/mr-framework/hadoop/share/hadoop/common/lib/*,
         $PWD/mr-framework/hadoop/share/hadoop/yarn/*,
         $PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*,
         $PWD/mr-framework/hadoop/share/hadoop/hdfs/*,
         $PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*,
        /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar,
        /etc/hadoop/conf/secure</value>
      </property>

      Note

      You do not need to modify ${hdp.version}.

      Note

      If you plan to use Spark in yarn-client mode, see the instructions for making Spark work in yarn-client mode with 2.5.3.0-<$version>.
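
A quick way to confirm that mapred-site.xml names every property from this step is a plain text search. The sketch below is an illustration under stated assumptions, not an HDP tool: missing_properties is a hypothetical function, it checks property names only (not values), and it assumes each name is written as <name>...</name> with no whitespace inside the tag.

```shell
# Hypothetical helper: report required property names that a Hadoop
# *-site.xml file does not mention.
# Usage: missing_properties FILE NAME...
missing_properties() {
  local file="$1"; shift
  local name status=0
  for name in "$@"; do
    # -F matches the string literally, so dots in names are not wildcards.
    if ! grep -qF "<name>${name}</name>" "$file"; then
      echo "missing: ${name}"
      status=1
    fi
  done
  return "$status"
}

# Example call (run on a cluster node):
# missing_properties /etc/hadoop/conf/mapred-site.xml \
#   mapreduce.application.framework.path \
#   yarn.app.mapreduce.am.admin-command-opts \
#   mapreduce.admin.user.env \
#   mapreduce.admin.map.child.java.opts \
#   mapreduce.admin.reduce.child.java.opts \
#   mapreduce.application.classpath
```

The function prints one line per missing name and returns a non-zero status if any are missing, so it can gate a later step in a script.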

  3. Make sure the following property is in /etc/hadoop/conf/yarn-site.xml:

    <property>
     <name>yarn.application.classpath</name> 
     <value>$HADOOP_CONF_DIR,/usr/hdp/${hdp.version}/hadoop-client/*,
      /usr/hdp/${hdp.version}/hadoop-client/lib/*,
      /usr/hdp/${hdp.version}/hadoop-hdfs-client/*,
      /usr/hdp/${hdp.version}/hadoop-hdfs-client/lib/*,
      /usr/hdp/${hdp.version}/hadoop-yarn-client/*,
      /usr/hdp/${hdp.version}/hadoop-yarn-client/lib/*</value>
    </property>
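
The classpath above keeps ${hdp.version} as a placeholder. To preview the directories it refers to for one specific build, the placeholder can be expanded by hand. This is a minimal sketch for inspection only: expand_hdp_version is a hypothetical helper, and the build string is whatever value you pass in.

```shell
# Hypothetical helper: replace every literal ${hdp.version} in a string
# with a concrete build string. For inspection only; do not write the
# result back into yarn-site.xml.
expand_hdp_version() {
  local template="$1" build="$2"
  local placeholder='${hdp.version}'
  # Quoting "$placeholder" makes bash treat it as literal text, not a pattern.
  printf '%s\n' "${template//"$placeholder"/$build}"
}

# Example: preview one classpath entry and check that its directory exists.
# entry="$(expand_hdp_version '/usr/hdp/${hdp.version}/hadoop-client' "$build")"
# [ -d "$entry" ] || echo "not found: $entry"
```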

  4. On secure clusters only, add the following properties to /etc/hadoop/conf/yarn-site.xml:

    <property>
     <name>yarn.timeline-service.recovery.enabled</name>
     <value>true</value>
    </property>

    <property>
     <name>yarn.timeline-service.state-store.class</name>
     <value>org.apache.hadoop.yarn.server.timeline.recovery.LeveldbTimelineStateStore</value>
    </property>

    <property>
     <name>yarn.timeline-service.leveldb-state-store.path</name>
     <value><the same as the default of "yarn.timeline-service.leveldb-timeline-store.path"></value>
    </property>
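
The state-store path should match the path that the timeline store uses. If that path is set explicitly in yarn-site.xml, it can be read back with a sketch like the one below. get_property_value is a hypothetical helper, and it assumes that each <name> and <value> element sits on its own single line, as in the snippets in this section.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop *-site.xml file. Prints nothing if the name is absent.
# Usage: get_property_value FILE NAME
get_property_value() {
  local file="$1" name="$2"
  awk -v want="<name>${name}</name>" '
    index($0, want) { found = 1; next }      # remember that the name matched
    found && /<value>/ {
      sub(/.*<value>/, "")                   # drop everything up to <value>
      sub(/<\/value>.*/, "")                 # drop </value> and what follows
      print
      exit
    }
  ' "$file"
}

# Example call (run on a cluster node):
# get_property_value /etc/hadoop/conf/yarn-site.xml \
#   yarn.timeline-service.leveldb-timeline-store.path
```

If the call prints nothing, the property is not set in that file and the built-in default applies.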

  5. For secure clusters, you must create and configure the container-executor.cfg configuration file:

    • Create the container-executor.cfg file in /etc/hadoop/conf/container-executor.cfg.

    • Insert the following properties:

      yarn.nodemanager.linux-container-executor.group=hadoop
      banned.users=hdfs,yarn,mapred
      min.user.id=1000

      • yarn.nodemanager.linux-container-executor.group - Unix group of the NodeManager. This must match the value of yarn.nodemanager.linux-container-executor.group in yarn-site.xml.

      • banned.users - Comma-separated list of users who cannot run container-executor.

      • min.user.id - Minimum user ID. Users with a lower ID, typically system users, cannot run container-executor.

      • allowed.system.users - Comma-separated list of allowed system users.

    • Set the ownership and permissions of /etc/hadoop/conf/container-executor.cfg so that only root can read it:

      chown root:hadoop /etc/hadoop/conf/container-executor.cfg
      chmod 400 /etc/hadoop/conf/container-executor.cfg

    • Set the ownership and permissions of the container-executor program so that only root or members of the hadoop group can run it:

      chown root:hadoop /usr/hdp/${hdp.version}/hadoop-yarn/bin/container-executor
                          
      chmod 6050 /usr/hdp/${hdp.version}/hadoop-yarn/bin/container-executor
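
After running the chown and chmod commands, the result can be checked with stat. The following is a minimal sketch, assuming GNU coreutils stat (standard on Linux); file_mode_owner and check_mode_owner are hypothetical helper names.

```shell
# Hypothetical helpers; they rely on GNU coreutils stat.

# Print "MODE OWNER:GROUP" for a file, for example "400 root:hadoop".
file_mode_owner() {
  stat -c '%a %U:%G' "$1"
}

# Succeed only when the file has the expected mode and owner:group.
# Usage: check_mode_owner FILE 'MODE OWNER:GROUP'
check_mode_owner() {
  local file="$1" expected="$2" actual
  actual="$(file_mode_owner "$file")" || return 2
  if [ "$actual" != "$expected" ]; then
    echo "${file}: expected ${expected}, found ${actual}"
    return 1
  fi
}

# Example calls (run as root on a cluster node; substitute your own
# container-executor path):
# check_mode_owner /etc/hadoop/conf/container-executor.cfg '400 root:hadoop'
# check_mode_owner /path/to/hadoop-yarn/bin/container-executor '6050 root:hadoop'
```

The expected strings mirror the commands in this step: mode 400 for the configuration file and 6050 (setuid and setgid, group read and execute) for the binary, both owned by root:hadoop.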