Configuring a user to run YARN jobs on both clusters

To run Hadoop DistCp jobs that migrate data from the HDP cluster to the CDP Private Cloud Base cluster, you must use the HDFS superuser or the hdfs user.

Ensure that you have one of the following user accounts before you run Hadoop DistCp jobs:
  • HDFS superuser - For information about creating an HDFS superuser, see Create the HDFS superuser.
  • User named hdfs - By default, the hdfs user is not allowed to run YARN jobs. You must enable the hdfs user to run YARN jobs on both clusters.
  1. Perform the following steps on the HDP cluster:
    1. Open the following file:
      /var/lib/ambari-server/resources/common-services/YARN/2.1.0.2.0/package/templates/container-executor.cfg.j2
    2. Remove the hdfs entry from the banned.users list and save the file.
      Sample file contents:
      yarn.nodemanager.local-dirs={{nm_local_dirs}}
      yarn.nodemanager.log-dirs={{nm_log_dirs}}
      yarn.nodemanager.linux-container-executor.group={{yarn_executor_container_group}}
      banned.users=yarn,hdfs,mapred,bin
      min.user.id={{min_user_id}}
      
    3. On the YARN configuration page, verify whether the container-executor configuration template contains hdfs in the banned.users list.
    4. If hdfs is listed in the banned.users list, remove it from the template and save the template.
    5. Restart the following services:
      • Stale services, if any.
      • Ambari server
      • Ambari agent on each host of the cluster.
    6. Add hdfs to the yarn.admin.acl property.
    7. In the etc/hadoop/capacity-scheduler.xml file, append hdfs to the yarn.scheduler.capacity.root.acl_submit_applications property.
    8. Restart the YARN service.
    9. Run the kinit command with the hdfs user’s keytab file to authenticate the hdfs user to the Key Distribution Center (KDC).
  2. On the CDP Private Cloud Base cluster, perform the following steps:
    1. Select the YARN service.
    2. Click the Configuration tab.
    3. Make sure that the hdfs user is not listed in the banned.users list.
    4. Make sure that the min.user.id property is set to 0.
    5. Restart the YARN service.
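
The banned.users change described in the steps above can be previewed from a shell before you edit anything. The following is a minimal sketch, not part of the product: the function name remove_banned_user is hypothetical, and it only prints the rewritten contents to standard output rather than changing the file. It assumes the list is a single line of the form banned.users=a,b,c, as in the sample file contents.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a Cloudera or Ambari tool):
# print FILE with USER removed from the comma-separated banned.users line.
# The input file is never modified.
remove_banned_user() {
  user="$1"
  file="$2"
  awk -v user="$user" '
    BEGIN { FS = "="; OFS = "=" }
    $1 == "banned.users" {
      # Rebuild the list, skipping the user to be removed.
      n = split($2, names, ",")
      out = ""
      for (i = 1; i <= n; i++) {
        if (names[i] != user) {
          out = (out == "" ? names[i] : out "," names[i])
        }
      }
      print $1, out
      next
    }
    # Every other line is passed through unchanged.
    { print }
  ' "$file"
}
```

For example, running remove_banned_user hdfs against a copy of the template and comparing the output with the original shows exactly which line would change. Review the result before saving it over the real file.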
Run the DistCp job on the CDP Private Cloud Base cluster.
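
As a rough illustration of the authentication and copy steps, the sketch below prints the kinit and hadoop distcp commands and runs them only when DRY_RUN is set to 0. Every value in it is a placeholder assumption: the keytab path, the principal, the NameNode host names, the port, and the /data paths are not taken from this procedure, and the helper names run and migrate are hypothetical. Your environment may also need additional DistCp options that are not shown here.

```shell
#!/bin/sh
# All values below are placeholders (assumptions); replace them with your own.
KEYTAB="${KEYTAB:-/path/to/hdfs.keytab}"
PRINCIPAL="${PRINCIPAL:-hdfs@EXAMPLE.COM}"
SRC="${SRC:-hdfs://hdp-namenode.example.com:8020/data}"
DST="${DST:-hdfs://cdp-namenode.example.com:8020/data}"

# Print a command; execute it only when DRY_RUN=0 (the default is a dry run).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Authenticate as the hdfs user with its keytab, then start the copy.
migrate() {
  run kinit -kt "$KEYTAB" "$PRINCIPAL" &&
  run hadoop distcp "$SRC" "$DST"
}
```

Calling migrate with the defaults only echoes the two commands, which makes it easy to check the arguments. Setting DRY_RUN=0 before calling migrate runs them for real.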