Step 7: Prepare the Cluster for Each User
Before you and other users can access the cluster, you must complete a few
tasks to prepare the hosts for each user.
- Make sure all hosts in the cluster have a Linux user account with
the same name as the first component of that user's principal name. For example, the Linux
account joe should exist on every host if the user's principal name is joe@YOUR-REALM.COM.
You can use LDAP for this step if it is available in your organization.
Note: Each account must have a user ID that is greater than or equal to 1000. In the
/etc/hadoop/conf/taskcontroller.cfg file, the default setting for the banned.users property
is mapred, hdfs, and bin, which prevents jobs from being submitted via those user accounts.
The default setting for the min.user.id property is 1000, which prevents jobs from being
submitted with a user ID less than 1000; such IDs are conventionally reserved for system accounts.
- Create a subdirectory under /user on HDFS for each user account (for example, /user/joe).
Change the owner and group of that directory to be the user.
$ hadoop fs -mkdir /user/joe
$ hadoop fs -chown joe /user/joe
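When several accounts need home directories, the same two commands can be generated once per user. The following is a minimal sketch, not part of the original procedure: the helper name print_home_dir_commands and the second user name alice are hypothetical. It only prints the commands, so they can be reviewed before anything runs.

```shell
#!/bin/sh
# Hypothetical helper: print the HDFS commands that create /user/<name>
# and hand ownership to <name>, for every user name given as an argument.
# Nothing is executed here; the output is plain text for review.
print_home_dir_commands() {
    for user in "$@"; do
        printf 'hadoop fs -mkdir /user/%s\n' "$user"
        printf 'hadoop fs -chown %s /user/%s\n' "$user" "$user"
    done
}
```

Calling print_home_dir_commands joe alice prints four lines, two per user. Once the output looks right, it can be piped to sh from a session that holds suitable HDFS credentials.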
Note: sudo -u hdfs is not included in the commands above. This is because it is not required
if Kerberos is enabled on your cluster. You will, however, need to have Kerberos credentials
for the HDFS super user in order to successfully run these commands. For information on
gaining access to the HDFS super user account, see Step 14: Create the HDFS Superuser Principal.
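The account requirements from the first task (a user ID of at least 1000 and a name outside banned.users) can be checked on each host before anyone submits a job. This is a rough sketch under stated assumptions: the function names are hypothetical, and the two variables simply restate the default values quoted above rather than reading taskcontroller.cfg.

```shell
#!/bin/sh
# Assumed values: copies of the defaults described in this step,
# not parsed from /etc/hadoop/conf/taskcontroller.cfg.
MIN_USER_ID=1000
BANNED_USERS="mapred hdfs bin"

# Hypothetical helper: succeed if NAME is not banned and UID >= MIN_USER_ID.
is_allowed_account() {
    name=$1
    uid=$2
    for banned in $BANNED_USERS; do
        if [ "$name" = "$banned" ]; then
            return 1
        fi
    done
    [ "$uid" -ge "$MIN_USER_ID" ]
}

# Hypothetical helper: look up the local UID for NAME and report the result.
check_local_account() {
    name=$1
    uid=$(id -u "$name" 2>/dev/null) || {
        echo "$name: no such account on this host"
        return 1
    }
    if is_allowed_account "$name" "$uid"; then
        echo "$name: ok (uid $uid)"
    else
        echo "$name: rejected (uid $uid)"
        return 1
    fi
}
```

Running check_local_account joe on every host shows whether the joe account exists there and would pass both default settings. If your cluster overrides either property, adjust the two variables to match.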