Step 4: Create and Deploy the Kerberos Principals and Keytab Files
A Kerberos principal is used in a Kerberos-secured system to represent a unique identity. Kerberos assigns tickets to Kerberos principals to enable them to access Kerberos-secured Hadoop services. For Hadoop, the principals should be of the format username/fully.qualified.domain.name@YOUR-REALM.COM. In this guide, the term username in the username/fully.qualified.domain.name@YOUR-REALM.COM principal refers to the username of an existing Unix account, such as hdfs or mapred.
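As a small illustration of the naming format, the following shell sketch assembles a principal from its three parts. The function name make_principal is hypothetical and is used only for this example; it is not part of Kerberos or Hadoop.

```shell
# Build a Hadoop-style Kerberos principal: username/fqdn@REALM.
# make_principal is a hypothetical helper for illustration only.
make_principal() {
    user=$1
    fqdn=$2
    realm=$3
    printf '%s/%s@%s\n' "$user" "$fqdn" "$realm"
}

# prints hdfs/fully.qualified.domain.name@YOUR-REALM.COM
make_principal hdfs fully.qualified.domain.name YOUR-REALM.COM
```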
A keytab is a file containing pairs of Kerberos principals and encrypted copies of their keys. Keytab files are unique to each host because the principal names they contain include the hostname. A keytab is used to authenticate a principal on a host to Kerberos without human interaction and without storing a password in a plain text file. Because anyone with access to a principal's keytab file can act as that principal, access to keytab files should be tightly secured. They should be readable by a minimal set of users, should be stored on local disk, and should not be included in machine backups unless access to those backups is as secure as access to the local machine.
For both MRv1 and YARN deployments: On every machine in your cluster, there must be a keytab file for the hdfs user and a keytab file for the mapred user. The hdfs keytab file must contain entries for the hdfs principal and an HTTP principal, and the mapred keytab file must contain entries for the mapred principal and an HTTP principal. On each machine, the HTTP principal will be the same in both keytab files.
In addition, for YARN deployments only: On every machine in your cluster, there must be a keytab file for the yarn user. The yarn keytab file must contain entries for the yarn principal and an HTTP principal. On each machine, the HTTP principal in the yarn keytab file will be the same as the HTTP principal in the hdfs and mapred keytab files.
The following instructions illustrate an example of creating keytab files for MIT Kerberos. If you are using another version of Kerberos, refer to your Kerberos documentation for instructions. You may use either kadmin or kadmin.local to run these commands.
When to Use kadmin.local and kadmin
When creating the Kerberos principals and keytabs, you can use kadmin.local or kadmin depending on your access and account:
- If you have root access to the KDC machine, but you don't have a Kerberos admin account, use kadmin.local.
- If you don't have root access to the KDC machine, but you do have a Kerberos admin account, use kadmin.
- If you have both root access to the KDC machine and a Kerberos admin account, you can use either one.
To start kadmin.local (on the KDC machine) or kadmin (from any machine), run one of these commands:
$ sudo kadmin.local
OR:
$ kadmin
In this guide, kadmin is shown as the prompt for commands in the kadmin shell, but you can type the same commands at the kadmin.local prompt in the kadmin.local shell.
Running kadmin.local may prompt you for a password because it is being run via sudo. You should provide your Unix password. Running kadmin may prompt you for a password because you need Kerberos admin privileges. You should provide your Kerberos admin password.
To create the Kerberos principals
If you plan to use Oozie, Impala, or the Hue Kerberos ticket renewer in your cluster, you must configure your KDC to allow tickets to be renewed, and you must configure krb5.conf to request renewable tickets. Typically, you can do this by adding the max_renewable_life setting to your realm in kdc.conf, and by adding the renew_lifetime parameter to the libdefaults section of krb5.conf. For more information about renewable tickets, see the Kerberos documentation.
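The two settings named above could look roughly like the fragments printed by the sketch below. The 7d lifetime is an assumed example value, not a recommendation from this guide, and print_renewable_fragments is a hypothetical helper that only prints text; it does not modify any file.

```shell
# Print example fragments for renewable tickets.
# The 7d lifetime is an assumed value; pick one that matches your site policy.
print_renewable_fragments() {
    cat <<'EOF'
# kdc.conf (on the KDC): inside your realm's entry under [realms]
[realms]
    YOUR-REALM.COM = {
        max_renewable_life = 7d
    }

# krb5.conf (on each client): inside [libdefaults]
[libdefaults]
    renew_lifetime = 7d
EOF
}

print_renewable_fragments
```

These are fragments to merge into your existing kdc.conf and krb5.conf, not complete files.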
Do the following steps for every host in your cluster. Run the commands in the kadmin.local or kadmin shell, replacing the fully.qualified.domain.name in the commands with the fully qualified domain name of each host. Replace YOUR-REALM.COM with the name of the Kerberos realm your Hadoop cluster is in.
- In the kadmin.local or kadmin shell, create the hdfs principal. This
principal is used for the NameNode, Secondary NameNode, and DataNodes.
kadmin: addprinc -randkey hdfs/fully.qualified.domain.name@YOUR-REALM.COM
Note: If your Kerberos administrator or company has a policy about principal names that does not allow you to use the format shown above, you can work around that issue by configuring the <kerberos principal> to <short name> mapping that is built into Hadoop. For more information, see Appendix C - Configuring the Mapping from Kerberos Principals to Short Names.
- Create the mapred principal. If you are using MRv1, the
mapred principal is used for the JobTracker and
TaskTrackers. If you are using YARN, the mapred principal is
used for the MapReduce Job History Server.
kadmin: addprinc -randkey mapred/fully.qualified.domain.name@YOUR-REALM.COM
- YARN only: Create the yarn principal. This principal is
used for the ResourceManager and NodeManager.
kadmin: addprinc -randkey yarn/fully.qualified.domain.name@YOUR-REALM.COM
- Create the HTTP principal.
kadmin: addprinc -randkey HTTP/fully.qualified.domain.name@YOUR-REALM.COM
Important: The HTTP principal must be in the format HTTP/fully.qualified.domain.name@YOUR-REALM.COM. The first component of the principal must be the literal string "HTTP". This format is standard for HTTP principals in SPNEGO and is hard-coded in Hadoop, so it cannot be changed.
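Because the same addprinc commands are repeated for every host, it can help to generate them first and review them before running anything. The sketch below is a dry run that only prints the commands; print_addprinc_commands and the host names are hypothetical.

```shell
# Print the addprinc commands for one host (dry run; nothing is sent to the KDC).
# print_addprinc_commands is a hypothetical helper for this example.
print_addprinc_commands() {
    fqdn=$1
    realm=$2
    # The yarn principal is needed for YARN deployments only.
    for user in hdfs mapred yarn HTTP; do
        printf 'addprinc -randkey %s/%s@%s\n' "$user" "$fqdn" "$realm"
    done
}

# Hypothetical host names; replace them with the hosts in your cluster.
for host in node1.example.com node2.example.com; do
    print_addprinc_commands "$host" YOUR-REALM.COM
done
```

Each printed line can be typed at the kadmin prompt, or passed to kadmin.local or kadmin with the -q option.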
To create the Kerberos keytab files
The instructions in this section for creating keytab files require the Kerberos norandkey option of the xst command. If your version of Kerberos does not support the norandkey option, or if you cannot use kadmin.local, use the alternate instructions in Appendix F to create appropriate Kerberos keytab files. After using those alternate instructions to create the keytab files, continue with the next section, To deploy the Kerberos keytab files.
Do the following steps for every host in your cluster. Run the commands in the kadmin.local or kadmin shell, replacing the fully.qualified.domain.name in the commands with the fully qualified domain name of each host:
- Create the hdfs keytab file that will contain the hdfs
principal and HTTP principal. This keytab file is used for the
NameNode, Secondary NameNode, and DataNodes.
kadmin: xst -norandkey -k hdfs.keytab hdfs/fully.qualified.domain.name HTTP/fully.qualified.domain.name
- Create the mapred keytab file that will contain the
mapred principal and HTTP principal. If
you are using MRv1, the mapred keytab file is used for the
JobTracker and TaskTrackers. If you are using YARN, the mapred
keytab file is used for the MapReduce Job History Server.
kadmin: xst -norandkey -k mapred.keytab mapred/fully.qualified.domain.name HTTP/fully.qualified.domain.name
- YARN only: Create the yarn keytab file that will contain
the yarn principal and HTTP principal. This
keytab file is used for the ResourceManager and NodeManager.
kadmin: xst -norandkey -k yarn.keytab yarn/fully.qualified.domain.name HTTP/fully.qualified.domain.name
- Use klist to display the keytab file entries; a
correctly-created hdfs keytab file should look something like this:
$ klist -e -k -t hdfs.keytab
Keytab name: WRFILE:hdfs.keytab
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1    7 HTTP/fully.qualified.domain.name@YOUR-REALM.COM (DES cbc mode with CRC-32)
   2    7 HTTP/fully.qualified.domain.name@YOUR-REALM.COM (Triple DES cbc mode with HMAC/sha1)
   3    7 hdfs/fully.qualified.domain.name@YOUR-REALM.COM (DES cbc mode with CRC-32)
   4    7 hdfs/fully.qualified.domain.name@YOUR-REALM.COM (Triple DES cbc mode with HMAC/sha1)
- Continue with the next section To deploy the Kerberos keytab files.
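The klist check can also be scripted. The following sketch reads klist-style output on standard input and reports whether a given principal appears in it. keytab_lists_principal is a hypothetical helper, and the sample text stands in for real klist output so that the sketch runs without a KDC.

```shell
# Succeeds if any whitespace-separated field on stdin equals the principal.
# keytab_lists_principal is a hypothetical helper for this example.
keytab_lists_principal() {
    awk -v want="$1" '
        { for (i = 1; i <= NF; i++) if ($i == want) found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Sample text in the shape of klist -e -k output (not real output).
sample='   1    7 HTTP/fully.qualified.domain.name@YOUR-REALM.COM (DES cbc mode with CRC-32)
   3    7 hdfs/fully.qualified.domain.name@YOUR-REALM.COM (DES cbc mode with CRC-32)'

for p in hdfs HTTP; do
    if printf '%s\n' "$sample" | keytab_lists_principal "$p/fully.qualified.domain.name@YOUR-REALM.COM"; then
        echo "found $p principal"
    else
        echo "missing $p principal"
    fi
done
```

Against a real keytab, the output of klist -k hdfs.keytab would be piped into the function in place of the sample text.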
To deploy the Kerberos keytab files
On every node in the cluster, repeat the following steps to deploy the hdfs.keytab and mapred.keytab files. If you are using YARN, you will also deploy the yarn.keytab file.
- On the host machine, copy or move the keytab files to a directory that Hadoop
can access, such as /etc/hadoop/conf.
If you are using MRv1:
$ sudo mv hdfs.keytab mapred.keytab /etc/hadoop/conf/
If you are using YARN:
$ sudo mv hdfs.keytab mapred.keytab yarn.keytab /etc/hadoop/conf/
- Make sure that the hdfs.keytab file is only readable by the
hdfs user, and that the mapred.keytab file
is only readable by the mapred user.
$ sudo chown hdfs:hadoop /etc/hadoop/conf/hdfs.keytab
$ sudo chown mapred:hadoop /etc/hadoop/conf/mapred.keytab
$ sudo chmod 400 /etc/hadoop/conf/*.keytab
Note: To enable you to use the same configuration files on every host, Cloudera recommends that you use the same name for the keytab files on every host.
- YARN only: Make sure that the yarn.keytab file is only
readable by the yarn user.
$ sudo chown yarn:hadoop /etc/hadoop/conf/yarn.keytab
$ sudo chmod 400 /etc/hadoop/conf/yarn.keytab
Important: If the NameNode, Secondary NameNode, DataNode, JobTracker, TaskTrackers, HttpFS, or Oozie services are configured to use Kerberos HTTP SPNEGO authentication, and two or more of these services are running on the same host, then all of the running services must use the same HTTP principal and keytab file for their HTTP endpoints.
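After deploying the keytab files, you may want to confirm that the permissions were applied as intended. The sketch below checks that a file's mode is exactly 400. It assumes GNU coreutils stat (the -c option), and keytab_mode_ok is a hypothetical helper; the demonstration runs against a throwaway temporary file rather than a real keytab.

```shell
# Succeeds if the file's permission bits are exactly 400 (owner read-only).
# Assumes GNU coreutils stat; keytab_mode_ok is a hypothetical helper.
keytab_mode_ok() {
    [ "$(stat -c '%a' "$1")" = "400" ]
}

# Demonstrate on a temporary file instead of a real keytab.
tmp=$(mktemp)
chmod 400 "$tmp"
if keytab_mode_ok "$tmp"; then
    echo "mode ok"
else
    echo "mode too permissive"
fi
rm -f "$tmp"
```

On a cluster host, the same function could be called with a path such as /etc/hadoop/conf/hdfs.keytab. Checking the owner as well would need an additional stat call.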