To the hdfs-site.xml file on every host in your cluster, you must add the following information:
Table 27.6. hdfs-site.xml File Property Settings
Property Name | Property Value | Description |
---|---|---|
dfs.permissions.enabled | true | If true, permission checking in HDFS is enabled. If false, permission checking is turned off, but all other behavioris unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories. |
dfs.permissions.supergroup | hdfs | The name of the group of super-users. |
dfs.block.access.token.enable | true | If true, access tokens are used as capabilities for accessing DataNodes. If false, no access tokens are checked on accessing DataNodes. |
dfs.namenode.kerberos.principal | nn/_HOST@EXAMPLE.COM | Kerberos principal name for the NameNode. |
dfs.secondary.namenode.kerberos. principal | nn/_HOST@EXAMPLE.COM | Kerberos principal name for the secondary NameNode. |
dfs.web.authentication.kerberos. principal | HTTP/_HOST@EXAMPLE.COM | The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint. The HTTP Kerberos principal MUST start with 'HTTP/' per Kerberos HTTP SPNEGO specification. |
dfs.web.authentication.kerberos. keytab | /etc/security/keytabs/spnego.service.keytab | The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint. |
dfs.datanode.kerberos.principal | dn/_HOST@EXAMPLE.COM | The Kerberos principal that the DataNode runs as. "_HOST" is replaced by the real host name. |
dfs.namenode.keytab.file | /etc/security/keytabs/nn.service.keytab | Combined keytab file containing the NameNode service and host principals. |
dfs.secondary.namenode.keytab.file | /etc/security/keytabs/nn.service.keytab | Combined keytab file containing the NameNode service and host principals. <question?> |
dfs.datanode.keytab.file | /etc/security/keytabs/dn.service.keytab | The filename of the keytab file for the DataNode. |
dfs.https.port | 50470 | The HTTPS port to which the NameNode binds. |
dfs.namenode.https-address | Example: ip-10-111-59-170.ec2.internal:50470 | The HTTPS address to which the NameNode binds. |
dfs.datanode.data.dir.perm | 750 | The permissions that must be set on the dfs.data.dir directories. The DataNode will not come up if all existing dfs.data.dir directories do not have this setting. If the directories do not exist, they will be created with this permission. |
dfs.cluster.administrators | hdfs | ACL for who all can view the default servlets in the HDFS. |
dfs.namenode.kerberos.internal. spnego.principal | ${dfs.web.authentication.kerberos.principal} |
|
dfs.secondary.namenode.kerberos. internal.spnego.principal | ${dfs.web.authentication.kerberos.principal} |
|
Following is the XML for these entries:
<property> <name>dfs.permissions</name> <value>true</value> <description> If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories. </description> </property> <property> <name>dfs.permissions.supergroup</name> <value>hdfs</value> <description>The name of the group of super-users.</description> </property> <property> <name>dfs.namenode.handler.count</name> <value>100</value> <description>Added to grow Queue size so that more client connections are allowed</description> </property> <property> <name>ipc.server.max.response.size</name> <value>5242880</value> </property> <property> <name>dfs.block.access.token.enable</name> <value>true</value> <description> If "true", access tokens are used as capabilities for accessing datanodes. If "false", no access tokens are checked on accessing datanodes. </description> </property> <property> <name>dfs.namenode.kerberos.principal</name> <value>nn/_HOST@EXAMPLE.COM</value> <description> Kerberos principal name for the NameNode </description> </property> <property> <name>dfs.secondary.namenode.kerberos.principal</name> <value>nn/_HOST@EXAMPLE.COM</value> <description>Kerberos principal name for the secondary NameNode. </description> </property> <property> <!--cluster variant --> <name>dfs.secondary.http.address</name> <value>ip-10-72-235-178.ec2.internal:50090</value> <description>Address of secondary namenode web server</description> </property> <property> <name>dfs.secondary.https.port</name> <value>50490</value> <description>The https port where secondary-namenode binds</description> </property> <property> <name>dfs.web.authentication.kerberos.principal</name> <value>HTTP/_HOST@EXAMPLE.COM</value> <description> The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint. The HTTP Kerberos principal MUST start with 'HTTP/' per Kerberos HTTP SPNEGO specification. </description> </property> <property> <name>dfs.web.authentication.kerberos.keytab</name> <value>/etc/security/keytabs/spnego.service.keytab</value> <description>The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint. </description> </property> <property> <name>dfs.datanode.kerberos.principal</name> <value>dn/_HOST@EXAMPLE.COM</value> <description> The Kerberos principal that the DataNode runs as. "_HOST" is replaced by the real host name. </description> </property> <property> <name>dfs.namenode.keytab.file</name> <value>/etc/security/keytabs/nn.service.keytab</value> <description> Combined keytab file containing the namenode service and host principals. </description> </property> <property> <name>dfs.secondary.namenode.keytab.file</name> <value>/etc/security/keytabs/nn.service.keytab</value> <description> Combined keytab file containing the namenode service and host principals. </description> </property> <property> <name>dfs.datanode.keytab.file</name> <value>/etc/security/keytabs/dn.service.keytab</value> <description> The filename of the keytab file for the DataNode. </description> </property> <property> <name>dfs.https.port</name> <value>50470</value> <description>The https port where namenode binds</description> </property> <property> <name>dfs.https.address</name> <value>ip-10-111-59-170.ec2.internal:50470</value> <description>The https address where namenode binds</description> </property> <property> <name>dfs.datanode.data.dir.perm</name> <value>750</value> <description>The permissions that should be there on dfs.data.dir directories. The datanode will not come up if the permissions are different on existing dfs.data.dir directories. If the directories don't exist, they will be created with this permission.</description> </property> <property> <name>dfs.access.time.precision</name> <value>0</value> <description>The access time for HDFS file is precise upto this value.The default value is 1 hour. Setting a value of 0 disables access times for HDFS. </description> </property> <property> <name>dfs.cluster.administrators</name> <value> hdfs</value> <description>ACL for who all can view the default servlets in the HDFS</description> </property> <property> <name>ipc.server.read.threadpool.size</name> <value>5</value> <description></description> </property> <property> <name>dfs.namenode.kerberos.internal.spnego.principal</name> <value>${dfs.web.authentication.kerberos.principal}</value> </property> <property> <name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name> <value>${dfs.web.authentication.kerberos.principal}</value> </property>
In addition, you must set the user on all secure DataNodes:
export HADOOP_SECURE_DN_USER=hdfs
export HADOOP_SECURE_DN_PID_DIR=/grid/0/var/run/hadoop/$HADOOP_SECURE_DN_USER