Hadoop Users (user:group) and Kerberos Principals

During the Cloudera Manager installation process, several Linux user accounts and groups are created by default. These are listed in the table below. Integrating the cluster to use Kerberos for authentication requires creating Kerberos principals and keytabs for these user accounts.

Table 1. Users and Groups
Component (Version) Unix User ID Groups Functionality
Apache Atlas atlas atlas, hadoop Apache Atlas by default has atlas as user and group. It is configurable
Apache Flink flink flink The Flink Dashboard runs as this user.
Apache HBase hbase hbase The Master and the RegionServer processes run as this user.
Apache HBase Indexer hbase hbase The indexer servers are run as this user.
Apache HDFS hdfs hdfs, hadoop The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it.

Apache Hive

Hive on Tez

hive hive The HiveServer2 process and the Hive Metastore processes run as this user.A user must be defined for Hive access to its Metastore DB (for example, MySQL or Postgres) but it can be any identifier and does not correspond to a Unix uid. This is javax.jdo.option.ConnectionUserName in hive-site.xml.
Apache Impala impala impala, hive Impala services run as this user.
Apache Kafka kafka kafka Kafka brokers, mirrorMaker, and Connect workers run as this user.
Apache Knox knox knox Apache Knox Gateway Server runs as this user
Apache Kudu kudu kudu Kudu services run as this user.
Apache Livy livy livy The Livy Server process runs as this user
Apache NiFi nifi nifi Runs as the nifi user
Apache NiFi Registry nifiregistry nifiregistry Runs as the nifiregistry user
Apache Oozie oozie oozie The Oozie service runs as this user.
Apache Ozone hdfs hdfs, hadoop Ozone Manager, Storage Container Manager (SCM), Recon and Ozone Datanodes run as this user.
Apache Parquet ~ ~ No special users.
Apache Phoenix phoenix phoenix The Phoenix Query Server runs as this user
Apache Ranger ranger ranger, hadoop Ranger Admin, Usersync and Tagsync services by default have ranger as user and ranger, hadoop as groups. It is configurable.
Apache Ranger KMS kms kms Ranger KMS runs with kms user and group. It is configurable.
Apache Ranger Raz rangerraz ranger Ranger Raz runs with rangerraz user and is part of the ranger group.
Apache Ranger RMS rangerrms ranger Ranger RMS runs with rangerrms user and is part of the ranger group.
Apache Solr solr solr The Solr processes run as this user.
Apache Spark spark spark The Spark History Server process runs as this user.
Apache Sqoop sqoop sqoop This user is only for the Sqoop1 Metastore, a configuration option that is not recommended.
Apache YARN yarn yarn, hadoop Without Kerberos, all YARN services and applications run as this user. The LinuxContainerExecutor binary is owned by this user for Kerberos.
Apache Zeppelin zeppelin zeppelin The Zeppelin Server process runs as this user
Apache ZooKeeper zookeeper zookeeper The ZooKeeper processes run as this user. It is not configurable.
Cloudera Manager (all versions) cloudera-scm cloudera-scm Clusters managed by Cloudera Manager run Cloudera Manager Server, monitoring roles, and other Cloudera Server processes as cloudera-scm. Requires keytab file named cmf.keytab because name is hard-coded in Cloudera Manager.
Cruise Control cruisecontrol hadoop The Cruise Control process runs as this user.
HttpFS httpfs httpfs The HttpFS service runs as this user. See HttpFS authentication for instructions on how to generate the merged httpfs-http.keytab file.
Hue hue hue Hue services run as this user.
Hue Load Balancer apache apache The Hue Load balancer has a dependency on the apache2 package that uses the apache user name. Cloudera Manager does not run processes using this user ID.
Key Trustee Server keytrustee keytrustee The Key Trustee Server service runs as this user.
Schema Registry schemaregistry hadoop The Schema Registry process runs as this user.
Streams Messaging Manager streamsmsgmgr streamsmsgmgr The Streams Messaging Manager processes runs as this user.
Streams Replication Manager streamsrepmgr streamsrepmgr The Streams Replication Manager processes runs as this user.

Keytabs and Keytab File Permissions

Linux user accounts, such as hdfs, are mapped to the username portion of the Kerberos principal names, as follows:
username/host.example.com@EXAMPLE.COM
For example, the Kerberos principal for Apache Hive would be:
hive/host.example.com@EXAMPLE.COM

Keytabs that contain multiple principals are merged automatically from individual keytabs by Cloudera Manager. If you override a service configuration to not use the CM-provided keytab, then you must ensure that all the principals required for the given role instance on a specific host are merged together in the keytab file you deploy manually on that host.

For example, for Filename (*.keytab), the Atlas keytab filename would be atlas.keytab, HBase would be hbase.keytab, and Cloudera Manager would be cmf.keytab and scm.keytab.

Keytab File Owner:Group matters when Cloudera Manager starts a role. For example, Cloudera Manager starts the role "DataNode"". Cloudera Manager launches the DataNode process as a user (here, "hdfs"). Because that process needs to access the HDFS keytab, Cloudera Manager puts the HDFS keytab in the DataNode's process directory, and the keytab is given the owner:group that is listed in the table. Thus, the DataNode process properly owns the keytab file.

The tables below lists the usernames to use for Kerberos principal names, for clusters managed by Cloudera Manager.

Apache Atlas

Role: atlas-ATLAS_SERVER
Kerberos Principals
atlas
Filename (*.keytab)
atlas
Keytab File Owner:Group
atlas:atlas
File Permission (octal)
600

Apache Flink

Role: flink
Kerberos Principals
flink
Filename (*.keytab)
flink
Keytab File Owner:Group
flink:flink
File Permission (octal)
600

Apache HBase

Role: hbase-HBASETHRIFTSERVER
Kerberos Principals
hbase, HTTP
Filename (*.keytab)
hbase, HTTP
Keytab File Owner:Group
hbase:hbase
File Permission (octal)
600
Role: hbase-REGIONSERVER
Kerberos Principals
hbase, HTTP
Filename (*.keytab)
hbase, HTTP
Keytab File Owner:Group
hbase:hbase
File Permission (octal)
600
Role: hbase-HBASERESTSERVER
Kerberos Principals
hbase, HTTP
Filename (*.keytab)
hbase, HTTP
Keytab File Owner:Group
hbase:hbase
File Permission (octal)
600
Role: hbase-MASTER
Kerberos Principals
hbase, HTTP
Filename (*.keytab)
hbase, HTTP
Keytab File Owner:Group
hbase:hbase
File Permission (octal)
600

Apache HBase indexer

Role: ks_indexer-HBASE_INDEXER
Kerberos Principals
hbase, HTTP
Filename (*.keytab)
hbase
Keytab File Owner:Group
hbase:hbase
File Permission (octal)
600

Apache HDFS

Role: hdfs-NAMENODE
Kerberos Principals
hdfs, HTTP
Filename (*.keytab)
hdfs
Keytab File Owner:Group
hdfs:hdfs
File Permission (octal)
600
Role: hdfs-DATANODE
Kerberos Principals
hdfs, HTTP
Filename (*.keytab)
hdfs
Keytab File Owner:Group
hdfs:hdfs
File Permission (octal)
600
Role: hdfs-SECONDARYNAMENODE
Kerberos Principals
hdfs, HTTP
Filename (*.keytab)
hdfs
Keytab File Owner:Group
hdfs:hdfs
File Permission (octal)
600

Apache Hive, Hive on Tez

Role: hive-HIVESERVER2
Kerberos Principals
hive
Filename (*.keytab)
hive
Keytab File Owner:Group
hive:hive
File Permission (octal)
600
Role: hive-HIVEMETASTORE
Kerberos Principals
hive
Filename (*.keytab)
hive
Keytab File Owner:Group
cloudera-scm:cloudera-scm
File Permission (octal)
600

Apache Impala

Role: impala-STATESTORE
Kerberos Principals
impala, HTTP
Filename (*.keytab)
impala
Keytab File Owner:Group
impala:impala
File Permission (octal)
600
Role: impala-CATALOGSERVER
Kerberos Principals
impala, HTTP
Filename (*.keytab)
impala
Keytab File Owner:Group
impala:impala
File Permission (octal)
600
Role: impala-IMPALAD
Kerberos Principals
impala, HTTP
Filename (*.keytab)
impala
Keytab File Owner:Group
impala:impala
File Permission (octal)
600

Apache Kafka

Role: kafka-KAFKA_BROKER
Kerberos Principals
kafka
Filename (*.keytab)
kafka
Keytab File Owner:Group
kafka:kafka
File Permission (octal)
600
Role: kafka-KAFKA_MIRROR_MAKER
Kerberos Principals
kafka_mirror_maker
Filename (*.keytab)
kafka
Keytab File Owner:Group
kafka:kafka
File Permission (octal)
600
Role: kafka-KAFKA_CONNECT
Kerberos Principals
kafka
Filename (*.keytab)
kafka
Keytab File Owner:Group
kafka:kafka
File Permission (octal)
600

Apache Knox

Role: knox-KNOX_GATEWAY
Kerberos Principals
knox, HTTP
Filename (*.keytab)
hbase
Keytab File Owner:Group
knox:knox
File Permission (octal)
600

Apache Kudu

Role: kudu-KUDU_MASTER
Kerberos Principals
kudu
Filename (*.keytab)
kudu
Keytab File Owner:Group
kudu:kudu
File Permission (octal)
600
Role: kudu-KUDU_TSERVER
Kerberos Principals
kudu
Filename (*.keytab)
kudu
Keytab File Owner:Group
kudu:kudu
File Permission (octal)
600

Apache Livy

Role: livy-LIVY_SERVER
Kerberos Principals
livy
Filename (*.keytab)
livy
Keytab File Owner:Group
livy:livy
File Permission (octal)
600

Apache NiFi

Role: nifi
Kerberos Principals
nifi, HTTP
Filename (*.keytab)
nifi
Keytab File Owner:Group
nifi:nifi
File Permission (octal)
600

Apache NiFi Registry

Role: nifiregistry
Kerberos Principals
nifiregistry, HTTP
Filename (*.keytab)
nifiregistry
Keytab File Owner:Group
nifiregistry:nifiregistry
File Permission (octal)
600

Apache Oozie

Role: oozie-OOZIE_SERVER
Kerberos Principals
oozie, HTTP
Filename (*.keytab)
oozie
Keytab File Owner:Group
oozie:oozie
File Permission (octal)
600

Apache Ozone

Role: ozone-OZONE_MANAGER
Kerberos Principals
om, HTTP
Filename (*.keytab)
ozone
Keytab File Owner:Group
hdfs:hdfs
File Permission (octal)
600
Role: ozone-STORAGE_CONTAINER_MANAGER
Kerberos Principals
scm, HTTP
Filename (*.keytab)
ozone
Keytab File Owner:Group
hdfs:hdfs
File Permission (octal)
600
Role: ozone-OZONE_DATANODE
Kerberos Principals
dn, HTTP
Filename (*.keytab)
ozone
Keytab File Owner:Group
hdfs:hdfs
File Permission (octal)
600
Role: ozone-OZONE_RECON
Kerberos Principals
recon, HTTP
Filename (*.keytab)
ozone
Keytab File Owner:Group
hdfs:hdfs
File Permission (octal)
600
Role: ozone-S3_GATEWAY
Kerberos Principals
HTTP
Filename (*.keytab)
ozone
Keytab File Owner:Group
hdfs:hdfs
File Permission (octal)
600

Apache Phoenix

Role: phoenix-PHOENIX_QUERY_SERVER
Kerberos Principals
phoenix, HTTP
Filename (*.keytab)
phoenix
Keytab File Owner:Group
phoenix:phoenix
File Permission (octal)
600

Apache Ranger

Role: ranger-RANGER_ADMIN
Kerberos Principals
rangeradmin, rangerlookup, HTTP
Filename (*.keytab)
ranger
Keytab File Owner:Group
ranger:ranger
File Permission (octal)
600
Role: ranger-RANGER_USERSYNC
Kerberos Principals
rangerusersync
Filename (*.keytab)
ranger
Keytab File Owner:Group
ranger:ranger
File Permission (octal)
600
Role: ranger-RANGER_TAGSYNC
Kerberos Principals
rangertagsync
Filename (*.keytab)
ranger
Keytab File Owner:Group
ranger:ranger
File Permission (octal)
600

Apache Ranger KMS

Role: ranger-RANGER_TAGSYNC
Kerberos Principals
rangerkms, HTTP
Filename (*.keytab)
ranger_kms
Keytab File Owner:Group
kms:kms
File Permission (octal)
600

Apache Ranger Raz

Role: ranger-RANGER_RAZ
Kerberos Principals
rangerraz, HTTP
Filename (*.keytab)
rangerraz
Keytab File Owner:Group
ranger:rangerraz
File Permission (octal)
600

Apache Ranger RMS

Role: ranger-RANGER_RMS
Kerberos Principals
rangerrms
Filename (*.keytab)
rangerrms
Keytab File Owner:Group
ranger:rangerrms
File Permission (octal)
600

Apache Solr

Role: solr-SOLR_SERVER
Kerberos Principals
solr, HTTP
Filename (*.keytab)
solr
Keytab File Owner:Group
solr:solr
File Permission (octal)
600

Apache Spark

Role: spark_on_yarn-SPARK_YARN_HISTORY_SERVER
Kerberos Principals
spark
Filename (*.keytab)
spark
Keytab File Owner:Group
spark:spark
File Permission (octal)
600

Apache YARN

Role: yarn-NODEMANAGER
Kerberos Principals
yarn, HTTP
Filename (*.keytab)
yarn
Keytab File Owner:Group
yarn:hadoop
File Permission (octal)
644
Role: yarn-RESOURCEMANAGER
Kerberos Principals
yarn, HTTP
Filename (*.keytab)
yarn
Keytab File Owner:Group
yarn:hadoop
File Permission (octal)
600
Role: yarn-JOBHISTORY
Kerberos Principals
mapred
Filename (*.keytab)
mapred
Keytab File Owner:Group
yarn:hadoop
File Permission (octal)
600

Apache Zeppelin

Role: zeppelin-ZEPPELIN_SERVER
Kerberos Principals
zeppelin, HTTP
Filename (*.keytab)
zeppelin
Keytab File Owner:Group
zeppelin:zeppelin
File Permission (octal)
600

Apache ZooKeeper

Role: zookeeper-server
Kerberos Principals
zookeeper
Filename (*.keytab)
zookeeper
Keytab File Owner:Group
zookeeper:zookeeper
File Permission (octal)
600

Cloudera Management

Role: cloudera-mgmt-REPORTSMANAGER
Kerberos Principals
hdfs
Filename (*.keytab)
headlamp
Keytab File Owner:Group
cloudera-scm:cloudera-scm
File Permission (octal)
600
Role: cloudera-mgmt-SERVICEMONITOR
Kerberos Principals
hue
Filename (*.keytab)
cmon
Keytab File Owner:Group
cloudera-scm:cloudera-scm
File Permission (octal)
600
Role: cloudera-mgmt-ACTIVITYMONITOR
Kerberos Principals
hue
Filename (*.keytab)
cmon
Keytab File Owner:Group
cloudera-scm:cloudera-scm
File Permission (octal)
600

Cloudera Manager

Kerberos Principals
cloudera-scm, HTTP
Filename (*.keytab)
cmf, scm
Keytab File Owner:Group
cloudera-scm:cloudera-scm
File Permission (octal)
600

CruiseControl

Role: cruise_control-CRUISE_CONTROL_SERVER
Kerberos Principals
cruisecontrol, kafka, HTTP
Filename (*.keytab)
cruise_control
Keytab File Owner:Group
cruisecontrol:hadoop
File Permission (octal)
600

HttpFS

Role: hdfs-HTTPFS
Kerberos Principals
httpfs, HTTP
Filename (*.keytab)
httpfs
Keytab File Owner:Group
httpfs:httpfs
File Permission (octal)
600

Hue

Role: hue-KT_RENEWER
Kerberos Principals
hue
Filename (*.keytab)
hue
Keytab File Owner:Group
hue:hue
File Permission (octal)
600

Schema Registry

Role: schemaregistry-SCHEMA_REGISTRY_SERVER
Kerberos Principals
schemaregistry, HTTP
Filename (*.keytab)
schemaregistry
Keytab File Owner:Group
schemaregistry:hadoop
File Permission (octal)
600

Streams Messaging Manager

Role: streams_messaging_manager-STREAMS_MESSAGING_MANAGER_SERVER
Kerberos Principals
streamsmsgmgr, HTTP
Filename (*.keytab)
streams_messaging_manager
Keytab File Owner:Group
streamsmsgmgr:streamsmsgmgr
File Permission (octal)
600

Streams Replication Manager

Role: streams_replication_manager-STREAMS_REPLICATION_MANAGER_DRIVER
Kerberos Principals
streamsrepmgr
Filename (*.keytab)
streams_replication_manager
Keytab File Owner:Group
streamsrepmgr:streamsrepmgr
File Permission (octal)
600
Role: streams_replication_manager-STREAMS_REPLICATION_MANAGER_SERVICE
Kerberos Principals
streamsrepmgr
Filename (*.keytab)
streams_replication_manager
Keytab File Owner:Group
streamsrepmgr:streamsrepmgr
File Permission (octal)
600

Create Service Principals and Keytab Files for HDP

This section is optional. During the HDP 2.6.5 to HDP intermediate bits upgrade, Ambari can generate the principals and keytabs. However, before upgrading, you can manually generate the principals and keytabs. First, create the principal using mandatory naming conventions and then create the keytab file with the principal's information. Lastly, copy the keytab file to the keytab directory on the appropriate service host.

To create a service principal, use the kadmin utility. The kadmin utility is a command-line driven utility where you can run Kerberos commands to manipulate the central database. To start kadmin, run the following commands:
  1. 'kadmin $USER/admin@REALM'
  2. kadmin: addprinc -randkey $principal_name/$service-host-FQDN@$hadoop.realm
In the example mentioned in step 2, each service principal's name is appended with a fully qualified domain name of the host on which the principal is running. This is to provide a unique principal name for services that run on multiple hosts, like DataNodes and TaskTrackers. The addition of the hostname serves to distinguish, for example, a request from DataNode A from a request from DataNode B. This is important for two reasons:
  • If the Kerberos credentials for one DataNode are compromised, it does not automatically compromise all other DataNodes.
  • If multiple DataNodes have the same principal and are simultaneously connecting to the NameNode, and if the Kerberos authenticator sent has the same timestamp, then the authentication is rejected as a replay request.

If you are configuring High Availability (HA) for a Quorum-based NameNode, you must also generate a principle (jn/$FQDN) and keytab (jn.service.keytab) for each JournalNode. JournalNode also requires the keytab for its HTTP service. If the JournalNode is deployed on the same host as a NameNode, the same keytab file (spnego.service.keytab) can be used for both. In addition, HA requires two NameNodes. Both the active and standby NameNodes require their own principal and keytab files. The service principles of the two NameNodes can share the same name, specified with the dfs.namenode.kerberos.principal property in hdfs-site.xml, but the NameNodes still have different fully qualified domain names.

Service Component/Role Principal Name Mandatory Keytab Filename
HDFS NameNode nn/$FQDN nn.service.keytab
SecondaryNameNode nn/$FQDN nn.service.keytab
DataNode dn/$FQDN dn.service.keytab
Journal Server* jn/$FQDN jn.service.keytab
NameNode HTTP HTTP/$FQDN spengo.service.keytab
SecondaryNameNode HTTP HTTP/$FQDN spnego.service.keytab
MapReduce MR2 History Server jhs/$FQDN nm.service.keytab
MR2 History Server HTTP HTTP/$FQDN spnego.service.keytab
YARN Node Manager nm/$FQDN nm.service.keytab
Resource Manager rm/$FQDN rm.service.keytab
YARN Timeline Server yarn-ats/$FQDN yarn-ats.service.keytab
HTTP HTTP/$FQDN spnego.service.keytab
Oozie Oozie Server oozie/$FQDN oozie..service.keytab
Oozie HTTP HTTP/$FQDN spnego.service.keytab
Hive HiverServer2, HMS hive/$FQDN hive.service.keytab
Hive HTTP HTTP/$FQDN spnego.service.keytab
HBase HBase Master Server hbase/$FQDN hbase.service.keytab
HBase RegionServer hbase/$FQDN hbase.service.keytab
Kafka Kafka Broker kafka/$FQDN kafka.service.keytab
Zeppelin Zeppelin Server zeppelin/$FQDN zeppelin.service.keytab
Zookeeper zookeeper/$FQDN zk.service.keytab
Knox knox/$FQDN knox.service.keytab
Ranger Admin Server rangeradmin/$FQDN rangeradmin.service.keytab
Lookup Server rangerlookup/$FQDN rangerlookup.service.keytab
KMS rangerkms/$FQDN rangerkms.service.keytab
UserSync rangerusersync/$FQDN rangerusersync.service.keytab
TagSync rangertagsync/$FQDN rangertagsync.service.keytab
AMS amshbase/$FQDN ams-hbase.master.keytab
amsmon/$FQDN ams.collector.keytab
amszk/$FQDN ams-zk.service.keytab
Spark2 spark/$FQDN spark.service.keytab
Druid druid/$FQDN druid.service.keytab
Infra-Solr infra-solr/$FQDN ambari-infra-solr.service.keytab
Atlas atlas/$FQDN atlas.service.keytab
Livy livy/$FQDN livy.service.keytab

* Only required if you are setting up NameNode HA. For example, to create the principal for a DataNode service, run the command kadmin: addprinc -randkey dn/$datanode-host@$hadoop.realm

3. Extract the related keytab file and place it in the keytab directory of the respective components. The default directory is /etc/krb5.keytab.
  • kadmin: xst -k $keytab_file_name $principal_name/fully.qualified.domain.name

    You must use the mandatory names for the $keytab_file_name variable shown in the table above. For example, to create the keytab files for the NameNode, run the command kadmin: xst -k nn.service.keytab nn/$namenode-host kadmin: xst -k spnego.service.keytab HTTP/$namenode-host

    After creating the keytab files, copy the keytab files to the keytab directory of the respective service hosts.

4. On each service in your cluster, verify that the correct keytab files and principals are associated with the correct service using the klist command. For example, on the NameNode, run the command klist –k -t /etc/security/nn.service.keytab