Hadoop Users (user:group) and Kerberos Principals
During the Cloudera Manager/CDH installation process, several Linux user accounts and groups are created by default. These are listed in the table below. Integrating the cluster to use Kerberos for authentication requires creating Kerberos principals and keytabs for these user accounts.
Component (Version) | Unix User ID | Groups | Functionality |
---|---|---|---|
Cloudera Manager (all versions) | cloudera-scm | cloudera-scm | Clusters managed by Cloudera Manager run Cloudera Manager Server, monitoring roles, and other Cloudera Server processes as cloudera-scm. Requires a keytab file named cmf.keytab because the name is hard-coded in Cloudera Manager. |
Apache Accumulo | accumulo | accumulo | Accumulo processes run as this user. |
Apache Flume | flume | flume | The sink that writes to HDFS runs as this user, so this user must have write privileges. |
Apache HBase | hbase | hbase | The Master and the RegionServer processes run as this user. |
HDFS | hdfs | hdfs, hadoop | The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it. |
Apache Hive | hive | hive | The HiveServer2 process and the Hive Metastore processes run as this user. A user must be defined for Hive access to its Metastore DB (for example, MySQL or Postgres) but it can be any identifier and does not correspond to a Unix uid. This is javax.jdo.option.ConnectionUserName in hive-site.xml. |
Apache HCatalog | hive | hive | The WebHCat service (for REST access to Hive functionality) runs as the hive user. |
HttpFS | httpfs | httpfs | The HttpFS service runs as this user. See HttpFS Security Configuration for instructions on how to generate the merged httpfs-http.keytab file. |
Hue | hue | hue | Hue services run as this user. |
Hue Load Balancer | apache | apache | The Hue Load Balancer depends on the apache2 package, which uses the apache user name. Cloudera Manager does not run processes using this user ID. |
Impala | impala | impala, hive | Impala services run as this user. |
Apache Kafka | kafka | kafka | Kafka brokers and mirror makers run as this user. |
Java KeyStore KMS | kms | kms | The Java KeyStore KMS service runs as this user. |
Key Trustee KMS | kms | kms | The Key Trustee KMS service runs as this user. |
Key Trustee Server | keytrustee | keytrustee | The Key Trustee Server service runs as this user. |
Kudu | kudu | kudu | Kudu services run as this user. |
MapReduce | mapred | mapred, hadoop | Without Kerberos, the JobTracker and tasks run as this user. The LinuxTaskController binary is owned by this user for Kerberos. |
Apache Oozie | oozie | oozie | The Oozie service runs as this user. |
Parquet | ~ | ~ | No special users. |
Apache Pig | ~ | ~ | No special users. |
Cloudera Search | solr | solr | The Solr processes run as this user. |
Apache Spark | spark | spark | The Spark History Server process runs as this user. |
Apache Sentry | sentry | sentry | The Sentry service runs as this user. |
Apache Sqoop | sqoop | sqoop | This user is only for the Sqoop1 Metastore, a configuration option that is not recommended. |
YARN | yarn | yarn, hadoop | Without Kerberos, all YARN services and applications run as this user. The LinuxContainerExecutor binary is owned by this user for Kerberos. |
Apache ZooKeeper | zookeeper | zookeeper | The ZooKeeper processes run as this user. It is not configurable. |
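The group memberships in the table above can be spot-checked on a host. The following is a minimal sketch, not a Cloudera tool: the `EXPECTED_GROUPS` mapping copies only the multi-group rows from the table, and the helper names `actual_groups` and `missing_groups` are hypothetical.

```python
import grp
import os
import pwd

# Multi-group accounts copied from the table above (illustrative subset).
EXPECTED_GROUPS = {
    "hdfs": {"hdfs", "hadoop"},
    "mapred": {"mapred", "hadoop"},
    "yarn": {"yarn", "hadoop"},
    "impala": {"impala", "hive"},
}


def actual_groups(user):
    """Return the group names of a local account, or None if it does not exist."""
    try:
        entry = pwd.getpwnam(user)
    except KeyError:
        return None
    names = set()
    for gid in os.getgrouplist(user, entry.pw_gid):
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A gid without a group entry has no name to compare against.
            continue
    return names


def missing_groups(expected, lookup=actual_groups):
    """Map each user to the expected groups it lacks (all of them if the user is absent)."""
    report = {}
    for user, groups in expected.items():
        found = lookup(user) or set()
        lacking = groups - found
        if lacking:
            report[user] = lacking
    return report


if __name__ == "__main__":
    for user, lacking in sorted(missing_groups(EXPECTED_GROUPS).items()):
        print(f"{user}: missing {', '.join(sorted(lacking))}")
```

The `lookup` parameter exists so the comparison can be exercised without touching the local account database.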
Keytabs and Keytab File Permissions
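Linux user names map to the username portion of the Kerberos principal name, which takes the following form:
username/host.example.com@EXAMPLE.COM
For example, the Kerberos principal for Apache Flume would be: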
flume/host.example.com@EXAMPLE.COM
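The principal format can be expressed as a small helper. This is an illustrative sketch only; `format_principal`, `parse_principal`, and the `Principal` tuple are hypothetical names, and the host and realm are the placeholder values used above.

```python
from typing import NamedTuple


class Principal(NamedTuple):
    username: str
    host: str
    realm: str


def format_principal(username: str, host: str, realm: str) -> str:
    """Build a service principal of the form username/host@REALM."""
    return f"{username}/{host}@{realm}"


def parse_principal(principal: str) -> Principal:
    """Split a username/host@REALM principal into its three parts."""
    name_and_host, _, realm = principal.rpartition("@")
    username, _, host = name_and_host.partition("/")
    if not (username and host and realm):
        raise ValueError(f"not a username/host@REALM principal: {principal!r}")
    return Principal(username, host, realm)


print(format_principal("flume", "host.example.com", "EXAMPLE.COM"))
```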
Keytabs that contain multiple principals are merged automatically from individual keytabs by Cloudera Manager. If you do not use Cloudera Manager, you must merge the keytabs manually.
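For a manual merge, one common approach is to drive the MIT Kerberos `ktutil` utility with its `rkt` (read keytab) and `wkt` (write keytab) commands. The sketch below only assembles the command script and, optionally, pipes it to `ktutil`. The function names and the input file names are assumptions for illustration; `httpfs-http.keytab` is the merged file name mentioned in the HttpFS row above.

```python
import subprocess
from typing import Sequence


def ktutil_merge_script(sources: Sequence[str], merged: str) -> str:
    """Return ktutil commands that read each source keytab and write one merged file."""
    lines = [f"rkt {path}" for path in sources]
    lines.append(f"wkt {merged}")
    lines.append("quit")
    return "\n".join(lines) + "\n"


def merge_keytabs(sources: Sequence[str], merged: str) -> None:
    """Run ktutil with the generated script on stdin (requires ktutil on PATH)."""
    subprocess.run(
        ["ktutil"],
        input=ktutil_merge_script(sources, merged),
        text=True,
        check=True,
    )


# Hypothetical input names; print the script instead of running ktutil.
print(ktutil_merge_script(["httpfs.keytab", "http.keytab"], "httpfs-http.keytab"), end="")
```

Printing the script first makes it easy to review the commands before calling `merge_keytabs`.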
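The table below lists the usernames to use for Kerberos principal names, along with keytab file names, ownership, and permissions, for clusters managed by Cloudera Manager.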
Component (Unix User ID) | Service | Kerberos Principals | Filename (*.keytab) | Keytab File Owner | Keytab File Group | File Permission (octal) |
---|---|---|---|---|---|---|
Cloudera Manager (cloudera-scm) | N/A | cloudera-scm | cmf | cloudera-scm | cloudera-scm | 600 |
Cloudera Management Service (cloudera-scm) | cloudera-mgmt-REPORTSMANAGER | hdfs | headlamp | cloudera-scm | cloudera-scm | 600 |
Cloudera Management Service (cloudera-scm) | cloudera-mgmt-SERVICEMONITOR, cloudera-mgmt-ACTIVITYMONITOR | hue | cmon | cloudera-scm | cloudera-scm | 600 |
Cloudera Management Service (cloudera-scm) | cloudera-mgmt- HOSTMONITOR | N/A | N/A | N/A | N/A | N/A |
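Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_MASTER | accumulo | accumulo16 | cloudera-scm | cloudera-scm | 600 |
Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_TRACER | accumulo | accumulo16 | cloudera-scm | cloudera-scm | 600 |
Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_MONITOR | accumulo | accumulo16 | cloudera-scm | cloudera-scm | 600 |
Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_GC | accumulo | accumulo16 | cloudera-scm | cloudera-scm | 600 |
Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_TSERVER | accumulo | accumulo16 | cloudera-scm | cloudera-scm | 600 |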
Flume (flume) | flume-AGENT | flume | flume | cloudera-scm | cloudera-scm | 600 |
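HBase (hbase) | hbase-HBASETHRIFTSERVER | HTTP | HTTP | cloudera-scm | cloudera-scm | 600 |
HBase (hbase) | hbase-REGIONSERVER | hbase | hbase | cloudera-scm | cloudera-scm | 600 |
HBase (hbase) | hbase-HBASERESTSERVER | hbase | hbase | cloudera-scm | cloudera-scm | 600 |
HBase (hbase) | hbase-MASTER | hbase | hbase | cloudera-scm | cloudera-scm | 600 |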
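HDFS (hdfs) | hdfs-NAMENODE | hdfs, HTTP | hdfs | cloudera-scm | cloudera-scm | 600 |
HDFS (hdfs) | hdfs-DATANODE | hdfs, HTTP | hdfs | cloudera-scm | cloudera-scm | 600 |
HDFS (hdfs) | hdfs-SECONDARYNAMENODE | hdfs, HTTP | hdfs | cloudera-scm | cloudera-scm | 600 |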
Hive (hive) | hive-HIVESERVER2 | hive | hive | cloudera-scm | cloudera-scm | 600 |
Hive (hive) | hive-WEBHCAT | HTTP | HTTP | cloudera-scm | cloudera-scm | 600 |
Hive (hive) | hive-HIVEMETASTORE | hive | hive | cloudera-scm | cloudera-scm | 600 |
HttpFS (httpfs) | hdfs-HTTPFS | httpfs | httpfs | cloudera-scm | cloudera-scm | 600 |
Hue (hue) | hue-KT_RENEWER | hue | hue | cloudera-scm | cloudera-scm | 600 |
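Impala (impala) | impala-STATESTORE | impala | impala | cloudera-scm | cloudera-scm | 600 |
Impala (impala) | impala-CATALOGSERVER | impala | impala | cloudera-scm | cloudera-scm | 600 |
Impala (impala) | impala-IMPALAD | impala | impala | cloudera-scm | cloudera-scm | 600 |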
Java KeyStore KMS (kms) | kms-KMS | HTTP | kms | cloudera-scm | cloudera-scm | 600 |
Apache Kafka (kafka) | kafka-KAFKA_BROKER | kafka | kafka | kafka | kafka | 600 |
Apache Kafka (kafka) | kafka-KAFKA_MIRROR_MAKER | kafka_mirror_maker | kafka | kafka | kafka | 600 |
Key Trustee KMS (kms) | keytrustee-KMS_KEYTRUSTEE | HTTP | keytrustee | cloudera-scm | cloudera-scm | 600 |
MapReduce (mapred) | mapreduce-JOBTRACKER | mapred, HTTP | mapred | cloudera-scm | cloudera-scm | 600 |
MapReduce (mapred) | mapreduce-TASKTRACKER | mapred, HTTP | mapred | cloudera-scm | cloudera-scm | 600 |
Apache Kudu (kudu) | kudu-KUDU_MASTER | kudu | kudu | kudu | kudu | 600 |
Apache Kudu (kudu) | kudu-KUDU_TSERVER | kudu | kudu | kudu | kudu | 600 |
Oozie (oozie) | oozie-OOZIE_SERVER | oozie, HTTP | oozie | cloudera-scm | cloudera-scm | 600 |
Search (solr) | solr-SOLR_SERVER | solr, HTTP | solr | cloudera-scm | cloudera-scm | 600 |
Sentry (sentry) | sentry-SENTRY_SERVER | sentry | sentry | cloudera-scm | cloudera-scm | 600 |
Spark (spark) | spark_on_yarn-SPARK_YARN_HISTORY_SERVER | spark | spark | cloudera-scm | cloudera-scm | 600 |
YARN (yarn) | yarn-NODEMANAGER | yarn, HTTP | yarn | cloudera-scm | cloudera-scm | 644 |
YARN (yarn) | yarn-RESOURCEMANAGER | yarn, HTTP | yarn | cloudera-scm | cloudera-scm | 600 |
YARN (yarn) | yarn-JOBHISTORY | mapred | mapred | cloudera-scm | cloudera-scm | 600 |
ZooKeeper (zookeeper) | zookeeper-server | zookeeper | zookeeper | cloudera-scm | cloudera-scm | 600 |
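The following table lists the same information for CDH clusters that are not managed by Cloudera Manager.
Component (Unix User ID) | Service | Kerberos Principals | Filename (*.keytab) | Keytab File Owner | Keytab File Group | File Permission (octal) |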
---|---|---|---|---|---|---|
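Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_MASTER | accumulo | accumulo16 | accumulo | accumulo | 600 |
Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_TRACER | accumulo | accumulo16 | accumulo | accumulo | 600 |
Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_MONITOR | accumulo | accumulo16 | accumulo | accumulo | 600 |
Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_GC | accumulo | accumulo16 | accumulo | accumulo | 600 |
Apache Accumulo (accumulo) | accumulo16-ACCUMULO16_TSERVER | accumulo | accumulo16 | accumulo | accumulo | 600 |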
Flume (flume) | flume-AGENT | flume | flume | flume | flume | 600 |
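HBase (hbase) | hbase-HBASETHRIFTSERVER | HTTP | HTTP | hbase | hbase | 600 |
HBase (hbase) | hbase-REGIONSERVER | hbase | hbase | hbase | hbase | 600 |
HBase (hbase) | hbase-HBASERESTSERVER | hbase | hbase | hbase | hbase | 600 |
HBase (hbase) | hbase-MASTER | hbase | hbase | hbase | hbase | 600 |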
HDFS (hdfs) | hdfs-NAMENODE | hdfs, HTTP | hdfs | hdfs | hdfs | 600 |
HDFS (hdfs) | hdfs-DATANODE | hdfs, HTTP | hdfs | hdfs | hdfs | 600 |
HDFS (hdfs) | hdfs-SECONDARYNAMENODE | hdfs, HTTP | hdfs | hdfs | hdfs | 600 |
Hive (hive) | hive-HIVESERVER2 | hive | hive | hive | hive | 600 |
Hive (hive) | hive-WEBHCAT | HTTP | HTTP | hive | hive | 600 |
Hive (hive) | hive-HIVEMETASTORE | hive | hive | hive | hive | 600 |
HttpFS (httpfs) | hdfs-HTTPFS | httpfs | httpfs | httpfs | httpfs | 600 |
Hue (hue) | hue-KT_RENEWER | hue | hue | hue | hue | 600 |
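Impala (impala) | impala-STATESTORE | impala | impala | impala | impala | 600 |
Impala (impala) | impala-CATALOGSERVER | impala | impala | impala | impala | 600 |
Impala (impala) | impala-IMPALAD | impala | impala | impala | impala | 600 |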
Java KeyStore KMS (kms) | kms-KMS | HTTP | kms | kms | kms | 600 |
Apache Kafka (kafka) | kafka-KAFKA_BROKER | kafka | kafka | kafka | kafka | 600 |
Apache Kafka (kafka) | kafka-MIRROR_MAKER | kafka_mirror_maker | kafka | kafka | kafka | 600 |
Key Trustee KMS (kms) | kms-KEYTRUSTEE | HTTP | kms | kms | kms | 600 |
MapReduce (mapred) | mapreduce-JOBTRACKER | mapred, HTTP | mapred | mapred | hadoop | 600 |
MapReduce (mapred) | mapreduce-TASKTRACKER | mapred, HTTP | mapred | mapred | hadoop | 600 |
Apache Kudu (kudu) | kudu-KUDU_MASTER | kudu | kudu | kudu | kudu | 600 |
Apache Kudu (kudu) | kudu-KUDU_TSERVER | kudu | kudu | kudu | kudu | 600 |
Oozie (oozie) | oozie-OOZIE_SERVER | oozie, HTTP | oozie | oozie | oozie | 600 |
Search (solr) | solr-SOLR_SERVER | solr, HTTP | solr | solr | solr | 600 |
Sentry (sentry) | sentry-SENTRY_SERVER | sentry | sentry | sentry | sentry | 600 |
Spark (spark) | spark_on_yarn-SPARK_YARN_HISTORY_SERVER | spark | spark | spark | spark | 600 |
YARN (yarn) | yarn-NODEMANAGER | yarn, HTTP | yarn | yarn | hadoop | 644 |
YARN (yarn) | yarn-RESOURCEMANAGER | yarn, HTTP | yarn | yarn | hadoop | 600 |
YARN (yarn) | yarn-JOBHISTORY | mapred | mapred | mapred | hadoop | 600 |
ZooKeeper (zookeeper) | zookeeper-server | zookeeper | zookeeper | zookeeper | zookeeper | 600 |
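The owner, group, and octal permission columns can be checked against a keytab file on disk. The following is a rough sketch under stated assumptions: `check_keytab`, `owner_name`, and `group_name` are hypothetical helpers, and the demonstration runs against a temporary file rather than a real keytab.

```python
import grp
import os
import pwd
import stat
import tempfile


def owner_name(uid: int) -> str:
    """Resolve a uid to a user name, falling back to the number as text."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def group_name(gid: int) -> str:
    """Resolve a gid to a group name, falling back to the number as text."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return str(gid)


def check_keytab(path: str, owner: str, group: str, mode: int = 0o600) -> list:
    """Return a list of mismatches; an empty list means the file matches."""
    st = os.stat(path)
    problems = []
    actual_mode = stat.S_IMODE(st.st_mode)  # permission bits only
    if actual_mode != mode:
        problems.append(f"mode is {actual_mode:o}, expected {mode:o}")
    if owner_name(st.st_uid) != owner:
        problems.append(f"owner is {owner_name(st.st_uid)}, expected {owner}")
    if group_name(st.st_gid) != group:
        problems.append(f"group is {group_name(st.st_gid)}, expected {group}")
    return problems


# Demonstration on a throwaway file that deliberately has mode 644.
with tempfile.NamedTemporaryFile(suffix=".keytab") as handle:
    os.chmod(handle.name, 0o644)
    info = os.stat(handle.name)
    print(check_keytab(handle.name, owner_name(info.st_uid), group_name(info.st_gid)))
```

On a real host the call would pass the values from the relevant table row, for example owner `hdfs`, group `hdfs`, and mode `0o600` for an HDFS keytab on a cluster not managed by Cloudera Manager.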