Hadoop Users in CDH 5

Installing and using CDH and Cloudera Manager creates a number of special users by default. The tables below list these users and groups as of the latest CDH 5.1.x release, along with the corresponding Kerberos principals and keytab files that should be created when you configure Kerberos security on your cluster.

Table 1. CDH 5 Users & Groups


Each entry gives the project name, then the Unix User ID, Primary Group, and Group Members (if any) on one line, followed by notes.


Apache Avro


No special users.

Apache Flume

flume flume

The sink that writes to HDFS runs as this user, so this user must have write privileges on the target directories.

Apache HBase

hbase hbase

The Master and the RegionServer processes run as this user.


HDFS

hdfs hdfs impala

The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it.

Apache Hive

hive hive impala

The HiveServer2 process and the Hive Metastore processes run as this user.

Hive also needs a database user for access to its Metastore DB (for example, MySQL or PostgreSQL). This user can be any identifier and does not correspond to a Unix uid. It is set by javax.jdo.option.ConnectionUserName in hive-site.xml.
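As a rough illustration of where this setting lives, the following Python sketch reads javax.jdo.option.ConnectionUserName from the text of a hive-site.xml file. The sample XML and the value hiveuser are invented for the example; only the property name comes from the note above.

```python
import xml.etree.ElementTree as ET

# Hypothetical hive-site.xml content; "hiveuser" is an invented database user.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
</configuration>
"""

PROPERTY_NAME = "javax.jdo.option.ConnectionUserName"


def metastore_db_user(hive_site_xml):
    """Return the Metastore DB user name, or None if the property is absent."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == PROPERTY_NAME:
            return prop.findtext("value")
    return None


print(metastore_db_user(SAMPLE_HIVE_SITE))
```

The returned name is a database account, so it does not need to match any of the Unix users listed in this table.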

Apache HCatalog

hive hive

The WebHCat service (for REST access to Hive functionality) runs as the hive user. It is not configurable.


HttpFS

httpfs httpfs

The HttpFS service runs as this user.

See HttpFS Security Configuration for instructions on how to generate the merged httpfs-http.keytab file.


Hue

hue hue

Hue runs as this user. It is not configurable.

Cloudera Impala

impala impala

An interactive query tool.


Llama

llama llama

Apache Mahout


No special users.


MapReduce

mapred mapred

Without Kerberos, the JobTracker and tasks run as this user. With Kerberos, the LinuxTaskController binary is owned by this user. Using a different user ID would be complicated.

Apache Oozie

oozie oozie  

The Oozie service runs as this user.



No special users.

Apache Pig


No special users.

Cloudera Search

solr solr

The Solr process runs as this user. It is not configurable.

Apache Spark

spark spark

The Spark process runs as this user. It is not configurable.

Apache Sentry (incubating)

sentry sentry  

The Sentry service runs as this user.

Apache Sqoop

sqoop sqoop

This user is only for the Sqoop1 Metastore, a configuration option that is not recommended.

Apache Sqoop2

sqoop2 sqoop

The Sqoop2 service runs as this user.

Apache Whirr


No special users.


YARN

yarn yarn

Without Kerberos, all YARN services and applications run as this user. With Kerberos, the LinuxContainerExecutor binary is owned by this user. Using a different user ID would be complicated.

Apache ZooKeeper

zookeeper zookeeper

The ZooKeeper process runs as this user. It is not configurable.

Other

hadoop (group only; members: yarn, hdfs, mapred)

This is a group with no associated Unix user ID or keytab.
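To check a host against the users and primary groups listed in Table 1, something like the following Python sketch could be used. It is a minimal example rather than a Cloudera-provided tool: the function name account_problems is made up here, the account list is copied from the entries above, and the lookups rely on the standard pwd and grp modules (Unix only).

```python
import grp
import pwd

# (user, primary group) pairs copied from Table 1; trim to the services you run.
EXPECTED_ACCOUNTS = [
    ("flume", "flume"), ("hbase", "hbase"), ("hdfs", "hdfs"),
    ("hive", "hive"), ("httpfs", "httpfs"), ("hue", "hue"),
    ("impala", "impala"), ("llama", "llama"), ("mapred", "mapred"),
    ("oozie", "oozie"), ("solr", "solr"), ("spark", "spark"),
    ("sentry", "sentry"), ("sqoop", "sqoop"), ("sqoop2", "sqoop"),
    ("yarn", "yarn"), ("zookeeper", "zookeeper"),
]


def account_problems(accounts, getpwnam=pwd.getpwnam, getgrgid=grp.getgrgid):
    """Return messages for missing users or unexpected primary groups."""
    problems = []
    for user, group in accounts:
        try:
            entry = getpwnam(user)
        except KeyError:
            problems.append(f"user {user} does not exist")
            continue
        try:
            actual = getgrgid(entry.pw_gid).gr_name
        except KeyError:
            actual = None  # the primary GID has no matching group entry
        if actual != group:
            problems.append(
                f"user {user} has primary group {actual}, expected {group}"
            )
    return problems


if __name__ == "__main__":
    for message in account_problems(EXPECTED_ACCOUNTS):
        print(message)
```

The lookup functions are parameters so the check can be exercised without real system accounts. Group memberships, such as impala belonging to the hdfs and hive groups, are not checked in this sketch.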


Kerberos principal names should have the format username/, where username is the name of an existing UNIX account, such as hdfs or mapred. The table below lists the usernames to use for the Kerberos principal names. For example, the Kerberos principal for Apache Flume would be flume/
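The format shown above is cut off after the slash in this text. The sketch below therefore assumes the conventional Kerberos service principal layout, primary/instance@REALM, with the UNIX username as the primary. The function name, the host node1.example.com, and the realm EXAMPLE.COM are hypothetical and serve only as illustration.

```python
def service_principal(username, host, realm):
    """Build a principal in the assumed form username/host@REALM."""
    if not username or "/" in username or "@" in username:
        raise ValueError("username must be a plain UNIX account name")
    return f"{username}/{host}@{realm}"


# Hypothetical host and realm, not taken from the original text.
print(service_principal("flume", "node1.example.com", "EXAMPLE.COM"))
```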

Table 2. CDH 5 Keytabs and Keytab File Permissions
Project (UNIX ID) | Service | Kerberos Principal Primary | Filename (.keytab) | Keytab File Owner | Keytab File Group | File Permission (octal)
Flume (flume) | flume-AGENT | flume | flume | flume | flume | 600
HBase (hbase) | hbase-REGIONSERVER | hbase | hbase | hbase | hbase | 600
HDFS (hdfs) | hdfs-NAMENODE | hdfs | hdfs (Secondary: Merge hdfs and HTTP) | hdfs | hdfs | 600
Hive (hive) | hive-HIVESERVER2 | hive | hive | hive | hive | 600
Hive (hive) | hive-HIVEMETASTORE | hive | hive | | |
HttpFS (httpfs) | hdfs-HTTPFS | httpfs | httpfs | httpfs | httpfs | 600
Hue (hue) | hue-KT_RENEWER | hue | hue | hue | hue | 600
Impala (impala) | impala-STATESTORE | impala | impala | impala | impala | 600
Llama (llama) | impala-LLAMA | llama | llama (Secondary: Merge llama and HTTP) | llama | llama | 600
MapReduce (mapred) | mapreduce-JOBTRACKER | mapred | mapred (Secondary: Merge mapred and HTTP) | mapred | hadoop | 600
MapReduce (mapred) | mapreduce-TASKTRACKER | | | | |
Oozie (oozie) | oozie-OOZIE_SERVER | oozie | oozie (Secondary: Merge oozie and HTTP) | oozie | oozie | 600
Search (solr) | solr-SOLR_SERVER | solr | solr (Secondary: Merge solr and HTTP) | solr | solr | 600
Sentry (sentry) | sentry-SENTRY_SERVER | sentry | sentry | sentry | sentry | 600
Spark (spark) | spark_on_yarn-SPARK_YARN_HISTORY_SERVER | spark | spark | spark | spark | 600
Sqoop (sqoop) | | | | | |
Sqoop2 (sqoop2) | | | | | |
YARN (yarn) | yarn-NODEMANAGER | yarn | yarn (Secondary: Merge yarn and HTTP) | yarn | hadoop | 644
ZooKeeper (zookeeper) | zookeeper-server | zookeeper | zookeeper | zookeeper | zookeeper | 600
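The owner, group, and permission columns of Table 2 can be checked on a deployed keytab with a short script. The following Python sketch is one possible approach under stated assumptions: the function name keytab_problems is invented here, the path to each keytab depends on your deployment and is not given in this document, and the expected values must be taken from the matching table row.

```python
import grp
import os
import pwd
import stat


def keytab_problems(path, owner=None, group=None, mode=0o600):
    """Compare a keytab's owner, group, and permission bits to expected values.

    Pass owner=None or group=None to skip that check.
    """
    info = os.stat(path)
    problems = []

    actual_mode = stat.S_IMODE(info.st_mode)
    if actual_mode != mode:
        problems.append(f"mode is {actual_mode:o}, expected {mode:o}")

    if owner is not None:
        try:
            actual_owner = pwd.getpwuid(info.st_uid).pw_name
        except KeyError:
            actual_owner = str(info.st_uid)  # UID without a passwd entry
        if actual_owner != owner:
            problems.append(f"owner is {actual_owner}, expected {owner}")

    if group is not None:
        try:
            actual_group = grp.getgrgid(info.st_gid).gr_name
        except KeyError:
            actual_group = str(info.st_gid)  # GID without a group entry
        if actual_group != group:
            problems.append(f"group is {actual_group}, expected {group}")

    return problems
```

For the YARN row, for example, the call would be keytab_problems(path, "yarn", "hadoop", 0o644), where path is wherever the yarn keytab was placed on that host. An empty list means the file matches the table.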
