Step 1: Install Cloudera Manager and CDH

Cloudera strongly recommends that you install and configure the Cloudera Manager Server and Cloudera Manager Agents and CDH to set up a fully-functional CDH cluster before trying to configure Kerberos authentication for the cluster.

Required user:group Settings for Security

When you install the CDH packages and the Cloudera Manager Agents on your cluster hosts, Cloudera Manager takes some steps to provide system security such as creating the OS user:group accounts and setting directory permissions as shown in the table below. These user accounts and directory permissions work with the Hadoop Kerberos security requirements.
This User Runs These Roles
hdfs NameNode, DataNodes, and Secondary NameNode
mapred JobTracker and TaskTrackers (MR1) and Job History Server (YARN)
yarn ResourceManager and NodeManagers (YARN)
oozie Oozie Server
hue Hue Server, Beeswax Server, Authorization Manager, and Job Designer
The hdfs user has HDFS superuser privileges.

When you install the Cloudera Manager Server on the server host, a new Unix user account called cloudera-scm is created automatically to support security. The Cloudera Manager Server uses this account to create host principals and deploy the keytabs on your cluster.

Depending on whether you installed CDH and Cloudera Manager at the same time or not, use one of the following sections for information on configuring directory ownerships on cluster hosts:

New Installation, Cloudera Manager and CDH Together

Installing a new Cloudera Manager cluster with CDH components at the same time can save you some of the user:group configuration required if you install them separately. The installation process creates the necessary user accounts on the Linux host system for the service daemons. At the end of the installation process when each cluster node starts up, the Cloudera Manager Agent process on the host automatically configures the directory ownership as shown in the table below, and the Hadoop daemons can then automatically set permissions for their respective directories. Do not change the directory owners on the cluster. They must be configured exactly as shown below.

Directory Specified in this Property Owner
dfs.name.dir hdfs:hadoop
dfs.data.dir hdfs:hadoop
mapred.local.dir mapred:hadoop
mapred.system.dir in HDFS mapred:hadoop
yarn.nodemanager.local-dirs yarn:yarn
yarn.nodemanager.log-dirs yarn:yarn
oozie.service.StoreService.jdbc.url (if using Derby) oozie:oozie
[[database]] name hue:hue
javax.jdo.option.ConnectionURL hue:hue

Existing CDH Cluster Installed Before Cloudera Manager Installation

If you have been using HDFS and running MapReduce jobs in an existing installation of CDH before Cloudera Manager was installed, you must manually configure the directory ownership as shown in the table above to enable the Hadoop daemons to set appropriate permissions on each directory. Configure directory user:group ownership exactly as shown in the table.