1.1. Category I - HDP Essential Components Properties

This category exposes the configuration properties for the essential HDP components (HDFS, MapReduce, HBase, HCatalog, Pig, Hive, ZooKeeper, and WebHCat). These properties are organized into the following groups:

 Generic Properties: The following table provides detailed information on the general cluster-related properties:

Table 5.1. Generic Properties
Property Name Notes Mandatory/Optional/Conditional
deployuser Responsible for executing the HDP Installer. This user must be created on all the nodes in your cluster. Required only for tarball based installations.
installdir Full path to the installation directory for HDP (Example: /hdp).
java64home Location of JAVA_HOME for 64-bit JDK v 1.6 update 31 in your environment. Mandatory
package RPM packages for Red Hat compatible systems. (Default: rpm) Mandatory
security If set to yes, installs a secure Hadoop cluster. (Default: no) (Allowed: yes/no) Conditional
sshkey Full path to the SSH key that allows passwordless SSH. Leave this field empty if passwordless SSH is already set up. Mandatory. A value is required when passwordless SSH is not set up. The SSH key must be a passwordless SSH key.
smoke_test_user User responsible for executing the smoke tests. (Default: hdptestuser) Mandatory. Ensure that this user is created on all nodes in your cluster with primary group hadoop.
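
The mandatory entries in Table 5.1 can be checked before the installer runs. The following is a minimal sketch, assuming the properties are kept as plain key=value lines; that layout, the helper names (parse_properties, missing_mandatory), and the sample java64home path are illustrative assumptions, not part of the HDP installer.

```python
# Hypothetical pre-flight check; the key=value file layout is an assumption.
# Mandatory names are taken from Table 5.1. sshkey may be empty, so only the
# presence of each key is checked, not its value.
MANDATORY_GENERIC = ("java64home", "package", "sshkey", "smoke_test_user")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_mandatory(props, required=MANDATORY_GENERIC):
    """Return the mandatory property names that are absent from props."""
    return [name for name in required if name not in props]


# Sample values are illustrative only.
sample = """
installdir=/hdp
java64home=/usr/jdk/jdk1.6.0_31
package=rpm
security=no
sshkey=
"""
props = parse_properties(sample)
print(missing_mandatory(props))  # ['smoke_test_user']
```

An empty sshkey passes this check because the table allows the field to be empty when passwordless SSH is already set up.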

 Hadoop Core Properties: The following table provides information on the properties required for core Hadoop components (HDFS and MapReduce):

Table 5.2. Hadoop Core Properties
Property Name Notes Mandatory/Optional/Conditional
enableappend Enable this property even if installhbase is set to no. (Default: true) (Allowed: true/false) Conditional. Required only when installhbase is set to yes.
enablewebhdfs Set this property to true only if security is set to yes. (Default: false) (Allowed: true/false) Conditional. Required only when security is set to yes.
taskscheduler Scheduler to be used for job scheduling. (Default: org.apache.hadoop.mapred.CapacityTaskScheduler) (Allowed: org.apache.hadoop.mapred.JobQueueTaskScheduler/org.apache.hadoop.mapred.CapacityTaskScheduler) Optional
enablelzo Enable LZO compression. (Default: no) (Allowed: yes/no) Optional. Required for compressing MapReduce jobs.
enableshortcircuit Enable short circuit read. (Default: true) (Allowed: true/false) Conditional. Required only when installhbase is set to yes.
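
The defaults and allowed values in Table 5.2 lend themselves to a small lookup table. The sketch below fills in defaults for unset properties and rejects values outside the allowed set. The function name resolve_core_properties and the dictionary layout are hypothetical; only the property names, defaults, and allowed values come from the table.

```python
# Defaults and allowed values copied from Table 5.2.
# The data layout and function name are illustrative assumptions.
YES_NO = {"yes", "no"}
TRUE_FALSE = {"true", "false"}
SCHEDULERS = {
    "org.apache.hadoop.mapred.JobQueueTaskScheduler",
    "org.apache.hadoop.mapred.CapacityTaskScheduler",
}

# name -> (default, allowed values)
CORE_PROPERTIES = {
    "enableappend": ("true", TRUE_FALSE),
    "enablewebhdfs": ("false", TRUE_FALSE),
    "taskscheduler": ("org.apache.hadoop.mapred.CapacityTaskScheduler", SCHEDULERS),
    "enablelzo": ("no", YES_NO),
    "enableshortcircuit": ("true", TRUE_FALSE),
}


def resolve_core_properties(overrides):
    """Apply defaults, then reject any value outside the allowed set."""
    resolved = {}
    for name, (default, allowed) in CORE_PROPERTIES.items():
        value = overrides.get(name, default)
        if value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {sorted(allowed)}")
        resolved[name] = value
    return resolved


resolved = resolve_core_properties({"enablelzo": "yes"})
print(resolved["enablelzo"], resolved["enablewebhdfs"])  # yes false
```

Note that enablelzo takes yes/no while the other switches take true/false, so a check like this catches the easy mistake of writing enablelzo=true.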

 Service User Properties: The following table lists the properties for service users:

Note: For information on other service users, see Hadoop Service Accounts.

Table 5.3. Service User Properties
Service User Name Notes Mandatory/Optional/Conditional
smoke_test_user User responsible for executing the smoke tests. (Default: hdptestuser:hadoop) Mandatory.

 Data and Log Directory Configurations: The following properties determine the default locations for the HDFS data directories and log directories for all the components in the HDP stack:

Note: It is strongly recommended that you assign separate disks for individual data directories.

Table 5.4. Data and Log Directory Configurations
Property Name Notes Mandatory/Optional/Conditional
datanode_dir Comma-separated list of full paths to the Hadoop DataNode directories. (Example: /hdp/1/hadoop/hdfs/data,/hdp/2/hadoop/hdfs/data) Mandatory
namenode_dir Comma-separated list of full paths to the Hadoop NameNode directories. (Example: /hdp/1/hadoop/hdfs/namenode,/hdp/2/hadoop/hdfs/namenode)
mapred_dir Comma-separated list of full paths to the Hadoop MapReduce directories. (Example: /hdp/1/hadoop/mapred,/hdp/2/hadoop/mapred)
log_dir Full path to Hadoop log directory. (Example: /var/log/hadoop)
pid_dir Full path to Hadoop PID directory. (Example: /var/run/hadoop)
hbase_log_dir Full path to HBase log directory. (Example: /var/log/hbase) Conditional. Required only if installhbase is set to true.
hbase_pid_dir Full path to HBase PID directory. (Example: /var/run/hbase)
hive_log_dir Full path to Hive log directory. (Example: /var/log/hive) Conditional. Required only if installhive is set to true.
zk_log_dir Full path to ZooKeeper log directory. (Example: /var/log/zookeeper) Conditional. Required only if installzookeeper is set to true.
zk_pid_dir Full path to ZooKeeper PID directory. (Example: /var/run/zookeeper)
zk_data_dir Full path to ZooKeeper data directory. (Example: /hdp/1/hadoop/zookeeper)
webhcat_log_dir Full path to log directory for WebHCat. (Example: /var/log/webhcat) Conditional. Required only if installwebhcat is set to true.
webhcat_pid_dir Full path to PID directory for WebHCat. (Example: /var/run/webhcat)
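
Several entries in Table 5.4 take a comma-separated list of full paths. As a rough illustration, the sketch below splits such a value and rejects relative or repeated paths. The function name split_dirs is hypothetical, and the checks are an assumption about what a sensible validation looks like, not a description of what the installer does.

```python
import posixpath


def split_dirs(value):
    """Split a comma-separated directory list and verify each entry.

    Raises ValueError for a relative path or a repeated path.
    """
    dirs = [part.strip() for part in value.split(",") if part.strip()]
    for path in dirs:
        if not posixpath.isabs(path):
            raise ValueError(f"not a full path: {path!r}")
    if len(set(dirs)) != len(dirs):
        raise ValueError("the same directory is listed more than once")
    return dirs


# Example value taken from the datanode_dir row of Table 5.4.
print(split_dirs("/hdp/1/hadoop/hdfs/data,/hdp/2/hadoop/hdfs/data"))
# ['/hdp/1/hadoop/hdfs/data', '/hdp/2/hadoop/hdfs/data']
```

This check does not verify that the directories sit on separate disks; that recommendation still has to be confirmed on each node.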

 HDP Stack Components Properties: All the properties listed below are Optional.

Table 5.5. HDP Stack Components Properties
Property Name Notes
installpig Enter yes to install Pig and no otherwise. (Default: yes) (Allowed: yes/no)
installhbase Enter yes to install HBase and no otherwise. (Default: yes) (Allowed: yes/no)
installhive Enter yes to install Hive and no otherwise. Ensure that you have installed a MySQL server instance. (Default: yes) (Allowed: yes/no)
installwebhcat Enter yes to install WebHCat and no otherwise. You must also set installhbase to yes. (Default: yes) (Allowed: yes/no)
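mysqldbhost Hostname of the MySQL server. Required only if installhive is set to yes.
databasename Name of the database on the MySQL server. Required only if installhive is set to yes.
mysqldbuser MySQL user name used to connect to the MySQL database specified in the databasename property. Ensure that this user has been granted ALL privileges on the database. Required only if installhive is set to yes.
mysqldbpasswd Password for the MySQL user specified in the mysqldbuser property. Required only if installhive is set to yes.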
tickTime ZooKeeper uses this time unit to regulate heartbeats and timeouts. For example, if tickTime is set to 2000, the minimum session timeout is two ticks (4000 milliseconds). (Default: 2000 milliseconds). Required only if installzookeeper is set to yes. Note that you must also set installhbase to yes.
initLimit Amount of time (in ticks) to allow followers to connect and sync to a leader. Increase this value only if ZooKeeper manages a large amount of data in your cluster. (Default: 10). Required only if installzookeeper is set to yes. Note that you must also set installhbase to yes.
syncLimit Amount of time (in ticks) to allow followers to sync with ZooKeeper. Followers that fall too far behind a leader are dropped. (Default: 5). Required only if installzookeeper is set to yes. Note that you must also set installhbase to yes.
clientPort Port on which ZooKeeper listens for client connections. (Default: 2181). Required only if installzookeeper is set to yes. Note that you must also set installhbase to yes.
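
Because initLimit and syncLimit are expressed in ticks, their effective durations depend on tickTime. The short sketch below converts the defaults from Table 5.5 into milliseconds; the function name zk_timeouts_ms and the result keys are illustrative.

```python
def zk_timeouts_ms(tick_time=2000, init_limit=10, sync_limit=5):
    """Convert tick-based ZooKeeper limits into milliseconds.

    Defaults are the values listed in Table 5.5.
    """
    return {
        "min_session_timeout_ms": 2 * tick_time,   # two ticks
        "init_timeout_ms": init_limit * tick_time,  # follower connect and sync
        "sync_timeout_ms": sync_limit * tick_time,  # follower sync
    }


print(zk_timeouts_ms())
# {'min_session_timeout_ms': 4000, 'init_timeout_ms': 20000, 'sync_timeout_ms': 10000}
```

With the defaults, a follower has 20 seconds to connect and sync to a leader, and 10 seconds to stay in sync, so raising tickTime lengthens both windows at once.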

 Secure Hadoop Deployment Properties: All the properties listed below are Mandatory.

Table 5.6. Secure Hadoop Deployment Properties
Property Name Notes
security Required to deploy a secure Hadoop cluster. (Default: no) (Allowed: yes/no)
kinitpath Full path to kinit executable. (Default: /usr/kerberos/bin/kinit) Ensure that you provide the correct value for this property.
keytabdir Full path to service keytab files. These files are stored for all services (NameNode, DataNode, JobTracker, TaskTracker, Hive Metastore, HBase Master, and RegionServer). (Default: /etc/security/keytabs)
realm Ensure that you replace the default value (EXAMPLE.COM) with the correct realm. Use the krb5.conf file in your Kerberos Key Distribution Center to determine the correct value for this property.
hdfs_user_keytab Full path to the keytab file for the hdfsuser service user. (Default: /homes/hdfs/hdfs.headless.keytab)
smoke_test_user_keytab Full path to the keytab file for smoke_test_user. (Default: /homes/$smoke_test_user/$smoke_test_user.headless.keytab)
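
The default for smoke_test_user_keytab refers to the value of smoke_test_user. As an illustration of how that placeholder resolves, the sketch below expands it with Python's string.Template. Whether the installer performs the substitution in exactly this way is an assumption, and expand_keytab_default is a hypothetical helper.

```python
from string import Template

# Default copied from Table 5.6; $smoke_test_user is the placeholder.
DEFAULT_SMOKE_KEYTAB = "/homes/$smoke_test_user/$smoke_test_user.headless.keytab"


def expand_keytab_default(smoke_test_user="hdptestuser"):
    """Substitute the smoke test user name into the default keytab path."""
    return Template(DEFAULT_SMOKE_KEYTAB).substitute(smoke_test_user=smoke_test_user)


print(expand_keytab_default())
# /homes/hdptestuser/hdptestuser.headless.keytab
```

If you change smoke_test_user from its default, the expected keytab location changes with it, so confirm that the keytab file exists at the resulting path.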

