This category exposes the configuration properties for the essential components (HDFS, MapReduce, HBase, HCatalog, Pig, Hive, ZooKeeper, and WebHCat) in HDP. These properties are organized into the following groups:
Generic Properties: The following table provides detailed information on the general cluster-related properties:
Property Name | Notes | Mandatory/Optional/Conditional |
deployuser | User responsible for executing the HDP Installer. This user must be created on all the nodes in your cluster. | Required only for tarball-based installations. |
installdir | Full path to the installation directory for HDP (Example: /hdp). | |
java64home | Location of JAVA_HOME for 64-bit JDK v 1.6 update 31 in your environment. | Mandatory |
package | RPM packages for Red Hat compatible systems. (Default: rpm) | Mandatory |
security | If set to yes, installs a secure Hadoop cluster. (Default: no) (Allowed: yes/no) | Conditional |
sshkey | Full path to the SSH key that allows passwordless SSH. Leave this field empty if passwordless SSH is already set up. | Mandatory. A value is required when passwordless SSH is not set up. The SSH key must be a passwordless SSH key. |
smoke_test_user | User responsible for executing the smoke tests. (Default: hdptestuser) | Mandatory. Ensure that this user is created on all nodes in your cluster with primary group hadoop. |
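The generic properties can be checked before running the installer. The following is a minimal sketch, assuming the properties are stored as `key=value` lines with `#` comments; that file format, the sample values, and the helper names `parse_properties` and `check_generic` are illustrative assumptions, not part of the HDP Installer. The defaults and mandatory flags are taken from the table above.

```python
# Defaults and mandatory flags as listed in the Generic Properties table.
DEFAULTS = {"package": "rpm", "security": "no", "smoke_test_user": "hdptestuser"}
MANDATORY = ("java64home", "package", "smoke_test_user")


def parse_properties(text):
    """Parse 'key=value' lines, skipping blanks and '#' comments (assumed format)."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_generic(props):
    """Return (settings with defaults applied, list of problems)."""
    # Empty values are treated as "not set" so that defaults can apply.
    merged = {**DEFAULTS, **{k: v for k, v in props.items() if v}}
    problems = [f"{name} is mandatory" for name in MANDATORY if not merged.get(name)]
    if merged["security"] not in ("yes", "no"):
        problems.append("security must be yes or no")
    return merged, problems


# Hypothetical sample input: java64home is left empty on purpose.
sample = """
# generic properties
installdir=/hdp
java64home=
sshkey=
"""
settings, problems = check_generic(parse_properties(sample))
print(problems)
```

An empty `sshkey` is not reported, because the table allows it to be empty when passwordless SSH is already set up.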
Hadoop Core Properties: The following table provides information on the properties required for core Hadoop components (HDFS and MapReduce):
Property Name | Notes | Mandatory/Optional/Conditional |
enableappend | Enable this property even if installhbase is set to no. (Default: true) (Allowed: true/false) | Conditional. Required only when installhbase is set to yes. |
enablewebhdfs | Set this property to true only if security is set to yes. (Default: false) (Allowed: true/false) | Conditional. Required only when security is set to yes. |
taskscheduler | Scheduler to be used for job scheduling. (Default: org.apache.hadoop.mapred.CapacityTaskScheduler) (Allowed: org.apache.hadoop.mapred.JobQueueTaskScheduler/org.apache.hadoop.mapred.CapacityTaskScheduler) | Optional |
enablelzo | Enable LZO compression. (Default: no) (Allowed: yes/no) | Optional. Required for compressing MapReduce jobs. |
enableshortcircuit | Enable short-circuit read. (Default: true) (Allowed: true/false) | Conditional. Required only when installhbase is set to yes. |
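The conditional rules in this table can be expressed as data. The sketch below assumes the settings are already available as a dictionary of strings; the function name `missing_conditionals` is hypothetical, and the rule pairs are read directly from the Mandatory/Optional/Conditional column above.

```python
# (property, controlling property) pairs: the property is required
# when the controlling property is set to "yes".
CONDITIONAL_ON_YES = [
    ("enableappend", "installhbase"),
    ("enablewebhdfs", "security"),
    ("enableshortcircuit", "installhbase"),
]


def missing_conditionals(props):
    """List conditional properties that are required but not set."""
    missing = []
    for name, switch in CONDITIONAL_ON_YES:
        if props.get(switch) == "yes" and name not in props:
            missing.append(name)
    return missing


# Hypothetical settings: HBase is installed, security is off.
example = {"installhbase": "yes", "security": "no", "enableappend": "true"}
print(missing_conditionals(example))
```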
Service User Properties: The following table lists the properties for service users:
Note | |
---|---|
For information on other service users, see Hadoop Service Accounts. |
Service User Name | Notes | Mandatory/Optional/Conditional |
smoke_test_user | User responsible for executing the smoke tests. (Default: hdptestuser:hadoop) | Mandatory. |
Data and Log Directory Configurations: The following properties determine the default locations for the HDFS data directories and log directories for all the components in the HDP stack:
Note | |
---|---|
It is strongly recommended that you assign separate disks for individual data directories. |
Property Name | Notes | Mandatory/Optional/Conditional |
datanode_dir | Comma-separated list of full paths to the Hadoop DataNode directories. (Example: /hdp/1/hadoop/hdfs/data,/hdp/2/hadoop/hdfs/data) | Mandatory |
namenode_dir | Comma-separated list of full paths to the Hadoop NameNode directories. (Example: /hdp/1/hadoop/hdfs/namenode,/hdp/2/hadoop/hdfs/namenode) | |
mapred_dir | Comma-separated list of full paths to the Hadoop MapReduce directories. (Example: /hdp/1/hadoop/mapred,/hdp/2/hadoop/mapred) | |
log_dir | Full path to Hadoop log directory. (Example: /var/log/hadoop) | |
pid_dir | Full path to Hadoop PID directory. (Example: /var/run/hadoop) | |
hbase_log_dir | Full path to HBase log directory. (Example: /var/log/hbase) | Conditional. Required only if installhbase is set to yes. |
hbase_pid_dir | Full path to HBase PID directory. (Example: /var/run/hbase) | |
hive_log_dir | Full path to Hive log directory. (Example: /var/log/hive) | Conditional. Required only if installhive is set to yes. |
zk_log_dir | Full path to ZooKeeper log directory. (Example: /var/log/zookeeper) | Conditional. Required only if installzookeeper is set to yes. |
zk_pid_dir | Full path to ZooKeeper PID directory. (Example: /var/run/zookeeper) | |
zk_data_dir | Full path to ZooKeeper data directory. (Example: /hdp/1/hadoop/zookeeper) | |
webhcat_log_dir | Full path to log directory for WebHCat. (Example: /var/log/webhcat) | Conditional. Required only if installwebhcat is set to yes. |
webhcat_pid_dir | Full path to PID directory for WebHCat. (Example: /var/run/webhcat) | |
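Directory properties such as datanode_dir take a comma-separated list of full paths. The following sketch splits such a value and flags entries that are not full paths or that appear twice. The helper names `split_dirs` and `check_dirs` are hypothetical, and the check is purely textual: it does not verify that the directories exist or that they sit on separate disks.

```python
import posixpath


def split_dirs(value):
    """Split a comma-separated directory list, ignoring empty entries."""
    return [part.strip() for part in value.split(",") if part.strip()]


def check_dirs(value):
    """Return a list of problems found in a comma-separated directory list."""
    problems = []
    seen = set()
    for path in split_dirs(value):
        if not posixpath.isabs(path):
            problems.append(f"not a full path: {path}")
        # Normalize so that '/hdp/2' and '/hdp/2/' count as the same directory.
        normalized = posixpath.normpath(path)
        if normalized in seen:
            problems.append(f"listed twice: {path}")
        seen.add(normalized)
    return problems


# Example value taken from the datanode_dir row above.
datanode_dir = "/hdp/1/hadoop/hdfs/data,/hdp/2/hadoop/hdfs/data"
print(split_dirs(datanode_dir))
print(check_dirs(datanode_dir))
```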
HDP Stack Components Properties: All the properties listed below are Optional.
Property Name | Notes |
installpig | Enter yes to install Pig and no otherwise. (Default: yes) (Allowed: yes/no) |
installhbase | Enter yes to install HBase and no otherwise. (Default: yes) (Allowed: yes/no) |
installhive | Enter yes to install Hive and no otherwise. Ensure that a MySQL server instance is installed. (Default: yes) (Allowed: yes/no) |
installwebhcat | Enter yes to install WebHCat and no otherwise. You must also set installhbase to yes. (Default: yes) (Allowed: yes/no) |
mysqldbhost | Hostname and database name for the MySQL server. Required only if installhive is set to yes. |
databasename | |
mysqldbuser | MySQL credentials to connect to the MySQL database specified in the databasename property. Ensure that this user has been granted ALL privileges on the database. Required only if installhive is set to yes. |
mysqldbpasswd | |
tickTime | Basic time unit that ZooKeeper uses to regulate heartbeats and timeouts. The minimum session timeout is two ticks; for example, if tickTime is set to 2000, the minimum session timeout is 4000 milliseconds. (Default: 2000 milliseconds). Required only if installzookeeper is set to yes. Note, you must also set installhbase to yes. |
initLimit | Amount of time (in ticks) to allow followers to connect and sync to a leader. Increase this value only if ZooKeeper manages a large amount of data in your cluster. (Default: 10). Required only if |
syncLimit | Amount of time (in ticks) to allow followers to sync with ZooKeeper. Followers that fall too far behind a leader are dropped. (Default: 5). Required only if installzookeeper is set to yes. Note, you must also set installhbase to yes. |
clientPort | Port on which ZooKeeper listens for client connections. (Default: 2181). Required only if installzookeeper is set to yes. Note, you must also set installhbase to yes. |
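Because initLimit and syncLimit are expressed in ticks, their effective durations depend on tickTime. The sketch below converts the three ZooKeeper settings to milliseconds using the defaults from the table; the function name `zookeeper_timings` and the dictionary keys are illustrative choices.

```python
def zookeeper_timings(tick_time_ms=2000, init_limit=10, sync_limit=5):
    """Convert tick-based ZooKeeper limits into milliseconds."""
    return {
        # The minimum session timeout is two ticks.
        "min_session_timeout_ms": 2 * tick_time_ms,
        # Time allowed for followers to connect and sync to a leader.
        "init_timeout_ms": init_limit * tick_time_ms,
        # Time allowed for followers to sync before they are dropped.
        "sync_timeout_ms": sync_limit * tick_time_ms,
    }


timings = zookeeper_timings()
print(timings)
```

With the default values, followers get 20 seconds to connect and sync to a leader and 10 seconds to stay in sync.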
Secure Hadoop Deployment Properties: All the properties listed below are Mandatory.
Property Name | Notes |
security | Required to deploy a secure Hadoop cluster. (Default: no) (Allowed: yes/no) |
kinitpath | Full path to kinit executable. (Default: /usr/kerberos/bin/kinit) Ensure that you provide the correct value for this property. |
keytabdir | Full path to service keytab files. These files are stored for all services (NameNode, DataNode, JobTracker, TaskTracker, Hive Metastore, HBase Master, and RegionServer). (Default: /etc/security/keytabs) |
realm | Ensure that you replace the default value (EXAMPLE.COM) with the correct realm. Use the krb5.conf file in your Kerberos Key Distribution Center to determine the correct value for this property. |
hdfs_user_keytab | Full path to the keytab file for hdfsuser service user. (Default: /homes/hdfs/hdfs.headless.keytab) |
smoke_test_user_keytab | Full path to the keytab file for smoke_test_user. (Default: /homes/$smoke_test_user/$smoke_test_user.headless.keytab) |
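The default for smoke_test_user_keytab is a pattern that depends on the value of smoke_test_user. The sketch below shows how that pattern resolves for a given user name, using Python's `string.Template` as a stand-in; how the HDP Installer itself performs the substitution is not stated here, and the function name `default_smoke_keytab` is hypothetical.

```python
from string import Template

# Default pattern for smoke_test_user_keytab, as listed in the table.
KEYTAB_DEFAULT = "/homes/$smoke_test_user/$smoke_test_user.headless.keytab"


def default_smoke_keytab(smoke_test_user="hdptestuser"):
    """Resolve the default keytab path for the given smoke test user."""
    return Template(KEYTAB_DEFAULT).substitute(smoke_test_user=smoke_test_user)


keytab_path = default_smoke_keytab()
print(keytab_path)
```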