Use the following instructions to manually configure the cluster properties file for deploying HDP from the command-line interface or in a script.

1. Create a file for the cluster properties, or use the sample clusterproperties.txt file extracted from the HDP installation zip file. You will pass the name of the cluster properties file to the msiexec call when you install HDP. The following examples use the file name clusterproperties.txt.

2. Add the properties to the clusterproperties.txt file as described in the table below. As you add properties, keep the following in mind:

- All properties in the cluster properties file must be separated by a newline character.
- Directory paths cannot contain whitespace characters. (For example, c:\Program Files\Hadoop is an invalid directory path for HDP.)
- Use fully qualified domain names (FQDNs) to specify the network host name of each cluster host. The FQDN is a DNS name that uniquely identifies the computer on the network. By default, it is a concatenation of the host name, the primary DNS suffix, and a period.
- When specifying host lists in the cluster properties file, if the hosts are multi-homed or have multiple NICs, make sure that each name or IP address is the preferred one by which the hosts can communicate among themselves. In other words, use the addresses internal to the cluster, not those used to reach cluster nodes from outside the cluster.
- To enable NameNode HA, you must include the HA properties and exclude the SECONDARY_NAMENODE_HOST definition.
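For reference, an installation call that passes the cluster properties file might look like the following sketch. The MSI file name, log file name, and HDP_DIR path are placeholders, and the exact msiexec parameters depend on your HDP version:

```
:: Hypothetical example: substitute the actual MSI name for your HDP release
msiexec /qn /i "hdp-2.x.x.winpkg.msi" /lv "hdp.log" ^
  HDP_LAYOUT="D:\config\clusterproperties.txt" ^
  HDP_DIR="D:\hdp\hadoop" DESTROY_DATA="no"
```

HDP_LAYOUT must be the full path to the cluster properties file you create in the steps below.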
Table 2.11. Configuration Values for Deploying HDP
| Configuration Property Name | Description | Example Value | Mandatory/Optional |
|---|---|---|---|
| HDP_LOG_DIR | HDP's operational logs are written to this directory on each cluster host. Ensure that you have sufficient disk space for storing these log files. | d:\hadoop\logs | Mandatory |
| HDP_DATA_DIR | HDP data is stored in this directory on each cluster node. You can specify multiple comma-separated locations for multiple data directories. | d:\hdp\data | Mandatory |
| HDFS_NAMENODE_DATA_DIR | Determines where on the local file system the HDFS NameNode stores the name table (fsimage). You can specify multiple comma-separated locations for multiple data directories. | d:\hadoop\data\hdfs\nn,c:\hdpdata,d:\hdpdatann | Mandatory |
| HDFS_DATANODE_DATA_DIR | Determines where on the local file system an HDFS DataNode stores its blocks. You can specify multiple comma-separated locations for multiple data directories. | d:\hadoop\data\hdfs\dn,c:\hdpdata,d:\hdpdatadn | Mandatory |
| NAMENODE_HOST | The FQDN of the cluster node that runs the NameNode master service. | NAMENODE-MASTER.acme.com | Mandatory |
| SECONDARY_NAMENODE_HOST | The FQDN of the cluster node that runs the Secondary NameNode master service. | SECONDARY-NN-MASTER.acme.com | Mandatory when HA is not enabled |
| RESOURCEMANAGER_HOST | The FQDN of the cluster node that runs the YARN ResourceManager master service. | RESOURCE-MANAGER.acme.com | Mandatory |
| HIVE_SERVER_HOST | The FQDN of the cluster node that runs the Hive Server master service. | HIVE-SERVER-MASTER.acme.com | Mandatory |
| OOZIE_SERVER_HOST | The FQDN of the cluster node that runs the Oozie Server master service. | OOZIE-SERVER-MASTER.acme.com | Mandatory |
| WEBHCAT_HOST | The FQDN of the cluster node that runs the WebHCat master service. | WEBHCAT-MASTER.acme.com | Mandatory |
| FLUME_HOSTS | A comma-separated list of FQDNs of the cluster nodes that run the Flume service. | FLUME-SERVICE1.acme.com, FLUME-SERVICE2.acme.com, FLUME-SERVICE3.acme.com | Mandatory |
| HBASE_MASTER | The FQDN of the cluster node that runs the HBase master. | HBASE-MASTER.acme.com | Mandatory |
| HBASE_REGIONSERVERS | A comma-separated list of FQDNs of the cluster nodes that run the HBase RegionServer services. | slave1.acme.com, slave2.acme.com, slave3.acme.com | Mandatory |
| SLAVE_HOSTS | A comma-separated list of FQDNs of the cluster nodes that run the DataNode and TaskTracker services. | slave1.acme.com, slave2.acme.com, slave3.acme.com | Mandatory |
| ZOOKEEPER_HOSTS | A comma-separated list of FQDNs of the cluster nodes that run the ZooKeeper service. | ZOOKEEPER-HOST.acme.com | Optional |
| FALCON_HOST | A comma-separated list of FQDNs of the cluster nodes that run the Falcon service. | falcon.acme.com, falcon1.acme.com, falcon2.acme.com | Optional |
| KNOX_HOST | The FQDN of the Knox Gateway host. | KNOX-HOST.acme.com | Optional |
| STORM_SUPERVISORS | A comma-separated list of FQDNs of the cluster nodes that run the Storm Supervisor service. | supervisor.acme.com, supervisor1.acme.com, supervisor2.acme.com | Optional |
| STORM_NIMBUS | The FQDN of the Storm Nimbus server. | STORM-HOST.acme.com | Optional |
| DB_FLAVOR | Database type for the Hive and Oozie metastores (SQL Server and Derby are supported). To use the default embedded Derby instance, set this property to derby. To use an existing SQL Server instance as the metastore database, set it to mssql. | mssql or derby | Mandatory |
| DB_PORT | Port number; required only if you are using SQL Server for the Hive and Oozie metastores. | 1433 (default) | Optional |
| DB_HOSTNAME | The FQDN of the node where the metastore database service is installed. If using SQL Server, set this to your SQL Server host name. If using Derby for the Hive metastore, set this to the value of HIVE_SERVER_HOST. | sqlserver1.acme.com | Mandatory |
| HIVE_DB_NAME | Database for the Hive metastore. If using SQL Server, ensure that you create the database on the SQL Server instance. | hivedb | Mandatory |
| HIVE_DB_USERNAME | User name for the Hive metastore database instance. Ensure that this user account has appropriate permissions. | hive_user | Mandatory |
| HIVE_DB_PASSWORD | Password for the Hive metastore database user. Ensure that this user account has appropriate permissions. | hive_pass | Mandatory |
| OOZIE_DB_NAME | Database for the Oozie metastore. If using SQL Server, ensure that you create the database on the SQL Server instance. | ooziedb | Mandatory |
| OOZIE_DB_USERNAME | User name for the Oozie metastore database instance. Ensure that this user account has appropriate permissions. | oozie_user | Mandatory |
| OOZIE_DB_PASSWORD | Password for the Oozie metastore database user. Ensure that this user account has appropriate permissions. | oozie_pass | Mandatory |
| DEFAULT_FS | Default file system. | HDFS | |
| RESOURCEMANAGER_HOST | Host used for the ResourceManager. | | |
| IS_TEZ | Installs the Tez component on the Hive host. | YES or NO | Optional |
| ENABLE_LZO | Enables the LZO codec for compression in HBase cells. | YES or NO | Optional |
| IS_PHOENIX | Installs Phoenix on the HBase hosts. | YES or NO | Optional |
| IS_HDFS_HA | Specifies whether to enable High Availability for HDFS. | YES or NO | Mandatory |
| HIVE_DR | Indicates whether to install HiveDR. | YES or NO | Optional |
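Putting the mandatory properties together, a minimal cluster properties file using the example values from Table 2.11 and the embedded Derby metastore might look like the following sketch (one property per line; all host names are placeholders for your environment):

```
HDP_LOG_DIR=d:\hadoop\logs
HDP_DATA_DIR=d:\hdp\data
HDFS_NAMENODE_DATA_DIR=d:\hadoop\data\hdfs\nn
HDFS_DATANODE_DATA_DIR=d:\hadoop\data\hdfs\dn
NAMENODE_HOST=NAMENODE-MASTER.acme.com
SECONDARY_NAMENODE_HOST=SECONDARY-NN-MASTER.acme.com
RESOURCEMANAGER_HOST=RESOURCE-MANAGER.acme.com
HIVE_SERVER_HOST=HIVE-SERVER-MASTER.acme.com
OOZIE_SERVER_HOST=OOZIE-SERVER-MASTER.acme.com
WEBHCAT_HOST=WEBHCAT-MASTER.acme.com
FLUME_HOSTS=FLUME-SERVICE1.acme.com
HBASE_MASTER=HBASE-MASTER.acme.com
HBASE_REGIONSERVERS=slave1.acme.com,slave2.acme.com,slave3.acme.com
SLAVE_HOSTS=slave1.acme.com,slave2.acme.com,slave3.acme.com
DB_FLAVOR=derby
DB_HOSTNAME=HIVE-SERVER-MASTER.acme.com
HIVE_DB_NAME=hivedb
HIVE_DB_USERNAME=hive_user
HIVE_DB_PASSWORD=hive_pass
OOZIE_DB_NAME=ooziedb
OOZIE_DB_USERNAME=oozie_user
OOZIE_DB_PASSWORD=oozie_pass
IS_HDFS_HA=no
```

Because DB_FLAVOR is derby here, DB_HOSTNAME is set to the HIVE_SERVER_HOST value, as described in the table above.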
Configuration Values: High Availability
To ensure that a multi-node cluster remains available, configure and enable High Availability. Configuring High Availability includes defining the locations and names of the hosts in the cluster that can act as JournalNodes and as a standby NameNode in the event that the primary NameNode fails. To configure High Availability, add the following properties to your cluster properties file and set their values as follows:
Note: To enable High Availability, you must also run several HA-specific commands when you start cluster services.
Table 2.12. High Availability configuration information
| Configuration Property Name | Description | Example Value | Mandatory/Optional |
|---|---|---|---|
| HA | Whether to deploy a highly available NameNode. | yes or no | Optional |
| NN_HA_JOURNALNODE_HOSTS | A comma-separated list of FQDNs of the cluster nodes that run the JournalNode processes. | journalnode1.acme.com, journalnode2.acme.com, journalnode3.acme.com | Optional |
| NN_HA_CLUSTER_NAME | This name is used for both the configuration and the authority component of absolute HDFS paths in the cluster. | hdp2-ha | Optional |
| NN_HA_JOURNALNODE_EDITS_DIR | The absolute path on the JournalNode machines where the edits and other local state used by the JournalNodes (JNs) are stored. Only a single path can be used for this configuration. | d:\hadoop\journal | Optional |
| NN_HA_STANDBY_NAMENODE_HOST | The host of the standby NameNode. | STANDBY_NAMENODE.acme.com | Optional |
| RM_HA_CLUSTER_NAME | A logical name for the ResourceManager cluster. | HA Resource Manager | Optional |
| RM_HA_STANDBY_RESOURCEMANAGER_HOST | The FQDN of the standby ResourceManager host. | rm-standby-host.acme.com | Optional |
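Combining these properties, an HA-enabled fragment of the cluster properties file might look like the following sketch (host names and the ResourceManager cluster name are placeholders; remember that SECONDARY_NAMENODE_HOST must be excluded when HA is enabled):

```
HA=yes
NN_HA_CLUSTER_NAME=hdp2-ha
NN_HA_JOURNALNODE_HOSTS=journalnode1.acme.com,journalnode2.acme.com,journalnode3.acme.com
NN_HA_JOURNALNODE_EDITS_DIR=d:\hadoop\journal
NN_HA_STANDBY_NAMENODE_HOST=STANDBY_NAMENODE.acme.com
RM_HA_CLUSTER_NAME=rm-ha-cluster
RM_HA_STANDBY_RESOURCEMANAGER_HOST=rm-standby-host.acme.com
```

Three JournalNode hosts are listed because a quorum of JournalNodes requires an odd number of nodes, three being the usual minimum.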
Configuration Values: Ranger
Note: "Mandatory" means that the property must be specified if Ranger is enabled.
Table 2.13. Ranger configuration information
| Configuration Property Name | Description | Example Value | Mandatory/Optional/Conditional |
|---|---|---|---|
| RANGER_HOST | Host name of the node where the Ranger-Admin and Ranger-Usersync services are installed. | WIN-Q0E0PEACTR | Mandatory |
| RANGER_ADMIN_DB_HOST | MySQL server instance used for the Ranger-Admin database. (MySQL should be up and running at installation time.) | localhost | Mandatory |
| RANGER_ADMIN_DB_PORT | Port number of the Ranger-Admin database server. | 3306 | Mandatory |
| RANGER_ADMIN_DB_ROOT_PASSWORD | Database root password (required for policy/audit database creation). | adm2 | Mandatory |
| RANGER_ADMIN_DB_DBNAME | Ranger-Admin policy database name. | ranger (default) | Mandatory |
| RANGER_ADMIN_DB_USERNAME | Ranger-Admin policy database user name. | rangeradmin (default) | Mandatory |
| RANGER_ADMIN_DB_PASSWORD | Password for the RANGER_ADMIN_DB_USERNAME user. | RangerAdminPassW0Rd | Mandatory |
| RANGER_AUDIT_DB_HOST | Host of the Ranger audit database. (MySQL should be up and running at installation time.) This can be the same as RANGER_ADMIN_DB_HOST, or you can specify a different server. | localhost | Mandatory |
| RANGER_AUDIT_DB_PORT | Port number where Ranger-Admin runs the audit service. | 3306 | Mandatory |
| RANGER_AUDIT_DB_ROOT_PASSWORD | Database root password for the audit database server (required for audit database creation). | RangerAuditPassW0Rd | Mandatory |
| RANGER_EXTERNAL_URL | URL used for Ranger. | localhost:8080 | Optional |
| RANGER_AUDIT_DB_DBNAME | Ranger audit database name. This can be a different database on the same database server mentioned above. | ranger_audit (default) | Mandatory |
| RANGER_AUDIT_DB_USERNAME | Database user that performs all audit logging operations from the Ranger plugins. | rangerlogger (default) | Mandatory |
| RANGER_AUDIT_DB_PASSWORD | Password for the RANGER_AUDIT_DB_USERNAME user. | RangerAuditPassW0Rd | Mandatory |
| RANGER_AUTHENTICATION_METHOD | Authentication method used to log in to the Policy Admin Tool. | None (default): allows only users created within the Policy Admin Tool. LDAP: authenticates users against corporate LDAP. AD: authenticates users against Active Directory. | Mandatory |
| RANGER_LDAP_URL | URL of the LDAP service. | ldap://71.127.43.33:386 | Mandatory if authentication method is LDAP |
| RANGER_LDAP_USERDNPATTERN | LDAP DN pattern used to uniquely locate the login user. | uid={0},ou=users,dc=ranger2,dc=net | Mandatory if authentication method is LDAP |
| RANGER_LDAP_GROUPSEARCHBASE | Defines the part of the LDAP directory tree under which group searches are performed. | ou=groups,dc=ranger2,dc=net | Mandatory if authentication method is LDAP |
| RANGER_LDAP_GROUPSEARCHFILTER | LDAP search filter used to retrieve groups for the login user. | (member=uid={0},ou=users,dc=ranger2,dc=net) | Mandatory if authentication method is LDAP |
| RANGER_LDAP_GROUPROLEATTRIBUTE | The attribute of the group entry that contains the name of the authority; used to retrieve group names from the group search filter. | cn | Mandatory if authentication method is LDAP |
| RANGER_LDAP_AD_DOMAIN | Active Directory domain name used for AD login. | rangerad.net | Mandatory if authentication method is Active Directory |
| RANGER_LDAP_AD_URL | Active Directory LDAP URL for user authentication. | ldap://ad.rangerad.net:389 | Mandatory if authentication method is Active Directory |
| RANGER_POLICY_ADMIN_URL | URL used within the Policy Admin Tool when a link to its own page is generated on the Policy Admin Tool website. | localhost:6080 | Optional |
| RANGER_HDFS_REPO | The repository name used in the Policy Admin Tool for defining HDFS policies. | hadoopdev | Mandatory if using Ranger on HDFS |
| RANGER_HIVE_REPO | The repository name used in the Policy Admin Tool for defining Hive policies. | hivedev | Mandatory if using Ranger on Hive |
| RANGER_HBASE_REPO | The repository name used in the Policy Admin Tool for defining HBase policies. | hbasedev | Mandatory if using Ranger on HBase |
| RANGER_KNOX_REPO | The repository name used in the Policy Admin Tool for defining Knox policies. | knoxdev | Mandatory if using Ranger on Knox |
| RANGER_STORM_REPO | The repository name used in the Policy Admin Tool for defining Storm policies. | stormdev | Mandatory if using Ranger on Storm |
| RANGER_SYNC_INTERVAL | The interval (in minutes) between synchronization cycles. Note: the second sync cycle does not start until the first sync cycle is complete. | 5 | Mandatory |
| RANGER_SYNC_LDAP_URL | LDAP URL for synchronizing users. | ldap://ldap.example.com:389 | Mandatory |
| RANGER_SYNC_LDAP_BIND_DN | LDAP bind DN used to connect to LDAP and query for users and groups. This must be a user with administrator privileges to search the directory for users and groups. | cn=admin,ou=users,dc=hadoop,dc=apache,dc=org | Mandatory |
| RANGER_SYNC_LDAP_BIND_PASSWORD | Password for the LDAP bind DN. | LdapAdminPassW0Rd | Mandatory |
| RANGER_SYNC_LDAP_USER_SEARCH_SCOPE | Scope of the user search. | base, one, and sub are supported values | Mandatory |
| RANGER_SYNC_LDAP_USER_OBJECT_CLASS | Object class used to identify user entries. | person (default) | Mandatory |
| RANGER_SYNC_LDAP_USER_NAME_ATTRIBUTE | Attribute of the user entry that is treated as the user name. | cn (default) | Mandatory |
| RANGER_SYNC_LDAP_USER_GROUP_NAME_ATTRIBUTE | Attribute of the user entry whose values are treated as group values to be pushed into the Policy Manager database. | One or more attribute names separated by commas, such as memberof,ismemberof | Mandatory |
| RANGER_SYNC_LDAP_USERNAME_CASE_CONVERSION | Converts all user names to lowercase or uppercase. | none: no conversion; keep as-is in the sync source. lower (default): convert to lowercase when saving user names to the Ranger database. upper: convert to uppercase when saving user names to the Ranger database. | Mandatory |
| RANGER_SYNC_LDAP_GROUPNAME_CASE_CONVERSION | Converts all group names to lowercase or uppercase. | (same values as the user name case conversion property) | Mandatory |
| RANGER_SYNC_LDAP_USER_SEARCH_BASE | Search base for users. | ou=users,dc=hadoop,dc=apache,dc=org | Mandatory |
| AUTHSERVICEHOSTNAME | Server name (or IP address) where the Ranger-Usersync module runs (along with the Unix Authentication Service). | localhost (default) | Mandatory |
| AUTHSERVICEPORT | Port number on which the Ranger-Usersync module runs the Unix Authentication Service. | 5151 (default) | Mandatory |
| POLICYMGR_HTTP_ENABLED | Flag to enable or disable the HTTP protocol for downloading policies by the Ranger plugin modules. | true (default) | Mandatory |
| REMOTELOGINENABLED | Flag to enable or disable remote login via Unix Authentication Mode. | true (default) | Mandatory |
| SYNCSOURCE | Specifies the source from which user/group information is extracted into the Ranger database. | LDAP | |
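As a worked example, the fragment below sketches enabling Ranger with LDAP authentication using representative values from Table 2.13. Host names, LDAP URLs, DNs, and passwords are placeholders for your environment, and only a subset of the properties is shown:

```
RANGER_HOST=ranger-host.acme.com
RANGER_ADMIN_DB_HOST=localhost
RANGER_ADMIN_DB_PORT=3306
RANGER_ADMIN_DB_ROOT_PASSWORD=adm2
RANGER_ADMIN_DB_DBNAME=ranger
RANGER_ADMIN_DB_USERNAME=rangeradmin
RANGER_ADMIN_DB_PASSWORD=RangerAdminPassW0Rd
RANGER_AUDIT_DB_HOST=localhost
RANGER_AUDIT_DB_PORT=3306
RANGER_AUDIT_DB_DBNAME=ranger_audit
RANGER_AUDIT_DB_USERNAME=rangerlogger
RANGER_AUDIT_DB_PASSWORD=RangerAuditPassW0Rd
RANGER_AUTHENTICATION_METHOD=LDAP
RANGER_LDAP_URL=ldap://ldap.example.com:389
RANGER_LDAP_USERDNPATTERN=uid={0},ou=users,dc=example,dc=com
RANGER_SYNC_INTERVAL=5
RANGER_SYNC_LDAP_URL=ldap://ldap.example.com:389
RANGER_SYNC_LDAP_BIND_DN=cn=admin,ou=users,dc=example,dc=com
RANGER_SYNC_LDAP_BIND_PASSWORD=LdapAdminPassW0Rd
RANGER_SYNC_LDAP_USER_SEARCH_BASE=ou=users,dc=example,dc=com
RANGER_SYNC_LDAP_USER_SEARCH_SCOPE=sub
```

Because RANGER_AUTHENTICATION_METHOD is LDAP here, the LDAP-conditional properties from the table become mandatory, while the RANGER_LDAP_AD_* properties can be omitted.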