Hortonworks Data Platform

Installing HDP Manually

2015-04-13


Contents

1. Getting Ready to Install
1. Meet Minimum System Requirements
1.1. Hardware recommendations
1.2. Operating System Requirements
1.3. Software Requirements
1.4. JDK Requirements
1.5. Metastore Database Requirements
2. Virtualization and Cloud Platforms
3. Configure the Remote Repositories
4. Decide on Deployment Type
5. Collect Information
6. Prepare the Environment
6.1. Enable NTP on the Cluster
6.2. Check DNS
6.3. Disable SELinux
6.4. Disable IPTables
7. Download Companion Files
8. Define Environment Parameters
9. [Optional] Create System Users and Groups
10. Determine HDP Memory Configuration Settings
10.1. Running the HDP Utility Script
10.2. Manually Calculating YARN and MapReduce Memory Configuration Settings
11. Configuring NameNode Heap Size
12. Allocate Adequate Log Space for HDP
2. Installing HDFS and YARN
1. Set Default File and Directory Permissions
2. Install the Hadoop Packages
3. Install Compression Libraries
3.1. Install Snappy
3.2. Install LZO
4. Create Directories
4.1. Create the NameNode Directories
4.2. Create the SecondaryNameNode Directories
4.3. Create DataNode and YARN NodeManager Local Directories
4.4. Create the Log and PID Directories
4.5. Symlink Directories with hdp-select
3. Installing Apache ZooKeeper
1. Install the ZooKeeper Package
2. Securing ZooKeeper with Kerberos (optional)
3. Set Directories and Permissions
4. Set Up the Configuration Files
5. Start ZooKeeper
4. Setting Up the Hadoop Configuration
5. Validating the Core Hadoop Installation
1. Format and Start HDFS
2. Smoke Test HDFS
3. Configure YARN and MapReduce
4. Start YARN
5. Start MapReduce JobHistory Server
6. Smoke Test MapReduce
6. Installing Apache HBase
1. Install the HBase RPMs
2. Set Directories and Permissions
3. Set Up the Configuration Files
4. Validate the Installation
5. Starting the HBase Thrift and REST APIs
7. Installing Apache Phoenix
1. Configuring HBase for Phoenix
2. Configuring Phoenix to Run in a Secure Cluster
3. Smoke Testing Phoenix
4. Troubleshooting Phoenix
8. Installing and Configuring Apache Tez
1. Prerequisites
2. Install the Tez RPM
3. Configure Tez
4. Validate the Tez Installation
5. Troubleshooting
9. Installing Apache Hive and Apache HCatalog
1. Installing the Hive-HCatalog RPM
2. Setting Directories and Permissions
3. Setting Up the Hive/HCatalog Configuration Files
3.1. HDP-Utility Script
3.2. Configure Hive and HiveServer2 for Tez
4. Setting Up RDBMS for Use with Hive Metastore
5. Creating Directories on HDFS
6. Validating the Installation
7. Enabling Tez for Hive Queries
8. Disabling Tez for Hive Queries
9. Configuring Tez with the Capacity Scheduler
10. Validating Hive-on-Tez Installation
10. Installing Apache Pig
1. Install the Pig RPMs
2. Set Up Configuration Files
3. Validate the Installation
11. Installing Apache WebHCat
1. Install the WebHCat RPMs
2. Upload the Pig, Hive and Sqoop tarballs to HDFS
3. Set Directories and Permissions
4. Modify WebHCat Configuration Files
5. Set Up HDFS User and Prepare WebHCat Directories
6. Validate the Installation
12. Installing Apache Oozie
1. Install the Oozie RPMs
2. Set Directories and Permissions
3. Set Up the Oozie Configuration Files
3.1. For Derby:
3.2. For MySQL:
3.3. For PostgreSQL:
3.4. For Oracle:
4. Configure Your Database for Oozie
5. Validate the Installation
13. Installing Apache Ranger
1. Installation Prerequisites
2. Manual Installation
3. Installing Policy Manager
3.1. Install the Ranger Policy Manager
3.2. Install the Ranger Policy Administration Service
3.3. Start the Ranger Policy Administration Service
4. Installing UserSync
5. Installing Ranger Plug-ins
5.1. Installing the Ranger HDFS Plug-in
5.2. Installing the Ranger HBase Plug-in
5.3. Installing the Ranger Hive Plug-in
5.4. Installing the Ranger Knox Plug-in
5.5. Installing the Ranger Storm Plug-in
6. Verifying the Installation
14. Installing Hue
1. Prerequisites
2. Configure HDP
3. Install Hue
4. Configure Hue
5. Start Hue
6. Configuring Hue for an External Database
7. Using Hue with Oracle
8. Using Hue with MySQL
9. Using Hue with PostgreSQL
15. Installing Apache Sqoop
1. Install the Sqoop RPMs
2. Set Up the Sqoop Configuration
3. Validate the Installation
16. Installing Apache Mahout
17. Installing and Configuring Apache Flume
1. Understanding Flume
2. Installing Flume
3. Configuring Flume
4. Starting Flume
5. HDP and Flume
6. A Simple Example
18. Installing and Configuring Apache Storm
1. Install the Storm RPMs
2. Configure Storm
3. Configure a Process Controller
4. (Optional) Configure Kerberos Authentication for Storm
5. (Optional) Configuring Authorization for Storm
6. Validate the Installation
19. Installing and Configuring Apache Spark
1. Spark Prerequisites
2. Installing Spark
3. Configuring Spark
4. Validating Spark
20. Installing and Configuring Apache Kafka
1. Install Kafka
2. Configure Kafka
3. Validate Kafka
21. Installing Apache Accumulo
1. Install the Accumulo RPM
2. Configure Accumulo
3. Validate Accumulo
22. Installing Apache Falcon
1. Install the Falcon RPM
2. Configuring Proxy Settings
3. Configuring Falcon Entities
4. Configuring Oozie for Falcon
5. Configuring Hive for Falcon
6. Configuring for Secure Clusters
7. Validate Falcon
23. Installing Apache Knox
1. Install the Knox RPMs on the Knox server
2. Set Up and Validate the Knox Gateway Installation
24. Installing Ganglia (Deprecated)
1. Install the Ganglia RPMs
2. Install the Configuration Files
3. Extract the Ganglia Configuration Files
4. Copy the Configuration Files
5. Set Up Ganglia Hosts
6. Set Up Configurations
7. Set Up Hadoop Metrics
8. Validate the Installation
25. Installing Nagios (Deprecated)
1. Install the Nagios RPMs
2. Install the Configuration Files
3. Extract the Nagios Configuration Files
4. Create the Nagios Directories
5. Copy the Configuration Files
6. Set the Nagios Admin Password
7. Set the Nagios Admin Email Contact Address
8. Register the Hadoop Configuration Files
9. Set Hosts
10. Set Host Groups
11. Set Services
12. Set Status
13. Add Templeton Status and Check TCP Wrapper Commands
14. Validate the Installation
26. Installing Apache Slider
27. Setting Up Security for Manual Installs
1. Preparing Kerberos
1.1. Kerberos Overview
1.2. Installing and Configuring the KDC
1.3. Creating the Database and Setting Up the First Administrator
1.4. Creating Service Principals and Keytab Files for HDP
2. Configuring HDP
2.1. Configuration Overview
2.2. Creating Mappings Between Principals and UNIX Usernames
2.3. Examples
2.4. Adding Security Information to Configuration Files
3. Configuring Hue
4. Setting up One-Way Trust with Active Directory
4.1. Configure Kerberos Hadoop Realm on the AD DC
4.2. Configure the AD Domain on the KDC and Hadoop Cluster Hosts
28. Uninstalling HDP

List of Tables

1.1. Define Directories for Core Hadoop
1.2. Define Directories for Ecosystem Components
1.3. Define Users and Groups for Systems
1.4. Typical System Users and Groups
1.5. yarn-utils.py Options
1.6. Reserved Memory Recommendations
1.7. Recommended Values
1.8. YARN and MapReduce Configuration Setting Value Calculations
1.9. Example Value Calculations
1.10. Example Value Calculations
1.11. NameNode Heap Size Settings
8.1. Tez Configuration Parameters
9.1. Hive Configuration Parameters
11.1. Hadoop core-site.xml File Properties
13.1. install.properties Entries
13.2. Properties to Update in the install.properties File
13.3. HDFS-Related Properties to Edit in the install.properties File
13.4. HBase Properties to Edit in the install.properties File
13.5. Hive-Related Properties to Edit in the install.properties File
13.6. Knox-Related Properties to Edit in the install.properties File
13.7. Storm-Related Properties to Edit in the install.properties File
14.1. Hue-Supported Browsers
14.2. Hue Dependencies on HDP Components
14.3. Variables to Configure HDFS Cluster
14.4. Variables to Configure the YARN Cluster
14.5. Beeswax Configuration Values
17.1. Flume 1.5.2 Dependencies
18.1. Required jaas.conf Sections for Cluster Nodes
18.2. Supported Authorizers
18.3. storm.yaml Configuration File Properties
18.4. worker-launcher.cfg File Configuration Properties
18.5. multitenant-scheduler.yaml Configuration File Properties
19.1. Spark Cluster Prerequisites
20.1. Kafka Configuration Properties
25.1. Host Group Parameters
25.2. Core and Monitoring Host Groups
25.3. Ecosystem Project Host Groups
27.1. Service Principals
27.2. Service Keytab File Names
27.3. General core-site.xml, Knox, and Hue
27.4. core-site.xml Master Node Settings -- Knox Gateway
27.5. core-site.xml Master Node Settings -- Hue
27.6. hdfs-site.xml File Property Settings
27.7. yarn-site.xml Property Settings
27.8. mapred-site.xml Property Settings
27.9. hbase-site.xml Property Settings -- HBase Server
27.10. hive-site.xml Property Settings
27.11. oozie-site.xml Property Settings
27.12. webhcat-site.xml Property Settings
