1. Meet Minimum System Requirements

To run the Hortonworks Data Platform, your system must meet minimum requirements.

 1.1. Hardware Recommendations

Although there is no single hardware requirement for installing HDP, there are some basic guidelines. A complete installation of HDP 2.2 will take up about 2.5 GB of disk space. For more information about HDP hardware recommendations, see the "HDP Cluster Planning Guide."
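As a rough pre-flight check of the disk-space figure above, the following sketch compares the free space on a file system against a threshold. The helper name has_free_kb is hypothetical, the 2.5 GB figure is expressed as 2621440 KB (2.5 x 1024 x 1024), and /usr is only an assumed install location; adjust both to your environment.

```shell
#!/bin/sh
# Sketch: succeed if the file system holding $1 has at least $2 KB available.
# has_free_kb is a hypothetical helper name, not part of HDP.
has_free_kb() {
    path=$1
    required_kb=$2
    # df -P prints one POSIX-format line per file system; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    available_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ -n "$available_kb" ] && [ "$available_kb" -ge "$required_kb" ]
}

# Example: 2.5 GB = 2621440 KB; /usr is an assumed install location.
if has_free_kb /usr 2621440; then
    echo "at least 2.5 GB free under /usr"
else
    echo "less than 2.5 GB free under /usr"
fi
```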

 1.2. Operating System Requirements

The following operating systems are supported:

  • 64-bit CentOS 6

  • 64-bit CentOS 5 (Deprecated)

  • 64-bit Red Hat Enterprise Linux (RHEL) 6

  • 64-bit Red Hat Enterprise Linux (RHEL) 5 (Deprecated)

  • 64-bit Oracle Linux 6

  • 64-bit Oracle Linux 5 (Deprecated)

  • 64-bit SUSE Linux Enterprise Server (SLES) 11, SP1, SP3, and SP4

  • 64-bit Debian 6

  • 64-bit Ubuntu Precise (12.04)

  • Windows Server 2008, 2012

 1.3. Software Requirements

Install the following software on each of your hosts.

  • yum (for RHEL or CentOS)

  • zypper (for SLES)

  • php_curl (for SLES)

  • reposync (may not be installed by default on all SLES hosts)

  • apt-get (for Ubuntu)

  • rpm

  • scp

  • curl

  • wget

  • unzip

  • chkconfig (Ubuntu and Debian)

  • tar
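One way to check a host against the list above is to test whether each command is on the PATH. The following is a minimal sketch; missing_tools is a hypothetical helper name, and the sample call covers only the tools common to every platform in the list.

```shell
#!/bin/sh
# Sketch: print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the command cannot be found.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools common to all platforms in the list above; add yum, zypper,
# or apt-get according to the operating system of the host.
missing_tools rpm scp curl wget unzip tar
```

An empty result means every named tool was found.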

 1.4. JDK Requirements

Your system must have the correct JDK installed on all cluster nodes. HDP supports the following JDKs.

  • Oracle JDK 1.7 64-bit update 51 or higher

  • OpenJDK 7 64-bit

  • Oracle JDK 1.6 update 31 64-bit (Deprecated)

The following sections describe how to install and configure the JDK.

 1.4.1. Oracle JDK 1.7

Use the following instructions to manually install JDK 7:

  1. Verify that you have a /usr/java directory. If not, create one:

    mkdir /usr/java

  2. Download the Oracle 64-bit JDK (jdk-7u67-linux-x64.tar.gz) from the Oracle download site. Open a web browser and navigate to http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase7-521261.html.

  3. Copy the downloaded jdk-7u67-linux-x64.tar.gz file to the /usr/java directory.

  4. Navigate to the /usr/java directory and extract the jdk-7u67-linux-x64.tar.gz file.

    cd /usr/java
    tar zxvf jdk-7u67-linux-x64.tar.gz

    The JDK files will be extracted into a /usr/java/jdk1.7.0_67 directory.

  5. Create a symbolic link (symlink) to the JDK:

    ln -s /usr/java/jdk1.7.0_67 /usr/java/default

  6. Set the JAVA_HOME and PATH environment variables.

    export JAVA_HOME=/usr/java/default
    export PATH=$JAVA_HOME/bin:$PATH

  7. Verify that Java is installed in your environment by running the following command:

    java -version

    You should see output similar to the following:

    java version "1.7.0_67"
    Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
    Java HotSpot(TM) 64-Bit Server VM (build 24.67-b01, mixed mode) 
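The verification step can also be scripted. The sketch below checks a version string of the form 1.7.0_NN against the "update 51 or higher" requirement stated earlier. Both function names are hypothetical, and the check deliberately covers only Oracle-style 1.7.0_NN strings; any other format is reported as not matching.

```shell
#!/bin/sh
# Sketch: succeed if a version string such as 1.7.0_67 is JDK 1.7 update 51 or higher.
# jdk17_update_ok is a hypothetical helper name.
jdk17_update_ok() {
    version=$1
    case "$version" in
        1.7.0_*) update=${version#1.7.0_} ;;   # keep the text after the underscore
        *) return 1 ;;                          # not a 1.7.0_NN string
    esac
    update=${update%%[!0-9]*}                   # drop any suffix such as -b01
    [ -n "$update" ] && [ "$update" -ge 51 ]
}

# Extract the quoted version from the first line of `java -version`,
# which writes to stderr. Prints nothing if no version line is found.
current_java_version() {
    java -version 2>&1 | sed -n '1s/.*version "\(.*\)".*/\1/p'
}

if jdk17_update_ok "$(current_java_version)"; then
    echo "JDK 1.7 update 51 or higher found"
else
    echo "JDK 1.7 update 51 or higher not found"
fi
```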

 1.4.2. Oracle JDK 1.6 (Deprecated)

Oracle JDK 1.6 is considered deprecated as of HDP 2.2 and will be removed in a future release. Use the following instructions to manually install JDK 1.6 update 31:

  1. Check the version. From a terminal window, type:

    java -version

  2. (Optional) Uninstall the Java package if the JDK version is earlier than 1.6 update 31.

    rpm -qa | grep java
    yum remove {java-1.*}

  3. (Optional) Verify that the default Java package is uninstalled.

    which java

  4. Download the Oracle 64-bit JDK (jdk-6u31-linux-x64.bin) from the Oracle download site. Open a web browser and navigate to http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase6-419409.html. Accept the license agreement and download jdk-6u31-linux-x64.bin to a temporary directory ($JDK_download_directory).

  5. Create the installation directory, change to it, and run the installer from the download location.

    mkdir /usr/jdk1.6.0_31
    cd /usr/jdk1.6.0_31
    chmod u+x $JDK_download_directory/jdk-6u31-linux-x64.bin
    $JDK_download_directory/jdk-6u31-linux-x64.bin

  6. Create symbolic links (symlinks) to the JDK.

    mkdir /usr/java
    ln -s /usr/jdk1.6.0_31/jdk1.6.0_31 /usr/java/default
    ln -s /usr/java/default/bin/java /usr/bin/java

  7. Set JAVA_HOME and add the Java Virtual Machine and the Java compiler to your path.

    export JAVA_HOME=/usr/java/default
    export PATH=$JAVA_HOME/bin:$PATH

  8. Verify that Java is installed in your environment. Run the following from the command line:

    java -version

    You should see the following output:

    java version "1.6.0_31"
    Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
    Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) 

 1.4.3. OpenJDK 7

OpenJDK 7 on HDP 2.2 does not work if you are using SLES as your OS. Use the following instructions to manually install OpenJDK 7:

  1. Check the version. From a terminal window, type:

    java -version

  2. (Optional) Uninstall the Java package if the JDK version is less than 7. For example, if you are using CentOS:

    rpm -qa | grep java
    yum remove {java-1.*}

  3. (Optional) Verify that the default Java package is uninstalled.

    which java

  4. (Optional) Install the OpenJDK 7 packages. From the command line, run the command for your operating system:

    RedHat/CentOS/Oracle Linux:

    yum install java-1.7.0-openjdk java-1.7.0-openjdk-devel

    SUSE:

    zypper install java-1.7.0-openjdk java-1.7.0-openjdk-devel

    Ubuntu/Debian:

    apt-get install openjdk-7-jdk

  5. (Optional) Create symbolic links (symlinks) to the JDK.

    mkdir /usr/java
    ln -s /usr/hdp/current/jvm/java-1.7.0-openjdk-1.7.0.51.x86_64 /usr/java/default

  6. (Optional) Set JAVA_HOME and add the Java Virtual Machine and the Java compiler to your path.

    export JAVA_HOME=/usr/java/default
    export PATH=$JAVA_HOME/bin:$PATH

  7. (Optional) Verify that Java is installed in your environment. Run the following from the command line:

    java -version

    You should see output similar to the following:

    openjdk version "1.7.0"
    OpenJDK Runtime Environment (build 1.7.0)
    OpenJDK Client VM (build 20.6-b01, mixed mode)
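The export commands in step 6 affect only the current shell session. One common way to make them persistent is to write them to a file that login shells read. The sketch below assumes that approach; write_java_profile is a hypothetical helper, and /etc/profile.d/java.sh in the usage comment is an assumed location that may differ on your distribution.

```shell
#!/bin/sh
# Sketch: write the JAVA_HOME and PATH exports to a file that login shells can source.
# write_java_profile is a hypothetical helper; the caller chooses the target path.
write_java_profile() {
    target=$1
    java_home=$2
    printf 'export JAVA_HOME=%s\n' "$java_home" > "$target"
    # Single quotes keep $JAVA_HOME and $PATH literal so they expand at login time.
    printf '%s\n' 'export PATH=$JAVA_HOME/bin:$PATH' >> "$target"
}

# Example (run as root; /etc/profile.d/java.sh is an assumed location):
#   write_java_profile /etc/profile.d/java.sh /usr/java/default
```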

 1.5. Metastore Database Requirements

If you are installing Hive and HCatalog or installing Oozie, you must install a database to store metadata information in the metastore. You can either use an existing database instance or install a new instance manually. HDP supports the following databases for the metastore:

  • Postgres 8.x, 9.3+

  • MySQL 5.6

  • Oracle 11g r2

  • SQL Server 2008 R2+

The following sections describe how to install and configure the Metastore database.

 1.5.1. Metastore Database Prerequisites

The database administrator must create the following users and specify the following values.

  • For Hive: hive_dbname, hive_dbuser, and hive_dbpasswd.

  • For Oozie: oozie_dbname, oozie_dbuser, and oozie_dbpasswd.

    Note

    By default, Hive uses the Derby database for the metastore. However, Derby is not supported for production systems.

 1.5.2. Installing and Configuring PostgreSQL

The following instructions explain how to install PostgreSQL as the metastore database. See your third-party documentation for instructions on how to install other supported databases.

 1.5.2.1. RHEL/CentOS/Oracle Linux

To install a new instance of PostgreSQL:

  1. Connect to the host machine where you plan to deploy the PostgreSQL instance.

    At a terminal window, enter:

    yum install postgresql-server

  2. Start the instance.

    /etc/init.d/postgresql start

    Note

    For some newer versions of PostgreSQL, you might need to execute the command: /etc/init.d/postgresql initdb

  3. Reconfigure PostgreSQL server:

    • Edit the /var/lib/pgsql/data/postgresql.conf file.

      Change the value of #listen_addresses = 'localhost' to listen_addresses = '*'

    • Edit the /var/lib/pgsql/data/postgresql.conf file.

      Change the port setting from #port = 5432 to port = 5432

    • Edit the /var/lib/pgsql/data/pg_hba.conf file.

      Add the following:

      host all all 0.0.0.0/0 trust

    • Optional: If you are using PostgreSQL v9.1 or later, add the following to the /var/lib/pgsql/data/postgresql.conf file:

      standard_conforming_strings = off

  4. Create users for PostgreSQL server.

    Logged in as the postgres user, enter:

    echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
    echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
    echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres

    Where:

    $postgres is the postgres user, $user is the user you want to create, $passwd is that user's password, and $dbname is the name of your PostgreSQL database.

    Note

    For access to the Hive metastore, create hive_dbuser after Hive has been installed, and for access to the Oozie metastore, create oozie_dbuser after Oozie has been installed.

  5. On the Hive Metastore host, install the connector:

    yum install postgresql-jdbc*

  6. Confirm that the .jar is in the Java share directory.

    ls /usr/share/java/postgresql-jdbc.jar
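The three statements in step 4 can be generated from the values listed in the prerequisites and reviewed before they are sent to psql. The following is a minimal sketch; metastore_sql is a hypothetical helper, and it inserts its arguments verbatim, so it assumes names and passwords that contain no quotes or other characters that need escaping.

```shell
#!/bin/sh
# Sketch: print the three SQL statements from step 4 for one metastore.
# metastore_sql is a hypothetical helper; arguments are database name, user, password.
metastore_sql() {
    dbname=$1
    dbuser=$2
    dbpasswd=$3
    printf 'CREATE DATABASE %s;\n' "$dbname"
    printf "CREATE USER %s WITH PASSWORD '%s';\n" "$dbuser" "$dbpasswd"
    printf 'GRANT ALL PRIVILEGES ON DATABASE %s TO %s;\n' "$dbname" "$dbuser"
}

# Print the statements for review. To run them, pipe the output to psql
# as in step 4, for example:
#   metastore_sql hive_dbname hive_dbuser hive_dbpasswd | sudo -u postgres psql -U postgres
metastore_sql hive_dbname hive_dbuser hive_dbpasswd
```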

 1.5.2.2. SUSE Linux Enterprise Server (SLES)

To install a new instance of PostgreSQL:

  1. Connect to the host machine where you plan to deploy the PostgreSQL instance.

    At a terminal window, enter:

    zypper install postgresql-server

  2. Start the instance.

    /etc/init.d/postgresql start

    Note

    For some newer versions of PostgreSQL, you might need to execute the command:

    /etc/init.d/postgresql initdb

  3. Reconfigure the PostgreSQL server:

    • Edit the /var/lib/pgsql/data/postgresql.conf file.

      Change the value of #listen_addresses = 'localhost' to listen_addresses = '*'

    • Edit the /var/lib/pgsql/data/postgresql.conf file.

      Change the port setting #port = 5432 to port = 5432

    • Edit the /var/lib/pgsql/data/pg_hba.conf file.

      Add the following:

      host all all 0.0.0.0/0 trust

    • Optional: If you are using PostgreSQL v9.1 or later, add the following to the /var/lib/pgsql/data/postgresql.conf file:

      standard_conforming_strings = off

  4. Create users for PostgreSQL server.

    Logged in as the postgres user, enter:

    echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
    echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
    echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres

    Where:

    $postgres is the postgres user, $user is the user you want to create, $passwd is that user's password, and $dbname is the name of your PostgreSQL database.

    Note

    For access to the Hive metastore, create hive_dbuser after Hive has been installed, and for access to the Oozie metastore, create oozie_dbuser after Oozie has been installed.

  5. On the Hive Metastore host, install the connector.

    zypper install -y postgresql-jdbc

  6. Copy the connector .jar file to the Java share directory.

    cp /usr/share/pgsql/postgresql-*.jdbc3.jar /usr/share/java/postgresql-jdbc.jar

  7. Confirm that the .jar is in the Java share directory.

    ls /usr/share/java/postgresql-jdbc.jar

  8. Change the access mode of the .jar file to 644.

    chmod 644 /usr/share/java/postgresql-jdbc.jar
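The two postgresql.conf edits in step 3 can be applied with sed instead of a text editor. This sketch assumes the commented-out default lines appear exactly as quoted in step 3; enable_remote_listen is a hypothetical helper name, and lines that do not match are left unchanged.

```shell
#!/bin/sh
# Sketch: apply the listen_addresses and port edits from step 3 to a postgresql.conf file.
# enable_remote_listen is a hypothetical helper; it rewrites the named file in place.
enable_remote_listen() {
    conf=$1
    tmp=$(mktemp) || return 1
    sed -e "s/^#listen_addresses = 'localhost'/listen_addresses = '*'/" \
        -e 's/^#port = 5432/port = 5432/' "$conf" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Copy the contents back so that the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example (back up the file first):
#   cp /var/lib/pgsql/data/postgresql.conf /var/lib/pgsql/data/postgresql.conf.bak
#   enable_remote_listen /var/lib/pgsql/data/postgresql.conf
```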

 1.5.2.3. Ubuntu/Debian

To install a new instance of PostgreSQL:

  1. Connect to the host machine where you plan to deploy the PostgreSQL instance.

    At a terminal window, enter:

    apt-get install postgresql-server

  2. Start the instance.

    /etc/init.d/postgresql start

    Note

    For some newer versions of PostgreSQL, you might need to execute the command:

    /etc/init.d/postgresql initdb

  3. Reconfigure PostgreSQL server:

    • Edit the /var/lib/pgsql/data/postgresql.conf file.

      Change the value of #listen_addresses = 'localhost' to listen_addresses = '*'

    • Edit the /var/lib/pgsql/data/postgresql.conf file.

      Change the port setting from #port = 5432 to port = 5432

    • Edit the /var/lib/pgsql/data/pg_hba.conf file.

      Add the following:

      host all all 0.0.0.0/0 trust

    • Optional: If you are using PostgreSQL v9.1 or later, add the following to the /var/lib/pgsql/data/postgresql.conf file:

      standard_conforming_strings = off

  4. Create users for PostgreSQL server.

    Logged in as the postgres user, enter:

    echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
    echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
    echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres

    Where:

    $postgres is the postgres user, $user is the user you want to create, $passwd is that user's password, and $dbname is the name of your PostgreSQL database.

    Note

    For access to the Hive metastore, create hive_dbuser after Hive has been installed, and for access to the Oozie metastore, create oozie_dbuser after Oozie has been installed.

  5. On the Hive Metastore host, install the connector.

  6. Copy the connector .jar file to the Java share directory.

    cp /usr/share/pgsql/postgresql-*.jdbc3.jar /usr/share/java/postgresql-jdbc.jar

  7. Confirm that the .jar is in the Java share directory.

    ls /usr/share/java/postgresql-jdbc.jar

  8. Change the access mode of the .jar file to 644.

    chmod 644 /usr/share/java/postgresql-jdbc.jar
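Steps 6 through 8 can be combined into one function that copies the connector, sets its mode, and confirms the result. The following is a sketch only; install_jdbc_jar is a hypothetical helper name, and the caller supplies both paths.

```shell
#!/bin/sh
# Sketch: copy a connector .jar, set mode 644, and confirm the result (steps 6 to 8).
# install_jdbc_jar is a hypothetical helper; source and destination are arguments.
install_jdbc_jar() {
    src=$1
    dest=$2
    cp "$src" "$dest" || return 1
    chmod 644 "$dest" || return 1
    # The first ten characters of `ls -l` are the file type and permission bits.
    [ "$(ls -l "$dest" | cut -c1-10)" = "-rw-r--r--" ]
}

# Example, using the paths from the steps above. The wildcard must match
# exactly one file, otherwise the arguments shift:
#   install_jdbc_jar /usr/share/pgsql/postgresql-*.jdbc3.jar /usr/share/java/postgresql-jdbc.jar
```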

 1.5.3. Installing and Configuring MySQL

This section describes how to install MySQL as the metastore database. For instructions on how to install other supported databases, see your third-party documentation.

Important

When you use MySQL as your Hive metastore, you must use the mysql-connector-java-5.1.35.zip or later JDBC driver.

 1.5.3.1. RHEL/CentOS

To install a new instance of MySQL:

  1. Connect to the host machine you plan to use for Hive and HCatalog.

  2. Install MySQL server.

    From a terminal window, enter:

    yum install mysql-server

  3. Start the instance.

    /etc/init.d/mysqld start

  4. Set the root user password using the following command format:

    mysqladmin -u root password $mysqlpassword

    For example, to set the password to "root":

    mysqladmin -u root password root

  5. Remove unnecessary information from log and STDOUT.

    mysqladmin -u root 2>&1 >/dev/null

  6. Log in to MySQL as the root user:

    mysql -u root -proot

  7. Logged in as the root user, create dbuser and grant it adequate privileges.

    This user provides access to the Hive metastore. Use the following series of commands (shown here with the returned responses) to create dbuser with password dbuser.

    [root@c6402 /]# mysql -u root -proot
    
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 11
    Server version: 5.1.73 Source distribution
    
    Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> CREATE USER 'dbuser'@'localhost' IDENTIFIED BY 'dbuser';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'localhost';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> CREATE USER 'dbuser'@'%' IDENTIFIED BY 'dbuser';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'%';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> FLUSH PRIVILEGES;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'localhost' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'%' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql>

  8. Use the exit command to exit MySQL.

  9. You should now be able to reconnect to the database as "dbuser" using the following command:

    mysql -u dbuser -pdbuser

    After testing the dbuser login, use the exit command to exit MySQL.

  10. Install the MySQL connector JAR file.

    yum install mysql-connector-java*
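The interactive session in step 7 can also be expressed as a script that prints the statements for a given user name and password. The sketch below is a condensed variant, not the exact session: it issues one GRANT per host with WITH GRANT OPTION instead of two separate GRANT statements. mysql_user_sql is a hypothetical helper, and it assumes values that contain no quote characters.

```shell
#!/bin/sh
# Sketch: print CREATE USER and GRANT statements for a metastore user on
# both localhost and any host (%), condensing the session in step 7.
# mysql_user_sql is a hypothetical helper; arguments are user name and password.
mysql_user_sql() {
    dbuser=$1
    dbpasswd=$2
    for host in localhost '%'; do
        printf "CREATE USER '%s'@'%s' IDENTIFIED BY '%s';\n" "$dbuser" "$host" "$dbpasswd"
        printf "GRANT ALL PRIVILEGES ON *.* TO '%s'@'%s' WITH GRANT OPTION;\n" "$dbuser" "$host"
    done
    printf 'FLUSH PRIVILEGES;\n'
}

# Print the statements for review. To run them, pipe the output to the
# mysql client, which prompts for the root password:
#   mysql_user_sql dbuser dbuser | mysql -u root -p
mysql_user_sql dbuser dbuser
```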

 1.5.3.2. SUSE Linux Enterprise Server (SLES)

To install a new instance of MySQL:

  1. Connect to the host machine you plan to use for Hive and HCatalog.

  2. Install MySQL server.

    From a terminal window, enter:

    zypper install mysql-server

  3. Start the instance.

    /etc/init.d/mysqld start

  4. Set the root user password using the following command format:

    mysqladmin -u root password $mysqlpassword

    For example, to set the password to "root":

    mysqladmin -u root password root

  5. Remove unnecessary information from log and STDOUT.

    mysqladmin -u root 2>&1 >/dev/null

  6. Log in to MySQL as the root user:

    mysql -u root -proot

  7. Logged in as the root user, create dbuser and grant it adequate privileges.

    This user provides access to the Hive metastore. Use the following series of commands (shown here with the returned responses) to create dbuser with password dbuser.

    [root@c6402 /]# mysql -u root -proot
    
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 11
    Server version: 5.1.73 Source distribution
    
    Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> CREATE USER 'dbuser'@'localhost' IDENTIFIED BY 'dbuser';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'localhost';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> CREATE USER 'dbuser'@'%' IDENTIFIED BY 'dbuser';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'%';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> FLUSH PRIVILEGES;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'localhost' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'%' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql>

  8. Use the exit command to exit MySQL.

  9. You should now be able to reconnect to the database as dbuser, using the following command:

    mysql -u dbuser -pdbuser

    After testing the dbuser login, use the exit command to exit MySQL.

  10. Install the MySQL connector JAR file.

    zypper install mysql-connector-java*

 1.5.3.3. Ubuntu/Debian

To install a new instance of MySQL:

  1. Connect to the host machine you plan to use for Hive and HCatalog.

  2. Install MySQL server.

    From a terminal window, enter:

    apt-get install mysql-server

  3. Start the instance.

    /etc/init.d/mysql start

  4. Set the root user password using the following command format:

    mysqladmin -u root password $mysqlpassword

    For example, to set the password to "root":

    mysqladmin -u root password root

  5. Remove unnecessary information from log and STDOUT.

    mysqladmin -u root 2>&1 >/dev/null

  6. Log in to MySQL as the root user:

    mysql -u root -proot

  7. Logged in as the root user, create dbuser and grant it adequate privileges.

    This user provides access to the Hive metastore. Use the following series of commands (shown here with the returned responses) to create dbuser with password dbuser.

    [root@c6402 /]# mysql -u root -proot
    
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 11
    Server version: 5.1.73 Source distribution
    
    Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> CREATE USER 'dbuser'@'localhost' IDENTIFIED BY 'dbuser';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'localhost';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> CREATE USER 'dbuser'@'%' IDENTIFIED BY 'dbuser';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'%';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> FLUSH PRIVILEGES;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'localhost' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'dbuser'@'%' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql>

  8. Use the exit command to exit MySQL.

  9. You should now be able to reconnect to the database as dbuser, using the following command:

    mysql -u dbuser -pdbuser

    After testing the dbuser login, use the exit command to exit MySQL.

  10. Install the MySQL connector JAR file.

    apt-get install mysql-connector-java*
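The three MySQL procedures above differ mainly in the package manager used in step 2. When scripting across mixed hosts, the choice can be made in one place, as in the sketch below. mysql_install_cmd and the family keywords rhel, sles, and debian are hypothetical names; the commands themselves are the ones given in each subsection.

```shell
#!/bin/sh
# Sketch: print the MySQL server install command from this section for an OS family.
# mysql_install_cmd and the family keywords (rhel, sles, debian) are hypothetical names.
mysql_install_cmd() {
    case "$1" in
        rhel)   echo "yum install mysql-server" ;;
        sles)   echo "zypper install mysql-server" ;;
        debian) echo "apt-get install mysql-server" ;;
        *)      echo "unsupported OS family: $1" >&2; return 1 ;;
    esac
}

# Example: show the command for a RHEL or CentOS host.
mysql_install_cmd rhel
```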

 1.5.4. Configuring Oracle as the Metastore Database

You can select Oracle as the metastore database. For instructions on how to install the databases, see your third-party documentation. To configure Oracle as the Hive Metastore, install HDP and Hive, then follow the instructions in "Set up Oracle DB for use with Hive Metastore" in this guide.

