Step 4: Install CDH Packages

Before you install CDH, review Recommended Cluster Hosts and Role Distribution.

Install CDH Packages

  1. Install CDH packages on the appropriate hosts, as follows:
    • RHEL Compatible:
      sudo yum install <package_name> [<package_name>...]
    • SLES:
      sudo zypper install <package_name> [<package_name>...]
    • Ubuntu:
      sudo apt-get install <package_name> [<package_name>...]

    The following table lists the package names for each component.

    Role                    Package Name
    NameNode                hadoop-hdfs-namenode
    Secondary NameNode      hadoop-hdfs-secondarynamenode
    DataNode                hadoop-hdfs-datanode
    HttpFS                  hadoop-httpfs
    MapReduce v2 with YARN
    ResourceManager         hadoop-yarn-resourcemanager
    NodeManager             hadoop-yarn-nodemanager
    JobHistory Server       hadoop-mapreduce-historyserver
    Hadoop Clients
    Hadoop Client           hadoop-client
    Flume Agent             flume-ng-agent
    All HBase Roles         hbase
    HiveServer2             hive-server2
    Hive Metastore          hive-metastore
    Hive Client             hive
    Hive HBase Connector    hive-hbase
    HCatalog                hive-hcatalog
    WebHCat                 hive-webhcat-server
    Impala Daemon           impala-server
    StateStore              impala-state-store
    Catalog Server          impala-catalog
    Impala Shell            impala-shell
    Hue Server              hue
    Kafka Broker            kafka-server
    Kudu Master             kudu-master
    Tablet Server           kudu-tserver
    Kudu Client             kudu-client0
    Kudu SDK                kudu-client-devel
    Spark Worker            spark-core
    Spark History Server    spark-history-server
    Spark Python Client     spark-python
    Java KeyStore KMS       hadoop-kms-server
    Oozie Server            oozie
    Oozie Client            oozie-client
    Solr Server             solr-server
    Solr MapReduce Tools    solr-mapreduce
    Lily HBase Indexer      hbase-solr-indexer
    Spark Indexer           solr-crunch
    Sentry Server           sentry
    Sqoop Metastore         sqoop-metastore
    Sqoop Client            sqoop
    Sqoop 2
    Sqoop 2 Server          sqoop2-server
    Sqoop 2 Client          sqoop2-client
    ZooKeeper Server        zookeeper-server
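
As an illustration of combining the install command with package names from the table, the following sketch prints the command for a given package manager instead of running it. The helper name install_cmd is hypothetical, and the choice of DataNode plus NodeManager for a worker host is only an assumed role layout; use the roles you have actually assigned to each host.

```shell
# Hypothetical helper (not part of CDH): print the install command for
# the given package manager and list of package names.
install_cmd() {
  pm="$1"
  shift
  case "$pm" in
    yum)     echo "sudo yum install $*" ;;      # RHEL compatible
    zypper)  echo "sudo zypper install $*" ;;   # SLES
    apt-get) echo "sudo apt-get install $*" ;;  # Ubuntu
    *)       echo "unsupported package manager: $pm" >&2; return 1 ;;
  esac
}

# Assumed example: a RHEL-compatible worker host running the
# DataNode and NodeManager roles.
install_cmd yum hadoop-hdfs-datanode hadoop-yarn-nodemanager
```

Review the printed command, then run it on the host. Several package names can be passed in one invocation, as the command syntax in step 1 allows.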

(Optional) Install LZO

This section explains how to install LZO (Lempel–Ziv–Oberhumer) compression. For more information, see Choosing and Configuring Data Compression.

  1. Add the repository on each host in the cluster. Follow the instructions for your OS version:
    • RHEL Compatible:
      sudo wget -O /etc/yum.repos.d/<version>/x86_64/gplextras/cloudera-gplextras5.repo

      Replace <version> with your RHEL version: 7, 6, or 5

    • SLES:
      sudo zypper addrepo -f<version>/x86_64/gplextras/cloudera-gplextras5.repo

      Replace <version> with your SLES version: 12 or 11

    • Ubuntu:
      sudo wget -O /etc/apt/sources.list.d/<version>/amd64/gplextras/cloudera-gplextras.list

      Replace <version> with your Ubuntu version: xenial, trusty, precise, or lucid

    • Debian:
      sudo wget -O /etc/apt/sources.list.d/<version>/amd64/gplextras/cloudera.list

      Replace <version> with your Debian version: jessie, squeeze, or wheezy

  2. Install the hadoop-lzo package:
    • RHEL Compatible:
      sudo yum install hadoop-lzo
    • SLES:
      sudo zypper install hadoop-lzo
    • Ubuntu, Debian:
      sudo apt-get install hadoop-lzo
  3. Continue with installing and deploying CDH. During deployment, complete the additional LZO configuration described in Configuring LZO.
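
Because the hadoop-lzo install command differs only in the package manager, the choice in step 2 can be scripted when the same procedure is run on many hosts. The following is a minimal sketch: the function name detect_pkg_manager is hypothetical, and it assumes the first of yum, zypper, or apt-get found on PATH is the right one for the host. It prints the command rather than running it.

```shell
# Hypothetical helper: print the first supported package manager found on PATH.
detect_pkg_manager() {
  for pm in yum zypper apt-get; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  echo "no supported package manager found" >&2
  return 1
}

# Dry run: print the hadoop-lzo install command for this host.
if pm="$(detect_pkg_manager)"; then
  echo "sudo $pm install hadoop-lzo"
fi
```

The gplextras repository from step 1 must already be configured on the host before the printed command will find the package.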

Set Up a CDH Cluster