Reference Guide

Option II - Mirror server has temporary or continuous access to the Internet

Complete the following instructions to set up a mirror server that has temporary access to the Internet:

  1. Check Your Prerequisites.

    Select a local mirror server host with the following characteristics:

    • This server runs on either CentOS/RHEL/Oracle Linux 5.x or 6.x, or Ubuntu 12, and has several GB of storage available.

    • The local mirror server and the cluster nodes must have the same OS. If they are not running CentOS or RHEL, the mirror server must not be a member of the Hadoop cluster.

      Note

      Supporting repository mirroring for heterogeneous clusters requires a more complex procedure than the one documented here.

    • The firewall allows all cluster nodes (the servers on which you want to install HDP) to access this server.

    • Ensure that the mirror server has yum installed.

    • Add the yum-utils and createrepo packages on the mirror server.

      yum install yum-utils createrepo
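The prerequisite check above can be sketched as a small script. This is only an illustration: the helper name missing_tools is hypothetical, and the list of tools (yum, reposync from yum-utils, createrepo, and wget, which the next step uses) is an assumption drawn from the commands in this procedure.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of HDP.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools this procedure relies on and report the result.
missing=$(missing_tools yum reposync createrepo wget)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All prerequisite tools found."
fi
```

If anything is reported as missing, install the corresponding package before continuing.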
  2. Install the Repos.

    • Temporarily reconfigure your firewall to allow Internet access from your mirror server host.

    • Execute the command for your cluster OS from the following table to download the appropriate Hortonworks yum client configuration file and save it in the /etc/yum.repos.d/ directory on the mirror server host.

      Table 3.5. Yum Client Configuration Commands

      Cluster OS                      HDP Repository Tarballs
      RHEL/CentOS/Oracle Linux 6.x    wget http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.2.0/hdp.repo -O /etc/yum.repos.d/hdp.repo
      RHEL/CentOS/Oracle Linux 7.x    wget http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.4.2.0/hdp.repo -O /etc/yum.repos.d/hdp.repo
      SLES 11 SP3/SP4                 wget http://public-repo-1.hortonworks.com/HDP/suse11sp3/2.x/updates/2.4.2.0/hdp.repo -O /etc/zypp/repos.d/hdp.repo
      Ubuntu 12.04                    wget http://public-repo-1.hortonworks.com/HDP/ubuntu12/2.x/updates/2.4.2.0/hdp.list -O /etc/apt/sources.list.d/hdp.list
      Ubuntu 14                       wget http://public-repo-1.hortonworks.com/HDP/ubuntu14/2.x/updates/2.4.2.0/hdp.list -O /etc/apt/sources.list.d/hdp.list
      Debian 6 (Deprecated)           wget http://public-repo-1.hortonworks.com/HDP/debian6/2.x/updates/2.4.2.0/hdp.list -O /etc/apt/sources.list.d/hdp.list
      Debian 7                        wget http://public-repo-1.hortonworks.com/HDP/debian7/2.x/updates/2.4.2.0/hdp.list -O /etc/apt/sources.list.d/hdp.list
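The commands in Table 3.5 all follow one pattern, which the sketch below tries to capture. The function names hdp_repo_url and hdp_repo_dest are hypothetical helpers, and the OS identifiers (centos6, suse11sp3, ubuntu12, and so on) are taken from the URLs in the table. Treat it as a way to see the pattern, not as a replacement for the table.

```shell
#!/bin/sh
# Build the download URL for an OS identifier and HDP version.
# Ubuntu and Debian use hdp.list; the other rows in the table use hdp.repo.
hdp_repo_url() {
    os=$1
    version=$2
    case $os in
        ubuntu*|debian*) file=hdp.list ;;
        *)               file=hdp.repo ;;
    esac
    printf 'http://public-repo-1.hortonworks.com/HDP/%s/2.x/updates/%s/%s\n' \
        "$os" "$version" "$file"
}

# Return the destination path that the table pairs with each OS family.
hdp_repo_dest() {
    case $1 in
        ubuntu*|debian*) echo /etc/apt/sources.list.d/hdp.list ;;
        suse*)           echo /etc/zypp/repos.d/hdp.repo ;;
        *)               echo /etc/yum.repos.d/hdp.repo ;;
    esac
}

# Print (without running) the wget command for one OS.
echo "wget $(hdp_repo_url centos6 2.4.2.0) -O $(hdp_repo_dest centos6)"
```

Printing the command first makes it easy to compare against the table before downloading anything.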


    • Create an HTTP server.

      • On the mirror server, install an HTTP server (such as Apache httpd).

      • Activate this web server.

      • Ensure that the firewall settings (if any) allow inbound HTTP access from your cluster nodes to your mirror server.

        Note

        If you are using EC2, make sure that SELinux is disabled.

      • Optional - If your mirror server uses SLES, modify the default-server.conf file to enable the docs root folder listing.

        sed -e "s/Options None/Options Indexes MultiViews/ig" /etc/apache2/default-server.conf > /tmp/tempfile.tmp
        mv /tmp/tempfile.tmp /etc/apache2/default-server.conf
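A slightly more defensive variant of the edit above might keep a backup and avoid replacing the file with mv. This is a sketch under stated assumptions: enable_indexes is a hypothetical helper, the .bak suffix is an arbitrary choice, and bracket expressions stand in for the case-insensitive flag so that the script does not depend on GNU sed.

```shell
#!/bin/sh
# Enable directory listings in an Apache configuration file, keeping a backup.
# Usage: enable_indexes /etc/apache2/default-server.conf
enable_indexes() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Keep the original next to the edited file as <name>.bak.
    cp -p "$conf" "$conf.bak" || return 1
    # Bracket expressions make the match case-insensitive on any POSIX sed.
    sed -e 's/[Oo][Pp][Tt][Ii][Oo][Nn][Ss] [Nn][Oo][Nn][Ee]/Options Indexes MultiViews/g' \
        "$conf" > "$tmp" || return 1
    # Write back with cat rather than mv so ownership and mode are preserved.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

Restart or reload the web server afterwards so that the changed option takes effect.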
    • On your mirror server, create a directory for your web server.

      • For example, from a shell window, type:

        • For RHEL/CentOS/Oracle:

          mkdir -p /var/www/html/hdp/
        • For SLES:

          mkdir -p /srv/www/htdocs/rpms
        • For Ubuntu and Debian:

          mkdir -p /var/www/html/hdp/
      • If you are using a symlink, enable the FollowSymLinks option on your web server.

      • Copy the contents of the entire HDP repository for your desired OS from the remote yum server to your local mirror server.

        • Continuing the previous example, from a shell window, type:

          • For RHEL/CentOS/Oracle/Ubuntu:

            cd /var/www/html/hdp
          • For SLES:

            cd /srv/www/htdocs/rpms

          Then for all hosts, type:

          • HDP Repository

            reposync -r HDP
            reposync -r HDP-2.4.2.0
            reposync -r HDP-UTILS-1.1.0.20

            You should see both an HDP-2.4.2.0 directory and an HDP-UTILS-1.1.0.20 directory, each with several subdirectories.
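When several repositories are mirrored, the reposync calls above can be wrapped in a loop that stops on the first failure. The sketch below is an illustration only: sync_repos and the DRY_RUN variable are hypothetical names, and the repository IDs are the ones used in this example.

```shell
#!/bin/sh
# Mirror each repository ID into the current directory.
# Set DRY_RUN=1 to print the commands instead of running them.
sync_repos() {
    for repo in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "reposync -r $repo"
        else
            reposync -r "$repo" || {
                echo "reposync failed for $repo" >&2
                return 1
            }
        fi
    done
}

# Preview the commands for the two repositories expected after the sync.
DRY_RUN=1 sync_repos HDP-2.4.2.0 HDP-UTILS-1.1.0.20
```

Run it from the web server directory created earlier, and drop DRY_RUN=1 once the printed commands look right.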

      • Generate appropriate metadata.

        This step defines each directory as a yum repository. From a shell window, type:

        • For RHEL/CentOS/Oracle:

          • HDP Repository:

            createrepo /var/www/html/hdp/HDP-2.4.2.0
            createrepo /var/www/html/hdp/HDP-UTILS-1.1.0.20
        • For SLES:

          • HDP Repository:

            createrepo /srv/www/htdocs/rpms/hdp/HDP

        You should see a new folder called repodata inside both HDP directories.
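The repodata check can also be scripted. The sketch below assumes that createrepo writes a repomd.xml file into each repodata folder, and missing_repodata is a hypothetical helper name. The paths in the example call are the RHEL/CentOS/Oracle paths from this procedure.

```shell
#!/bin/sh
# Print every repository directory that has no repodata/repomd.xml yet.
missing_repodata() {
    for dir in "$@"; do
        [ -f "$dir/repodata/repomd.xml" ] || printf '%s\n' "$dir"
    done
}

# Report directories that still need createrepo to be run.
missing_repodata /var/www/html/hdp/HDP-2.4.2.0 /var/www/html/hdp/HDP-UTILS-1.1.0.20
```

No output means that both directories already contain repository metadata.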

      • Verify the configuration.

        • The configuration is successful if you can access the above directory through your web browser.

          To test this out, browse to the following location:

          • HDP: http://$yourwebserver/hdp/HDP-2.4.2.0/

        • You should now see a directory listing for all the HDP components.

      • At this point, you can disable external Internet access for the mirror server, so that the mirror server is again entirely within your data center firewall.

      • Depending on your cluster OS, configure the yum clients on all the nodes in your cluster.

        • Edit the repo files, changing the value of the baseurl property to the local mirror URL.

          • Edit the /etc/yum.repos.d/hdp.repo file, changing the value of the baseurl property to point to your local repositories based on your cluster OS.

            [HDP-2.x]
            name=Hortonworks Data Platform Version - HDP-2.x
            baseurl=http://$yourwebserver/hdp/$os/2.x/GA
            gpgcheck=1
            gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
            enabled=1
            priority=1

            [HDP-UTILS-1.1.0.20]
            name=Hortonworks Data Platform Utils Version - HDP-UTILS-1.1.0.20
            baseurl=http://$yourwebserver/HDP-UTILS-1.1.0.20/repos/$os
            gpgcheck=1
            gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
            enabled=1
            priority=1

            [HDP-2.4.2.0]
            name=Hortonworks Data Platform HDP-2.4.2.0
            baseurl=http://$yourwebserver/hdp/$os/2.x/updates/2.4.2.0
            gpgcheck=1
            gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
            enabled=1
            priority=1

            where

            • $yourwebserver is the FQDN of your local mirror server.

            • $os can be centos5, centos6, suse11, or ubuntu12. Use the following table to find the $os value for your operating system:

              Table 3.6. $OS Parameter Values

              Operating System    Value
              CentOS 5            centos5
              RHEL 5              centos5
              Oracle Linux 5      centos5
              CentOS 6            centos6
              RHEL 6              centos6
              Oracle Linux 6      centos6
              SLES 11             suse11
              Ubuntu 12           ubuntu12
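Editing the baseurl values by hand is error-prone on larger repo files, so the change can be sketched with sed. This assumes that the downloaded hdp.repo uses baseurl values beginning with http://public-repo-1.hortonworks.com/HDP/ and that the mirror serves the same path under /hdp/. Check both assumptions against your own file. The function name use_local_mirror and the host mirror.example.com are hypothetical.

```shell
#!/bin/sh
# Print a copy of a repo file with every baseurl pointed at the local mirror.
# Only lines starting with "baseurl=" are changed; gpgkey lines are untouched.
#   $1 = repo file, $2 = FQDN of the mirror server
use_local_mirror() {
    sed -e "/^baseurl=/s|http://public-repo-1.hortonworks.com/HDP/|http://$2/hdp/|" "$1"
}

# Example (writes the result to a new file rather than editing in place):
#   use_local_mirror /etc/yum.repos.d/hdp.repo mirror.example.com > /tmp/hdp.repo.local
```

Review the generated file before copying it over /etc/yum.repos.d/hdp.repo.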


        • Copy the yum/zypper client configuration file to all nodes in your cluster.

          • RHEL/CentOS/Oracle Linux:

            Use scp or pdsh to copy the client yum configuration file to the /etc/yum.repos.d/ directory on every node in the cluster.

          • For SLES:

            On every node, invoke the following command:

            • HDP Repository:

              zypper addrepo -r http://$yourwebserver/hdp/suse11sp3/2.x/updates/2.4.2.0/hdp.repo
          • For Ubuntu:

            On every node, invoke the following command:

            • HDP Repository:

              sudo add-apt-repository deb http://$yourwebserver/hdp/ubuntu12/2.x/hdp.list
            • Optional - Ambari Repository:

              sudo add-apt-repository deb http://$yourwebserver/hdp/ambari/ubuntu12/1.x/updates/1.7.0/ambari.list
            • If using Ambari, verify the configuration by deploying Ambari server on one of the cluster nodes.

              yum install ambari-server
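For the RHEL/CentOS/Oracle Linux case, the scp copy described above might be scripted as a loop over a host list. This is a sketch under stated assumptions: push_repo_file and DRY_RUN are hypothetical names, the hosts file (one hostname per line) is something you create yourself, and the account running scp is assumed to be able to write to /etc/yum.repos.d/ on each node.

```shell
#!/bin/sh
# Copy a repo file to every host listed (one per line) in a hosts file.
# Set DRY_RUN=1 to print the scp commands instead of running them.
#   $1 = repo file, $2 = hosts file
push_repo_file() {
    repo_file=$1
    hosts_file=$2
    while IFS= read -r host; do
        # Skip blank lines in the hosts file.
        [ -n "$host" ] || continue
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp $repo_file $host:/etc/yum.repos.d/"
        else
            # Redirect stdin so scp cannot consume the rest of the host list.
            scp "$repo_file" "$host:/etc/yum.repos.d/" < /dev/null ||
                echo "copy to $host failed" >&2
        fi
    done < "$hosts_file"
}

# Example: DRY_RUN=1 push_repo_file /etc/yum.repos.d/hdp.repo cluster-hosts.txt
```

A failed copy is reported but does not stop the loop, so one unreachable node does not block the rest.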
      • If your cluster runs CentOS, Oracle, or RHEL and if you have multiple repositories configured in your environment, deploy the following plugin on all the nodes in your cluster.

        • Install the plugin.

          • For RHEL and CentOS v5.x:

            yum install yum-priorities
          • For RHEL and CentOS v6.x:

            yum install yum-plugin-priorities
        • Edit the /etc/yum/pluginconf.d/priorities.conf file to add the following:

          [main]
          enabled=1
          gpgcheck=0
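The two plugin steps above can be combined into one script for use across many nodes. The sketch below is illustrative: priorities_pkg and write_priorities_conf are hypothetical helper names, the package names are the ones listed above for v5.x and v6.x, and the file contents are exactly the three lines shown.

```shell
#!/bin/sh
# Map an OS major release to the priorities plugin package named in this guide.
priorities_pkg() {
    case $1 in
        5|5.*) echo yum-priorities ;;
        6|6.*) echo yum-plugin-priorities ;;
        *)     echo "unsupported release: $1" >&2; return 1 ;;
    esac
}

# Write the plugin configuration shown above to the given path.
write_priorities_conf() {
    cat > "$1" <<'EOF'
[main]
enabled=1
gpgcheck=0
EOF
}

# Print (without running) the install command for a 6.x node.
echo "yum install $(priorities_pkg 6)"
```

On a real node, the configuration path would be /etc/yum/pluginconf.d/priorities.conf, and writing it requires root privileges.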