Creating Virtual Images of Cluster Hosts

You can create virtual machine images of cluster hosts, such as PXE-boot images, Amazon AMIs, and Azure VM images, with Cloudera software pre-deployed, so that you can quickly spin up virtual machines. These images use parcels to install CDH software. This topic describes how to create images of the Cloudera Manager host and worker hosts, and how to instantiate hosts from those images.

Creating a Pre-Deployed Cloudera Manager Host

To create a Cloudera Manager virtual machine image:
  1. Instantiate a virtual machine image (an AMI, if you are using Amazon Web Services) based on a supported operating system and start the virtual machine. See the documentation for your virtualization environment for details.
  2. Install Cloudera Manager and configure a database. You can configure either a local or remote database.
  3. Wait for the Cloudera Manager Admin console to become active.
  4. Log in to the Cloudera Manager Admin console.
  5. Download any parcels for CDH or other services managed by Cloudera Manager. Do not distribute or activate the parcels.
  6. Log in to the Cloudera Manager server host:
    1. Run the following command to stop the Cloudera Manager service:
      service cloudera-scm-server stop
    2. Run the following command to disable autostarting of the cloudera-scm-server service:
      • RHEL 6.x, CentOS 6.x, and SUSE:
        chkconfig cloudera-scm-server off
        
      • RHEL 7.x / CentOS 7.x:
        systemctl disable cloudera-scm-server.service
        
      • Ubuntu:
        update-rc.d -f cloudera-scm-server remove
        
  7. Create an image of the Cloudera Manager host. See the documentation for your virtualization environment for details.
  8. If you installed the Cloudera Manager database on a remote host, also create an image of the database host.
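
The per-distribution autostart commands in step 6 can be wrapped in a small helper so that one script works across image types. The following is a minimal sketch, not part of the Cloudera tooling: the function names autostart_cmd and detect_init_tool are hypothetical, and the preference order (systemctl first, then chkconfig, then update-rc.d) is an assumption that may not suit every distribution.

```shell
#!/bin/sh
# Sketch only: autostart_cmd and detect_init_tool are hypothetical helpers.

# Print the command that enables or disables autostart of a service.
# Usage: autostart_cmd <systemctl|chkconfig|update-rc.d> <enable|disable> <service>
autostart_cmd() {
    case "$1:$2" in
        systemctl:enable)    echo "systemctl enable $3.service" ;;
        systemctl:disable)   echo "systemctl disable $3.service" ;;
        chkconfig:enable)    echo "chkconfig $3 on" ;;
        chkconfig:disable)   echo "chkconfig $3 off" ;;
        update-rc.d:enable)  echo "update-rc.d -f $3 defaults" ;;
        update-rc.d:disable) echo "update-rc.d -f $3 remove" ;;
        *) echo "unsupported combination: $1 $2" >&2; return 1 ;;
    esac
}

# Print the first init tool found on PATH (assumed preference order).
detect_init_tool() {
    for tool in systemctl chkconfig update-rc.d; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "no supported init tool found" >&2
    return 1
}
```

On the host being imaged, the result could then be applied as root, for example:

    service cloudera-scm-server stop
    eval "$(autostart_cmd "$(detect_init_tool)" disable cloudera-scm-server)"

Printing the command rather than running it directly makes it easy to review what will be executed before the image is captured.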

Instantiating a Cloudera Manager Image

To create a new Cloudera Manager instance from a virtual machine image:
  1. Instantiate the Cloudera Manager image.
  2. If the Cloudera Manager database will be hosted on a remote host, also instantiate the database host image.
  3. Ensure that the cloudera-scm-server service is not running by running the following command on the Cloudera Manager host:
    service cloudera-scm-server status
    If it is running, stop it using the following command:
    service cloudera-scm-server stop
  4. On the Cloudera Manager host, create a file named uuid in the /etc/cloudera-scm-server directory. Add a globally unique identifier to this file using the following command:
    cat /proc/sys/kernel/random/uuid > /etc/cloudera-scm-server/uuid
    
    The presence of this file tells Cloudera Manager to reinitialize its own unique identifier when it starts.
  5. Run the following command to start the Cloudera Manager service:
    service cloudera-scm-server start
  6. Run the following command to enable automatic starting of the cloudera-scm-server service:
    • RHEL 6.x, CentOS 6.x, and SUSE:
      chkconfig cloudera-scm-server on
      
    • RHEL 7.x / CentOS 7.x:
      systemctl enable cloudera-scm-server.service
      
    • Ubuntu:
      update-rc.d -f cloudera-scm-server defaults
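
The UUID step is the part of this procedure that is easiest to forget when instances are launched automatically, so it can help to wrap it in a function that fails loudly. The following is a minimal sketch: write_cm_uuid is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: write_cm_uuid is a hypothetical helper.
# Writes a fresh UUID so that Cloudera Manager reinitializes its identifier.
# Usage: write_cm_uuid [config directory]
write_cm_uuid() {
    cm_conf_dir=${1:-/etc/cloudera-scm-server}
    if [ ! -d "$cm_conf_dir" ]; then
        echo "directory not found: $cm_conf_dir" >&2
        return 1
    fi
    # The Linux kernel returns a new random UUID on every read of this file.
    cat /proc/sys/kernel/random/uuid > "$cm_conf_dir/uuid"
}
```

A first-boot script could then run the steps in order. This assumes that the service status command exits with 0 only when the server is running, which is the usual init script convention but is not confirmed here:

    if service cloudera-scm-server status; then service cloudera-scm-server stop; fi
    write_cm_uuid && service cloudera-scm-server start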
      

Creating a Pre-Deployed Worker Host

  1. Instantiate a virtual machine image (an AMI, if you are using Amazon Web Services) based on a supported operating system and start the virtual machine. See the documentation for your virtualization environment for details.
  2. Download the parcels required for the worker host from the public parcel repository, or from a repository that you have created, and save them to a temporary directory. See Cloudera Manager 6 Version and Download Information.
  3. From the same location where you downloaded the parcels, download the parcel_name.parcel.sha1 file for each parcel.
  4. Calculate the SHA-1 checksum of each downloaded parcel and compare it with the contents of the corresponding .sha1 file to ensure that the parcel was downloaded correctly. For example:
    sha1sum KAFKA-2.0.2-1.2.0.2.p0.5-el6.parcel | awk '{print $1}' > KAFKA-2.0.2-1.2.0.2.p0.5-el6.parcel.sha
    
    diff KAFKA-2.0.2-1.2.0.2.p0.5-el6.parcel.sha1 KAFKA-2.0.2-1.2.0.2.p0.5-el6.parcel.sha
    
  5. Unpack the parcel:
    1. Create the following directories:
      • /opt/cloudera/parcels
      • /opt/cloudera/parcel-cache
    2. Set the ownership of the two directories you just created so that they are owned by the user that the Cloudera Manager agent runs as.
    3. Set the permissions for each directory using the following command:
      chmod 755 directory
      Note that the contents of these directories will be publicly available and can be safely marked as world-readable.
    4. Running as the same user that runs the Cloudera Manager agent, extract the contents of the parcel from the temporary directory using the following command:
      tar -zxvf parcelfile -C /opt/cloudera/parcels/
    5. For each parcel, create a symbolic link in the /opt/cloudera/parcels directory that maps the product name to the extracted parcel directory.
      For example, to link /opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.309038 to /opt/cloudera/parcels/CDH, use the following command:
      ln -s /opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.309038 /opt/cloudera/parcels/CDH
    6. To prevent the Cloudera Manager agent from deleting the parcels at startup, add an empty .dont_delete marker file to each subdirectory in the /opt/cloudera/parcels directory. For example:
      touch /opt/cloudera/parcels/CDH/.dont_delete
  6. Verify that the marker file exists. Use the -a option so that hidden files are listed:
    ls -al /opt/cloudera/parcels/parcelname
    You should see output similar to the following:
    ls -al /opt/cloudera/parcels/CDH
    total 100
    drwxr-xr-x  9 root root  4096 Sep 14 14:53 .
    drwxr-xr-x  9 root root  4096 Sep 14 06:34 ..
    drwxr-xr-x  2 root root  4096 Sep 12 06:39 bin
    -rw-r--r--  1 root root     0 Sep 14 14:53 .dont_delete
    drwxr-xr-x 26 root root  4096 Sep 12 05:10 etc
    drwxr-xr-x  4 root root  4096 Sep 12 05:04 include
    drwxr-xr-x  2 root root 69632 Sep 12 06:44 jars
    drwxr-xr-x 37 root root  4096 Sep 12 06:39 lib
    drwxr-xr-x  2 root root  4096 Sep 12 06:39 meta
    drwxr-xr-x  5 root root  4096 Sep 12 06:39 share
    
  7. Install the Cloudera Manager agent. If you have not already done so, first complete Step 1: Configure a Repository for Cloudera Manager.
  8. Create an image of the worker host. See the documentation for your virtualization environment for details.
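
Steps 4 and 5 can be scripted so that every parcel is checked and unpacked the same way. The following is a minimal sketch under stated assumptions: verify_parcel and unpack_parcel are hypothetical helper names, the .sha1 file is assumed to hold the checksum as its first field, and the parcel archive is assumed to contain a single top-level directory named PRODUCT-version. Creating /opt/cloudera/parcel-cache and setting ownership are left out and still need to be done as described above.

```shell
#!/bin/sh
# Sketch only: verify_parcel and unpack_parcel are hypothetical helpers.

# Succeed only if the SHA-1 of the parcel matches the first field of <parcel>.sha1.
# Usage: verify_parcel <parcel file>
verify_parcel() {
    vp_expected=$(awk '{print $1; exit}' "$1.sha1") || return 1
    vp_actual=$(sha1sum "$1" | awk '{print $1}')
    [ -n "$vp_expected" ] && [ "$vp_expected" = "$vp_actual" ]
}

# Extract a parcel, link it under its product name, and add the marker file.
# Usage: unpack_parcel <parcel file> [parcel root]
unpack_parcel() {
    up_file=$1
    up_root=${2:-/opt/cloudera/parcels}
    # Assumption: the first archive entry is the top-level <PRODUCT>-<version> directory.
    up_dir=$(tar -tzf "$up_file" | head -n 1 | cut -d/ -f1)
    [ -n "$up_dir" ] || return 1
    # Everything before the first hyphen is taken as the product name, e.g. CDH.
    up_product=${up_dir%%-*}
    mkdir -p "$up_root" || return 1
    chmod 755 "$up_root"
    tar -zxf "$up_file" -C "$up_root" || return 1
    # Replace any existing link so that re-running the function is harmless.
    ln -sfn "$up_root/$up_dir" "$up_root/$up_product"
    touch "$up_root/$up_dir/.dont_delete"
}
```

Run as the user that the Cloudera Manager agent runs as, a call such as the following would unpack only a parcel that passes the checksum comparison:

    verify_parcel /tmp/parcels/parcelfile && unpack_parcel /tmp/parcels/parcelfile

Here /tmp/parcels stands in for whichever temporary directory was used in step 2.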

Instantiating a Worker Host

  1. Instantiate the Cloudera worker host image.
  2. Edit the Cloudera Manager agent configuration file and set the server_host and server_port properties to reference the Cloudera Manager server host.
  3. If necessary, perform additional steps to configure TLS/SSL. See Manually Configuring TLS Encryption for Cloudera Manager.
  4. Start the agent service:
    service cloudera-scm-agent start
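
When many worker hosts are launched from one image, editing server_host and server_port by hand does not scale, so the change in step 2 can be scripted. The following is a minimal sketch: set_agent_server is a hypothetical helper name, and the default path /etc/cloudera-scm-agent/config.ini is an assumption, since the step above does not name the file. The properties are assumed to appear as uncommented key=value lines.

```shell
#!/bin/sh
# Sketch only: set_agent_server is a hypothetical helper, and the default
# config path is an assumption that should be checked on the host.
# Usage: set_agent_server <server host> <server port> [config file]
set_agent_server() {
    sa_host=$1
    sa_port=$2
    sa_file=${3:-/etc/cloudera-scm-agent/config.ini}
    if [ ! -f "$sa_file" ]; then
        echo "config file not found: $sa_file" >&2
        return 1
    fi
    sa_tmp=$(mktemp) || return 1
    # Rewrite only the two properties; every other line is copied unchanged.
    sed -e "s|^[[:space:]]*server_host[[:space:]]*=.*|server_host=$sa_host|" \
        -e "s|^[[:space:]]*server_port[[:space:]]*=.*|server_port=$sa_port|" \
        "$sa_file" > "$sa_tmp" || { rm -f "$sa_tmp"; return 1; }
    # Copy back instead of moving so that the file keeps its owner and mode.
    cat "$sa_tmp" > "$sa_file"
    rm -f "$sa_tmp"
}
```

A launch script might call it before starting the agent, where cm.example.com and 7182 are placeholder values for the Cloudera Manager server host and port:

    set_agent_server cm.example.com 7182 && service cloudera-scm-agent start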