Sqoop client packages

Use these steps to install the Sqoop client packages and configure Sqoop to connect to a Cloudera cluster from an unmanaged node.

Prerequisites

  • Infrastructure: Unmanaged host running one of these operating systems:

    • RHEL or RHEL-compatible (using yum)

    • SUSE Linux Enterprise Server (SLES) (using zypper)

    • Ubuntu or Debian (using apt)

  • Java Development Kit: Full JDK (not just JRE) installed on the unmanaged node, matching the version used on the managed cluster.
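One way to check the JDK prerequisite is to look for the javac compiler, which a JRE-only installation does not ship. The following is a minimal sketch, not an official check: the function name is_full_jdk is hypothetical, and it assumes JAVA_HOME points at the Java installation you intend to use.

```shell
# Succeeds if DIR looks like a full JDK home (it contains both the java
# launcher and the javac compiler); fails for an empty path or a JRE-only home.
# is_full_jdk is a hypothetical helper name, not part of any Cloudera tooling.
is_full_jdk() {
  [ -n "$1" ] && [ -x "$1/bin/java" ] && [ -x "$1/bin/javac" ]
}

# Usage on the unmanaged node (assumes JAVA_HOME is set):
#   is_full_jdk "$JAVA_HOME" && echo "full JDK found"
#   "$JAVA_HOME/bin/java" -version    # compare with the output on a cluster host
```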

Step 1: Set up Kerberos and Java Key Stores

If your cluster uses Kerberos for authentication (highly recommended for secure environments):

  1. Install Java Key Stores and Trust Stores.

    Import the Java keystore (.jks) and truststore files that your cluster security configuration requires.

  2. Validate the Kerberos client installation.

    Ensure the unmanaged node has Kerberos utilities (krb5-workstation or appropriate packages) installed and configured to communicate with the cluster’s Kerberos KDC.
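The checks in this step can be sketched as a short shell session. The helper missing_tools is a hypothetical name, and the principal and truststore path shown in the comments are placeholders; substitute the values for your own realm and security configuration.

```shell
# Print the name of every listed command that is not on PATH, one per line.
# No output means all of them are installed.
# missing_tools is a hypothetical helper name.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage on the unmanaged node:
#   missing_tools kinit klist keytool
#   kinit sqoopuser@EXAMPLE.COM     # placeholder principal; prompts for a password
#   klist                           # should show a ticket for your realm
#   keytool -list -keystore /path/to/truststore.jks   # placeholder path; prompts for the store password
```

If kinit cannot reach the KDC, /etc/krb5.conf on the unmanaged node is a reasonable first place to look.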

Step 2: Obtain and Copy Configuration Files

To ensure consistent configuration, copy the necessary service configuration files from a host managed by Cloudera Manager to the unmanaged node.

  1. On a managed node with the related components installed, locate the following configuration directories (typically under /etc):

    • /etc/hadoop/conf

    • /etc/hive/conf (if using Hive with Sqoop)

    • /etc/hbase/conf (if using HBase with Sqoop)

  2. Copy these directories, preserving ownership and permissions, to the same locations on the unmanaged node.
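One possible way to carry out the copy is to archive the directories with tar on the managed node and unpack the archive as root on the unmanaged node. This is a sketch under assumptions: pack_conf_dirs is a hypothetical helper, the host name in the comments is a placeholder, and the archive is dereferenced (-h) because on some installations these paths are symbolic links.

```shell
# Archive whichever client configuration directories exist under ROOT
# (normally "/") into ARCHIVE. tar records file modes and owners in the archive.
# pack_conf_dirs is a hypothetical helper name.
pack_conf_dirs() {
  root=$1
  archive=$2
  dirs=""
  for d in etc/hadoop/conf etc/hive/conf etc/hbase/conf; do
    if [ -d "$root/$d" ]; then
      dirs="$dirs $d"
    fi
  done
  if [ -z "$dirs" ]; then
    echo "no configuration directories found under $root" >&2
    return 1
  fi
  # -h stores the files that symbolic links point to instead of the links.
  # $dirs is left unquoted on purpose so each path becomes its own argument.
  tar -C "$root" -czhf "$archive" $dirs
}

# Usage:
#   On the managed node:
#     sudo sh -c '. ./pack_conf_dirs.sh; pack_conf_dirs / /tmp/client-conf.tar.gz'
#     scp /tmp/client-conf.tar.gz admin@unmanaged.example.com:/tmp/   # placeholder host
#   On the unmanaged node (as root, tar restores owners and -p keeps modes):
#     sudo tar -C / -xzpf /tmp/client-conf.tar.gz
```

Ownership is restored by user and group name, so the relevant accounts need to exist on the unmanaged node for the result to match the managed node.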

Step 3: Sqoop-Specific Configuration and Setup

3.1 Install Java Development Kit (JDK)

  • Install the full JDK on the unmanaged node, matching the version used in the managed cluster.

  • Sqoop generates and compiles Java classes when it runs import and export jobs, so a JRE alone is not sufficient.

3.2 Install Sqoop and Dependencies

  • Use the unmanaged node’s default package manager to install the Sqoop client along with its dependencies:

    • For RHEL or RHEL-compatible systems:

      sudo yum install sqoop-client

    • For Ubuntu or Debian:

      sudo apt-get install sqoop-client

    • For SLES:

      sudo zypper install sqoop-client

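If you script the installation across several hosts, the choice of command can be derived from the ID field in /etc/os-release. The sketch below is illustrative only: sqoop_install_cmd is a hypothetical helper, the list of IDs is an assumption about which distributions you run, and it presumes the package repository that provides sqoop-client is already configured on the node.

```shell
# Print the install command from this section that matches an os-release ID.
# sqoop_install_cmd is a hypothetical helper name; extend the ID list as needed.
sqoop_install_cmd() {
  case "$1" in
    rhel|centos|rocky|almalinux|ol)
      echo "sudo yum install sqoop-client" ;;
    ubuntu|debian)
      echo "sudo apt-get install sqoop-client" ;;
    sles)
      echo "sudo zypper install sqoop-client" ;;
    *)
      echo "unsupported distribution: $1" >&2
      return 1 ;;
  esac
}

# Usage:
#   . /etc/os-release && sqoop_install_cmd "$ID"
#   sqoop version      # after installation, confirms the client is on PATH
```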
3.3 Install JDBC Drivers

Place the JDBC driver JAR files that Sqoop needs for your databases into:

/usr/lib/sqoop/lib/
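Placing a driver can be sketched as a small helper that copies the JAR with world-readable permissions. The name install_jdbc_driver, the driver file name, and the database host in the comments are all placeholders chosen for illustration.

```shell
# Copy a JDBC driver JAR into Sqoop's library directory with mode 644.
# DEST defaults to the directory named in this section.
# install_jdbc_driver is a hypothetical helper name.
install_jdbc_driver() {
  jar=$1
  dest=${2:-/usr/lib/sqoop/lib}
  case "$jar" in
    *.jar) ;;
    *) echo "not a .jar file: $jar" >&2; return 1 ;;
  esac
  if [ ! -f "$jar" ]; then
    echo "file not found: $jar" >&2
    return 1
  fi
  install -m 644 "$jar" "$dest/"
}

# Usage (run as root, since the target directory is normally root-owned):
#   install_jdbc_driver /tmp/mysql-connector-j.jar      # placeholder file name
# A quick connectivity check afterwards, with a placeholder host and user:
#   sqoop list-databases --connect jdbc:mysql://db.example.com:3306/ --username dbuser -P
```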