HDFS client packages
Use these steps to install the HDFS client packages and configure HDFS so that an unmanaged node can connect to a Cloudera cluster.
Prerequisites
- Infrastructure: Unmanaged host running one of these operating systems:
  - RHEL or RHEL-compatible (using yum)
  - SUSE Linux Enterprise Server (SLES) (using zypper)
  - Ubuntu or Debian (using apt)
- Java Development Kit: Full JDK (not just JRE) installed on the unmanaged node, matching the version used on the managed cluster.
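The prerequisites above can be checked with a short script. This is a minimal sketch, not an official tool: the function names `detect_pkg_manager` and `has_full_jdk` are made up for illustration, and the JDK check assumes the JDK's `bin` directory is on the `PATH`. Comparing the JDK version against the managed cluster still has to be done by hand.

```shell
#!/bin/sh
# Sketch: report the available package manager and whether a full JDK is present.

# Print the first package manager found among yum, zypper, and apt-get.
detect_pkg_manager() {
    for pm in yum zypper apt-get; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    echo "none"
    return 1
}

# A full JDK ships the javac compiler; a JRE-only installation does not.
has_full_jdk() {
    command -v javac >/dev/null 2>&1
}

echo "Package manager: $(detect_pkg_manager)"
if has_full_jdk; then
    # Print the compiler version so it can be compared with the cluster's JDK.
    javac -version
else
    echo "javac not found: install a full JDK, not just a JRE" >&2
fi
```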
Step 1: Set up Kerberos and Java Key Stores
If your cluster uses Kerberos for authentication (highly recommended for secure environments):
- Install Java keystores and truststores.
  Import your Java keystore (.jks) and truststore files as required by your cluster security configuration.
- Validate the Kerberos client installation.
  Ensure the unmanaged node has the Kerberos utilities (krb5-workstation or the equivalent packages for your operating system) installed and configured to communicate with the cluster’s Kerberos KDC.
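One way to confirm both items is to check that the client tools exist, request a ticket, and list the truststore contents. The sketch below assumes the standard `kinit`, `klist`, and `keytool` commands; the principal `user@EXAMPLE.COM` and the truststore path are placeholders, and the helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: verify Kerberos client tools and inspect a truststore.

# Placeholder values: replace with your own principal and truststore path.
PRINCIPAL="${PRINCIPAL:-user@EXAMPLE.COM}"
TRUSTSTORE="${TRUSTSTORE:-/path/to/truststore.jks}"

# Print each command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the tools, obtain a ticket, show the ticket cache, and list the truststore.
verify_security_setup() {
    missing=$(missing_tools kinit klist keytool)
    if [ -n "$missing" ]; then
        echo "Missing tools: $missing" >&2
        return 1
    fi
    kinit "$PRINCIPAL" || return 1                      # prompts for the password
    klist || return 1                                   # shows the ticket just obtained
    keytool -list -keystore "$TRUSTSTORE" || return 1   # prompts for the store password
}
```

Run `verify_security_setup` interactively on the unmanaged node. A failure at `kinit` usually points to the Kerberos client configuration or to connectivity with the KDC rather than to HDFS itself.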
Step 2: Obtain and Copy Configuration Files
To ensure consistent configuration, copy the necessary service configuration files from a managed Cloudera Manager host to unmanaged nodes.
- On a managed node that has the related components installed, locate the /etc/hadoop/conf configuration directory (service configuration is typically stored under /etc).
- Copy this directory, preserving ownership and permissions, to the same location on the unmanaged node.
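The copy can be done with an archive so that file modes survive the transfer. This is a sketch under assumptions: it uses GNU tar, the archive path and helper names are hypothetical, and the `-h` flag is included in case /etc/hadoop/conf is a symbolic link on your managed node. Run both helpers as root; GNU tar restores ownership by default only when extracting as root, and the owning users and groups must already exist on the unmanaged node.

```shell
#!/bin/sh
# Sketch: move the Hadoop client configuration between nodes with tar.

# Parent of the conf directory and a hypothetical archive location.
CONF_PARENT="${CONF_PARENT:-/etc/hadoop}"
ARCHIVE="${ARCHIVE:-/tmp/hadoop-conf.tar.gz}"

# On the managed node: pack the conf directory (-h follows symbolic links).
pack_conf() {
    tar -C "$CONF_PARENT" -czhf "$ARCHIVE" conf
}

# On the unmanaged node: unpack to the same location (-p keeps file modes).
unpack_conf() {
    mkdir -p "$CONF_PARENT" &&
    tar -C "$CONF_PARENT" -xzpf "$ARCHIVE"
}
```

Call `pack_conf` on the managed node, transfer the archive with a tool such as scp, and then call `unpack_conf` on the unmanaged node.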
Step 3: HDFS-Specific Configuration and Setup
- Install the full Java Development Kit (JDK) on the unmanaged node, matching the version used in the managed cluster. The JDK is required for HDFS to function properly.
- Use the unmanaged node’s default package manager to install the HDFS client along with its dependencies:
  - For RHEL/CentOS:
    sudo yum install hdfs-client
  - For Ubuntu/Debian:
    sudo apt-get install hdfs-client
  - For SLES:
    sudo zypper install hdfs-client
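The per-platform commands above can be wrapped in a small helper, followed by a quick check that the client works. This is a sketch only: `install_cmd` and `verify_hdfs_client` are hypothetical names, the package name is taken from the commands above, and the verification assumes the configuration from Step 2 is in place and, on a Kerberos-enabled cluster, that a valid ticket exists.

```shell
#!/bin/sh
# Sketch: choose the install command for a package manager, then verify the client.

# Print the install command for the package manager named in $1.
install_cmd() {
    case "$1" in
        yum)     echo "sudo yum install hdfs-client" ;;
        apt-get) echo "sudo apt-get install hdfs-client" ;;
        zypper)  echo "sudo zypper install hdfs-client" ;;
        *)       echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

# After installation: print the client version, then list the HDFS root directory.
verify_hdfs_client() {
    hdfs version || return 1
    hdfs dfs -ls / || return 1
}
```

For example, `install_cmd yum` prints the RHEL/CentOS command so it can be reviewed before it is run, and `verify_hdfs_client` confirms that the node can reach the cluster.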
- For RHEL/CentOS: