Configuring Authentication in CDH Using the Command Line
The security features in CDH 5 enable Hadoop to prevent malicious user impersonation. The Hadoop daemons use Kerberos to authenticate users on all remote procedure calls (RPCs). Group resolution is performed on the Hadoop master nodes (NameNode, JobTracker, and ResourceManager) so that users cannot manipulate their own group membership. Map tasks run under the account of the user who submitted the job, which isolates them from tasks submitted by other users. In addition to these features, new authorization mechanisms have been introduced to HDFS and MapReduce to give finer control over user access to data.
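The point about resolving groups on the master nodes can be illustrated with a minimal sketch. It is not Hadoop code: the `SERVER_GROUPS` table, the user and group names, and the `authorize` function are all hypothetical, and the user name is assumed to have already been authenticated (in CDH, by Kerberos). The sketch only shows why a group list supplied by the client is never consulted.

```python
# Hypothetical server-side group directory; in a real cluster the master
# node would obtain this from its own configured group lookup, not a dict.
SERVER_GROUPS = {
    "alice": {"analysts", "hadoop-users"},
    "bob": {"etl"},
}


def authorize(authenticated_user, required_group, client_claimed_groups=()):
    """Return True if the user belongs to required_group.

    authenticated_user is assumed to be a name that has already been
    verified. client_claimed_groups models whatever the client says about
    its own membership; it is accepted as a parameter and deliberately
    ignored, so a user cannot grant themselves access by lying.
    """
    server_side_groups = SERVER_GROUPS.get(authenticated_user, set())
    return required_group in server_side_groups


if __name__ == "__main__":
    # alice really is an analyst according to the server.
    print(authorize("alice", "analysts"))
    # bob claims to be an analyst, but the server-side lookup disagrees.
    print(authorize("bob", "analysts", client_claimed_groups={"analysts"}))
```

Because the decision depends only on the server's own lookup, the second call is denied regardless of what the client asserts.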
The security features in CDH 5 meet the needs of most Hadoop customers because typically the cluster is accessible only to trusted personnel. In particular, Hadoop's current threat model assumes that users cannot:
- Have root access to cluster machines.
- Have root access to shared client machines.
- Read or modify packets on the network of the cluster.
Continue reading:
- Enabling Kerberos Authentication for Hadoop Using the Command Line
- Step 1: Install CDH 5
- Step 2: Verify User Accounts and Groups in CDH 5 Due to Security
- Step 3: If you are Using AES-256 Encryption, Install the JCE Policy File
- Step 4: Create and Deploy the Kerberos Principals and Keytab Files
- Step 5: Shut Down the Cluster
- Step 6: Enable Hadoop Security
- Step 7: Configure Secure HDFS
- Optional Step 8: Configuring Security for HDFS High Availability
- Optional Step 9: Configure secure WebHDFS
- Optional Step 10: Configuring a secure HDFS NFS Gateway
- Step 11: Set Variables for Secure DataNodes
- Step 12: Start up the NameNode
- Step 13: Start up a DataNode
- Step 14: Set the Sticky Bit on HDFS Directories
- Step 15: Start up the Secondary NameNode (if used)
- Step 16: Configure Either MRv1 Security or YARN Security
- Flume Authentication
- HBase Authentication
- HCatalog Authentication
- Hive Authentication
- HiveServer2 Security Configuration
- Enabling Kerberos Authentication for HiveServer2
- Encrypted Communication with Client Drivers
- Using LDAP Username/Password Authentication with HiveServer2
- Configuring LDAPS Authentication with HiveServer2
- Pluggable Authentication
- Trusted Delegation with HiveServer2
- HiveServer2 Impersonation
- Securing the Hive Metastore
- Disabling the Hive Security Configuration
- Hive Metastore Server Security Configuration
- Using Hive to Run Queries on a Secure HBase Server
- HttpFS Authentication
- Hue Authentication
- Impala Authentication
- Enabling Kerberos Authentication for Impala
- Enabling LDAP Authentication for Impala
- Using Multiple Authentication Methods with Impala
- Configuring Impala Delegation for Hue and BI Tools
- Llama Authentication
- Oozie Authentication
- Search Authentication
- ZooKeeper Authentication
- FUSE Kerberos Configuration
- Using kadmin to Create Kerberos Keytab Files
- Configuring the Mapping from Kerberos Principals to Short Names
- Enabling Debugging Output for the Sun Kerberos Classes