2. Install Implications for Deploying Secure Hadoop Clusters

Security in Hadoop

With Hadoop’s new security features and its integration with Kerberos, a cluster can verify that users are who they claim to be and ensure that they can access only the data and resources they are authorized to use. This lets organizations grant finer-grained access to information and reduce operational overhead by consolidating separate clusters into one.

Secure Hadoop clusters provide the following protections:

  • Prevent unauthorized access to HDFS and MapReduce communication

  • Prevent unauthorized access to the jobs submitted through Oozie

  • Prevent fraudulent servers from accessing your Hadoop cluster

  • Prevent impersonation attacks

  • Prevent access to root accounts

Deployment options for a secure Hadoop cluster

Depending on your environment, you can install a secure Hadoop cluster in one of two ways:

  • OPTION I: Set up a new Kerberos Key Distribution Center

    Use the auxiliary script setupKerberos.sh. This script performs the following tasks:

    • Sets up a new Key Distribution Center (KDC) on the host machine specified in kdc­server file.

    • Creates service keytabs for all processes: NameNode, JobTracker, Secondary NameNode, DataNodes, TaskTrackers, HBase Master, HBase RegionServer, and Hive Metastore.

    • Places all the service keytabs (for their respective hosts) under the /etc/security/keytabs directory.

    • Generates user keytabs for the HDFS and Smoke Test users and places these keytab files in the /tmp directory on all nodes.

  • OPTION II: Use an existing Kerberos Key Distribution Center

    Alternatively, you can configure your Hadoop cluster to use a Kerberos Key Distribution Center that already exists in your environment.
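
Whichever option you choose, it can be useful to confirm that the service keytabs on a host are present and not broadly accessible. The following is a minimal sketch of such a check, not part of the installer. It assumes that keytab files end in ".keytab" and that only the file owner should have access; both are illustrative assumptions, and some deployments deliberately grant group read access. The function name check_keytabs is hypothetical.

```python
import os
import stat


def check_keytabs(directory, suffix=".keytab"):
    """Return (path, problem) pairs for keytab files that look unsafe.

    Assumption: keytab files are identified by the given suffix, and
    any group/other permission bit is treated as a problem.
    """
    problems = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(suffix):
            continue  # ignore files that are not keytabs
        path = os.path.join(directory, name)
        info = os.stat(path)
        if info.st_size == 0:
            problems.append((path, "empty file"))
        # Any permission bit set for group or others is flagged.
        if info.st_mode & (stat.S_IRWXG | stat.S_IRWXO):
            problems.append((path, "accessible by group or others"))
    return problems


if __name__ == "__main__":
    keytab_dir = "/etc/security/keytabs"  # directory named in the text above
    if os.path.isdir(keytab_dir):
        for path, problem in check_keytabs(keytab_dir):
            print(f"{path}: {problem}")
```

Run on each node, the script prints one line per finding and prints nothing when every keytab passes. Reading the directory may require elevated privileges.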

