Security in Hadoop
With Hadoop’s new security features and its integration with Kerberos, a cluster can verify that users are who they claim to be and ensure that they access only the data and resources they are authorized to use. This lets organizations grant finer-grained access to information and reduce operational overhead by consolidating their separate clusters.
Secure Hadoop clusters provide the following protections:
Prevent unauthorized access to HDFS and MapReduce communication
Prevent unauthorized access to the jobs submitted through Oozie
Prevent fraudulent servers from accessing your Hadoop cluster
Prevent impersonation attacks
Prevent access to root accounts
Deployment options for a secure Hadoop cluster
Depending on your environment, there are two options for installing a secure Hadoop cluster:
OPTION I: Set up a new Kerberos Key Distribution Center
Use the auxiliary script setupKerberos.sh. This script performs the following tasks:
Sets up a new Key Distribution Center (KDC) on the host machine specified in the kdcserver file.
Creates service keytabs for all processes: NameNode, JobTracker, Secondary NameNode, DataNodes, TaskTrackers, HBase Master, HBase RegionServer, and Hive Metastore.
Places all the service keytabs (for the respective hosts) under the /etc/security/keytabs directory.
Generates user keytabs for the HDFS and Smoke Test users and places these keytab files in the /tmp directory on all the nodes.
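After the script has run, it can be useful to confirm that the service keytabs actually landed on each host. The following is a minimal sketch, not part of setupKerberos.sh: the function name audit_keytabs is hypothetical, and the ".keytab" file suffix and the rule that keytabs should not be accessible to "other" users are assumptions made for illustration. Only the /etc/security/keytabs directory comes from the description above.

```python
import stat
from pathlib import Path

# Directory named in the task list above.
KEYTAB_DIR = Path("/etc/security/keytabs")


def audit_keytabs(keytab_dir=KEYTAB_DIR):
    """Return a list of (path, problem) pairs for keytabs that look wrong.

    Assumption: keytab files end in ".keytab". Adjust the glob pattern
    if your deployment names them differently.
    """
    keytab_dir = Path(keytab_dir)
    if not keytab_dir.is_dir():
        return [(keytab_dir, "directory missing")]

    keytabs = sorted(keytab_dir.glob("*.keytab"))
    if not keytabs:
        return [(keytab_dir, "no keytab files found")]

    problems = []
    for path in keytabs:
        info = path.stat()
        if info.st_size == 0:
            problems.append((path, "empty file"))
        # Assumed policy: a keytab holds secret keys, so users outside
        # the owner and group should have no access to it.
        if stat.S_IMODE(info.st_mode) & stat.S_IRWXO:
            problems.append((path, "readable or writable by other users"))
    return problems


if __name__ == "__main__":
    for path, problem in audit_keytabs():
        print(f"{path}: {problem}")
```

Running the script on a node prints one line per problem and nothing when every keytab passes these checks. It does not validate the principals stored inside a keytab; the standard klist utility can list those.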
OPTION II: Add an existing Kerberos Key Distribution Center
Alternatively, you can configure your Hadoop cluster to use an existing Kerberos Key Distribution Center.
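Before pointing the cluster at an existing KDC, you may want to confirm that the cluster nodes can reach it over the network. The sketch below is one possible check, under stated assumptions: the function name kdc_reachable and the host name in the usage comment are hypothetical, and the check assumes the KDC accepts TCP connections on port 88, the standard Kerberos port. Some KDCs are configured for UDP only, in which case this check reports a failure even though the KDC is running.

```python
import socket


def kdc_reachable(host, port=88, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds.

    Assumption: the KDC listens on TCP port 88. This tests network
    reachability only, not whether any principal can authenticate.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example usage (hypothetical host name):
# kdc_reachable("kdc.example.com")
```

A True result only shows that something is accepting connections on that port; a full test would request a ticket with kinit against a known principal.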