Hadoop Authentication with Kerberos for Cloudera Data Science Workbench
Cloudera Data Science Workbench users can authenticate themselves using Kerberos against the cluster KDC defined in the host's /etc/krb5.conf file. Cloudera Data Science Workbench does not assume that your Kerberos principal is always the same as your login information. Therefore, you will need to make sure Cloudera Data Science Workbench knows your Kerberos identity when you sign in.
To authenticate against your cluster’s Kerberos KDC, go to the top-right dropdown menu, click Upload Keytab to upload the keytab file directly to Cloudera Data Science Workbench. Once successfully authenticated, Cloudera Data Science Workbench uses your stored credentials to ensure that you are secure when running your workloads.
, and enter your Kerberos principal. To authenticate, either enter your password or clickWhen you authenticate with Kerberos, Cloudera Data Science Workbench will store your keytab in an internal database. When you subsequently run an engine, the keytab is used by a Cloudera Data Science Workbench sidecar container to generate ticket-granting tickets for use by your code. Ticket-granting tickets allow you to access resources such as Spark, Hive, and Impala, on Kerberized CDH clusters.
While you can view your current ticket-granting ticket by typing klist in an engine terminal, there is no way for you or your code to view your keytab. This prevents malicious code and users from stealing your keytab.
UI Behavior for Non-Kerberized Clusters
The contents of the Hadoop Authentication tab change depending on whether the cluster is kerberized. For a secure cluster with Kerberos enabled, the Hadoop Authentication tab displays a Kerberos section with fields to enter your Kerberos principal and username. However, if Cloudera Data Science Workbench cannot detect a krb5.conf file on the host, it will assume the cluster is not kerberized, and the Hadoop Authentication tab will display Hadoop Username Override configuration instead.
For a non-kerberized cluster, by default, your Hadoop username will be set to your Cloudera Data Science Workbench login username. To override this default and set an alternative HADOOP_USER_NAME, go to the Hadoop Username Override setting at .
If the Hadoop Authentication tab is incorrectly displaying Kerberos configuration fields for a non-kerberized cluster, make sure the krb5.conf file is not present on the host running Cloudera Data Science Workbench. If you do find any instances of krb5.conf on the host, run cdsw stop, remove the krb5.conf file(s), and run cdsw start. You should now see the expected Hadoop Username Override configuration field.
Limitations
-
Cloudera Data Science Workbench only supports Active Directory and MIT KDCs. PowerBroker-equipped Active Directory is not supported.
-
Cloudera Data Science Workbench does not support the use of Kerberos plugin modules in krb5.conf.