Configuring Kerberos

Depending on the dataflow that is used, you may need to enable Kerberos authentication in order to interact with some service(s).

To enable Kerberos, you must provide an appropriate krb5.conf file. This file tells the DataFlow Function, among other things, where the Key Distribution Center (KDC) is located and the Kerberos Realm to use.

Depending on the data flow that is used, you may need to enable Kerberos authentication in order to interact with some service(s). To enable Kerberos, you must provide an appropriate krb5.conf file. This file tells the DataFlow Function, among other things, where the Key Distribution Center (KDC) is located and the Kerberos Realm to use.

To specify this, the krb5.conf must be made available to DataFlow Function. You can do this by using an Azure Storage Container. See the corresponding section to learn more about this.

Additionally, you must tell DataFlow Function where it can find this krb5.conf file. To do this, add an environment variable to your Lambda function. The environment variable must have the name KRB5_FILE and the value must be the fully qualified filename of the krb5.conf file. For example, it might use /home/resources/krb5.conf.

In order to use Keytabs in your data flow, it will also be important to provide the necessary keytab files to the DataFlow Function. This is accomplished in the same manner, using a Lambda Layer or an S3 bucket. While the krb5.conf file is a system configuration, the keytab is not. The keytab will be configured in the data flow on a per-component basis. For example, one Processor may use keytab nifi-keytab-1 while some Controller Service makes use of keytab hive-keytab-4. The location of these files may change from deployment to deployment. For example, for an on-premises deployment, keytabs may be stored in /etc/security/keytabs while in DataFlow Function, they may be made available in /home/resources.

So it is better to use Parameter Contexts in order to parameterize the location of the keytab files. This is a good practice for any file-based resource that will need to be made available to a deployment. In this case, because there is a parameter referencing the location, we need to provide DataFlow Function with a value for that parameter. For more information on this, see Parameters.