Depending on the data flow that is used, you may need to enable Kerberos authentication to interact with some service(s).
To enable Kerberos, you must provide an appropriate krb5.conf file. This file tells the DataFlow function where the Key Distribution Center (KDC) is located and which Kerberos Realm to use.
To specify this, make the krb5.conf file available to DataFlow Function. You can do it by using a GCS bucket. See the corresponding section to learn more about this.
Additionally, you must tell DataFlow Function where it can find this
krb5.conf file. To do this, add an environment variable to your
Lambda function. The environment variable must have the name KRB5_FILE and the value must be
the fully qualified file name of the krb5.conf file. For example, it
In order to use Keytabs in your data flow, it is also important to provide the
necessary keytab files to the DataFlow Function. You can accomplish it by using a GCS
bucket. While the krb5.conf file is a system configuration, the keytab
is not. The keytab is configured in the data flow on a per-component basis. For example, one
Processor may use keytab
nifi-keytab-1 while some Controller Service makes
use of keytab
hive-keytab-4. The location of these files may change from
deployment to deployment. For example, for an on-premises deployment, keytabs may be stored
/etc/security/keytabs while in DataFlow Function, they may be made
available in /tmp/resources.
So it is better to use Parameter Contexts to parameterize the location of the keytab files. This is a good practice for any file-based resource that needs to be made available to a deployment. In this case, because there is a parameter referencing the location, you need to provide DataFlow Function with a value for that parameter. For more information, see Parameters.