Create and configure controller services for your data flow
Learn how to create and configure controller services for the CDW Iceberg ingest data flow.
You can add controller services that provide shared services to the processors in your data flow. You will use these controller services later when you configure your processors.
- Hive Catalog Controller Service
- Kerberos User Service
See below for property details.
Configure the Hive Catalog Controller Service
Property | Description | Example value for ingest data flow
---|---|---
Hive Metastore URI | Provide the URI of the metastore location. |
Default Warehouse Location | Provide the default warehouse location in the HDFS file system. |
Hadoop Configuration Resources | Add a comma-separated list of Hadoop configuration files, such as hive-site.xml and core-site.xml for Kerberos. Include the full path to each file so that NiFi can access the configuration resources at the specified path. | /etc/hive/conf.cloudera.data_context_connector-975b/hive-site.xml,/etc/hadoop/conf.cloudera.stub_dfs/core-site.xml
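If you prefer to script this step instead of using the NiFi UI, the following minimal sketch creates the service through the NiFi REST API. It is an illustration, not the documented procedure: the `nifi-host` endpoint, metastore URI, and warehouse location are placeholders, and the component type and property keys are assumptions that can differ between NiFi versions, so verify them against your NiFi instance before use.

```python
import requests

NIFI_API = "https://nifi-host:8443/nifi-api"  # placeholder NiFi endpoint
ROOT_PG = "root"  # alias for the root process group in the NiFi REST API

# Payload for the Hive Catalog controller service. The component type and
# property keys below are assumptions based on the NiFi Iceberg bundle and
# may vary by NiFi version; confirm them in your environment.
payload = {
    "revision": {"version": 0},
    "component": {
        "type": "org.apache.nifi.services.iceberg.HiveCatalogService",
        "name": "Hive Catalog Controller Service",
        "properties": {
            "hive-metastore-uri": "thrift://metastore-host:9083",
            "warehouse-location": "/warehouse/tablespace/managed/hive",
            "hadoop-config-resources": (
                "/etc/hive/conf.cloudera.data_context_connector-975b/hive-site.xml,"
                "/etc/hadoop/conf.cloudera.stub_dfs/core-site.xml"
            ),
        },
    },
}

response = requests.post(
    f"{NIFI_API}/process-groups/{ROOT_PG}/controller-services",
    json=payload,
    verify=False,  # illustration only; use proper TLS verification in practice
)
response.raise_for_status()
print(f"Created controller service: {response.json()['id']}")
```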
Configure the Kerberos User Service
Use the Kerberos Password User Service so that you do not need to distribute a keytab file across the NiFi nodes of the cluster.
As a best practice, create a dedicated Machine User in the control plane for this specific use case. A dedicated user lets you configure use case specific policies in Ranger and gives you better control in multi-tenant environments with many use cases.
Property | Description | Example value for ingest data flow
---|---|---
Kerberos Principal | Specify the user name to use for Kerberos authentication. Set this property to your CDP workload user name. | srv_nifi_to_iceberg
Kerberos Password | Provide the password to use for Kerberos authentication. Set this property to your CDP workload password. | password (sensitive value)
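The Kerberos User Service can be created the same way. In this sketch the component type and property names are again assumptions to verify against your NiFi version; the second call enables the new service so that processors can reference it.

```python
import requests

NIFI_API = "https://nifi-host:8443/nifi-api"  # placeholder NiFi endpoint

# Payload for the Kerberos Password User Service. The component type and
# property names are assumptions; confirm them against your NiFi version.
payload = {
    "revision": {"version": 0},
    "component": {
        "type": "org.apache.nifi.kerberos.KerberosPasswordUserService",
        "name": "Kerberos User Service",
        "properties": {
            "Kerberos Principal": "srv_nifi_to_iceberg",
            "Kerberos Password": "********",  # supply your CDP workload password
        },
    },
}

created = requests.post(
    f"{NIFI_API}/process-groups/root/controller-services",
    json=payload,
    verify=False,  # illustration only; enable TLS verification in practice
).json()

# Enable the service so that processors can use it.
requests.put(
    f"{NIFI_API}/controller-services/{created['id']}/run-status",
    json={"revision": created["revision"], "state": "ENABLED"},
    verify=False,
)
```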