Configure the processor for your data target
Learn how you can configure the data target processor for your ADLS ingest data flow. This example assumes that you are moving data to Azure Data Lake Storage and shows you how to configure the corresponding processors.
- Launch the Configure Processor window by right-clicking the processor you added for writing data to ADLS (PutHDFS or PutAzureDataLakeStorage) and selecting Configure.
This gives you a configuration dialog with the following tabs: Settings, Scheduling, Properties, Comments.
- Configure the processor according to the behavior you expect in your data
flow.
Make sure that you set all required properties, as you cannot start the processor until all mandatory properties have been configured.
- When you have finished configuring the options you need, save the changes by
clicking the Apply button.
In this example, the following properties are used for PutHDFS:
Table 1. PutHDFS processor properties

| Property | Description | Example value for ingest data flow |
|---|---|---|
| Hadoop Configuration Resources | Specify the path to the core-site.xml configuration file. Make sure that the default file system (fs.defaultFS) points to the ADLS file system you are writing to. | /etc/hadoop/conf.cloudera.core_settings/core-site.xml |
| Kerberos Principal | Specify the Kerberos principal (your CDP username) to authenticate against CDP. | srv_nifi-adls-ingest |
| Kerberos Password | Provide the password that should be used for authenticating with Kerberos. | password |
| Directory | Provide the path to your target directory in Azure, expressed in an abfs-compatible path format. | abfs://<YourFileSystem>@<YourStorageAccount>.dfs.core.windows.net/<TargetPathWithinFileSystem> |
You can leave all other properties at their default values.
For a complete list of PutHDFS properties, see the processor documentation.
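
Before you start the processor, it can help to check that the Directory value and the default file system in core-site.xml refer to the same storage location. The following is a minimal Python sketch of such a check, not part of NiFi or CDP. It assumes the standard Hadoop configuration layout (a `configuration` element containing `property` entries with `name` and `value`), where the default file system is stored under `fs.defaultFS`. The helper names `build_abfs_uri` and `read_default_fs`, and the sample file system and storage account names, are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_abfs_uri(file_system: str, storage_account: str, target_path: str = "") -> str:
    """Build an abfs:// URI in the format shown for the Directory property."""
    base = f"abfs://{file_system}@{storage_account}.dfs.core.windows.net"
    path = target_path.strip("/")
    return f"{base}/{path}" if path else f"{base}/"


def read_default_fs(core_site_xml: str):
    """Return the fs.defaultFS value from core-site.xml content, or None if absent."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "fs.defaultFS":
            return prop.findtext("value", default="").strip()
    return None


# Hypothetical core-site.xml content; the names are placeholders, not real resources.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>abfs://myfilesystem@mystorageaccount.dfs.core.windows.net</value>
  </property>
</configuration>"""

default_fs = read_default_fs(SAMPLE_CORE_SITE)
directory = build_abfs_uri("myfilesystem", "mystorageaccount", "/ingest/raw")

# True when the Directory value sits under the default file system.
same_store = directory.startswith(default_fs.rstrip("/") + "/")

print("fs.defaultFS:", default_fs)
print("Directory:   ", directory)
print("Same storage location:", same_store)
```

To run the check against a real cluster, read the file named in Hadoop Configuration Resources and pass its content to `read_default_fs` in place of the sample string.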
If you want to use the PutAzureDataLakeStorage processor to store the data in ADLS, you have to configure your Azure connection in a secure way by providing:
- Storage Account Name - the name of the Azure storage account that holds the containers you want to write to
- Storage Account Key or SAS Token - the credential that allows you to write data to the Azure storage account
- Filesystem Name - the name of the file system in your Azure storage account that you want to write to
- Directory Name - the name of the folder within your file system that you want to write to
Make sure that you set all required properties, as you cannot start the processor until all mandatory properties have been configured.
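
If you keep the PutAzureDataLakeStorage values in a script or template before entering them in the Properties tab, a small pre-flight check can catch missing entries early. The sketch below is an illustration only and does not call any NiFi API. The property names are taken from the list above, the helper `find_problems` and the sample values are hypothetical, and whether the processor itself treats each value as mandatory may depend on your NiFi version.

```python
# Values that must be present, using the property names from the list above.
REQUIRED = ("Storage Account Name", "Filesystem Name", "Directory Name")

# At least one of these credentials must be provided.
CREDENTIALS = ("Storage Account Key", "SAS Token")


def find_problems(props: dict) -> list:
    """Return a list of messages describing missing values (empty if none)."""
    problems = [
        f"missing: {name}"
        for name in REQUIRED
        if not props.get(name, "").strip()
    ]
    if not any(props.get(name, "").strip() for name in CREDENTIALS):
        problems.append("missing: Storage Account Key or SAS Token")
    return problems


# Hypothetical values for illustration; never hard-code real credentials.
example = {
    "Storage Account Name": "mystorageaccount",
    "SAS Token": "<YourSasToken>",
    "Filesystem Name": "myfilesystem",
    "Directory Name": "ingest/raw",
}

for message in find_problems(example):
    print(message)
```

An empty result means that all four values from the list are present. It does not verify that the credential is valid or that the file system exists.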