Configure the processor for your data target
Learn how to configure a data target processor for the S3 ingest data flow.
You can set up a data flow to move data to many locations. This example assumes that you are moving data to AWS S3 and shows you how to configure the corresponding processors. If you want to move data to a different location, review our other use cases in the Cloudera Data Flow for Data Hub library.
- Launch the Configure Processor window by right-clicking the processor you added for writing data to S3 (PutHDFS or PutS3Object) and selecting Configure.
This opens a configuration dialog with the following tabs: Settings, Scheduling, Properties, and Comments.
- Configure the processor according to the behavior you expect in your data flow.
Make sure that you set all required properties, as you cannot start the processor until all mandatory properties have been configured.
- When you have finished configuring the options you need, save the changes by clicking the Apply button.
In this example, the following properties are used for PutHDFS:
Table 1. PutHDFS processor properties
Property | Description | Example value for ingest data flow
Hadoop Configuration Resources
Specify the path to the
Make sure that the default file system (fs.defaultFS) points to the S3 bucket you are writing to.
Kerberos Principal
Specify the Kerberos principal (your username) to authenticate against CDP.
Kerberos Password
Provide the password that should be used for authenticating with Kerberos.
Directory
Provide the path to your target directory in AWS, expressed as an S3A-compatible path.
You can leave all other properties at their default values.
For a complete list of PutHDFS properties, see the processor documentation.
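Before starting the processor, it can help to sanity-check the two S3-related values described above. The following Python sketch is only an illustration and is not part of the product: it assumes the Hadoop configuration resource is a Hadoop-style XML file that sets fs.defaultFS, and that the Directory value is a full s3a:// URI. The sample configuration, the bucket name example-bucket, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urlparse

# Hypothetical stand-in for a Hadoop configuration resource.
# "example-bucket" is a placeholder, not a value from this guide.
SAMPLE_CONFIG = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>s3a://example-bucket</value>
  </property>
</configuration>
"""


def default_fs(config_xml: str) -> Optional[str]:
    """Return the fs.defaultFS value from Hadoop-style XML, or None if unset."""
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == "fs.defaultFS":
            return (prop.findtext("value") or "").strip()
    return None


def is_s3a_path(path: str, bucket: str) -> bool:
    """Check that a path uses the s3a scheme and names the expected bucket."""
    parsed = urlparse(path)
    return parsed.scheme == "s3a" and parsed.netloc == bucket


if __name__ == "__main__":
    fs = default_fs(SAMPLE_CONFIG)
    print("Default file system:", fs)
    print("Points to bucket:", fs is not None and is_s3a_path(fs, "example-bucket"))
    print("Directory is S3A:", is_s3a_path("s3a://example-bucket/ingest/raw", "example-bucket"))
```

A check like this only confirms that the values are well formed. It does not verify that the bucket exists or that the Kerberos principal has access to it.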
If you want to use the PutS3Object processor to store the data in S3, you have to configure your S3 connection in a secure way. You can do this in one of the following ways:
- Add the AWS access key and secret access key as properties of the processor.
- Configure these access keys in a credentials file and add that file as a property of the processor.
- Use an AWS Credentials provider service and configure it with the required information for authenticating against AWS.
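As an illustration of the credentials file option, the Python sketch below writes a small properties-style file that only its owner can read. It rests on assumptions not stated in this guide: that the file uses the accessKey and secretKey entries of the AWS SDK properties format, and that the placeholder key values and the helper name write_credentials_file are purely hypothetical. Check the processor documentation for the exact format your version expects.

```python
import os
import tempfile


def write_credentials_file(path: str, access_key: str, secret_key: str) -> None:
    """Write a properties-style credentials file, created with owner-only permissions."""
    # Assumption: the accessKey/secretKey entry names follow the AWS SDK properties format.
    # Mode 0o600 applies when the file is newly created, so the secret is never
    # written to a file that group or other users can read.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"accessKey={access_key}\n")
        handle.write(f"secretKey={secret_key}\n")


if __name__ == "__main__":
    # Placeholder values only; never commit real keys to source control.
    target = os.path.join(tempfile.mkdtemp(), "aws-credentials.properties")
    write_credentials_file(target, "EXAMPLE_ACCESS_KEY", "EXAMPLE_SECRET_KEY")
    print("Wrote", target)
```

Keeping the keys in a file or in a controller service, rather than directly in processor properties, makes it easier to rotate them without editing the flow itself.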