Configure the processor for merging records
Learn how you can configure the MergeRecord processor for your S3 ingest data flow. You can use it to merge together multiple record-oriented flow files into a large flow file that contains all records of your Kafka data input.
-
Launch the Configure Processor window, by right clicking the
MergeRecord processor and selecting
Configure.This gives you a configuration dialog with the following tabs: Settings, Scheduling, Properties, Comments.
- Configure the processor according to the behavior you expect in your data flow.
- When you have finished configuring the options you need, save the changes by
clicking the Apply button.
Make sure that you set all required properties, as you cannot start the processor until all mandatory properties have been configured.
In this example the following settings and properties are used:
Table 1. MergeRecord processor scheduling Scheduling Description Example value for ingest data flow Automatically Terminate Relationships original Table 2. MergeRecord processor properties Property Description Example value for ingest data flow RecordReader
Specify the Controller Service to use for reading incoming data.
ReadCustomerCSVRoot
RecordWriter
Specify the Controller Service to use for writing out the records.
CSVRecordSetWriter
Merge Strategy
Specify the algorithm used to merge records.
The Bin-Packing Algorithm generates a FlowFile populated by arbitrarily chosen FlowFiles.
Bin-Packing Algorithm
Minimum Number of Records
Specify the minimum number of records to include in a bin.
900
Maximum Number of Records
Specify the maximum number of Records to include in a bin.
1000
For a complete list of MergeRecord properties, see the processor documentation.