Processors
The current implementation of MiNiFi supports multiple processors. The "Processors" subsection is a list of these processors. Each processor must specify the properties below, which are the basic configuration common to all processor implementations. Make sure that every relationship of a processor is either included in its auto-terminated relationships list or used in a connection.
Property | Description |
---|---|
name | The name of what this processor will do. This is not used for any underlying implementation but solely for the users of this configuration and MiNiFi agent. |
id | The id of this processor. This can be omitted, but processors without an id must not share a name with any other processor, and connections must then specify the source and destination by name instead of by id. If set, it should be a filesystem-friendly value (regex: [A-Za-z0-9_-]+). |
class | The fully qualified Java class name of the processor to run. For example, for the standard TailFile processor it would be: org.apache.nifi.processors.standard.TailFile |
max concurrent tasks | The maximum number of tasks that the processor will use. |
scheduling strategy | The strategy for executing the processor. Valid options are CRON_DRIVEN or TIMER_DRIVEN. |
scheduling period | This property expects different input depending on the scheduling strategy selected. For the TIMER_DRIVEN scheduling strategy, this value is a time duration specified by a number followed by a time unit, for example 1 second or 5 mins. The default value of 0 sec means that the processor should run as often as possible as long as it has data to process. This is true for any time duration of 0, regardless of the time unit (for example 0 sec, 0 mins, 0 days). For an explanation of values that are applicable for the CRON_DRIVEN scheduling strategy, see the description of the CRON driven scheduling strategy in the scheduling tab section of the NiFi User documentation. |
penalization period | Specifies how long FlowFiles will be penalized. |
yield period | In the event the processor cannot make progress, it should yield, which prevents the processor from being scheduled to run for some period of time. That period of time is specified using this property. |
run duration nanos | If the processor supports batching, this property controls how long, in nanoseconds, the processor should be scheduled to run each time it is triggered. Smaller values give lower latency, while larger values give higher throughput. This period should typically only be set between 0 and 2000000000 (2 seconds). |
auto-terminated relationships list | A YAML list of the relationships to auto-terminate for the processor. |
annotation data | Some processors make use of "Annotation Data" in order to do more complex configuration, such as the Advanced portion of UpdateAttribute. This data will be unique to each implementing processor and more than likely will not be written out manually. |
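
As an illustration of how the properties in the table fit together, here is a minimal sketch of a single entry in the "Processors" list. The id and the specific values are hypothetical and chosen only for illustration. Processor-specific settings are omitted, so treat this as a fragment, not a complete configuration.

```yaml
Processors:
- name: TailFile                      # free-form label for the users of this configuration
  id: tail-app-log                    # hypothetical id; matches [A-Za-z0-9_-]+
  class: org.apache.nifi.processors.standard.TailFile
  max concurrent tasks: 1
  scheduling strategy: TIMER_DRIVEN
  scheduling period: 1 sec            # a number followed by a time unit
  penalization period: 30 sec
  yield period: 1 sec
  run duration nanos: 0               # typically between 0 and 2000000000 (2 seconds)
  auto-terminated relationships list:
  - success                           # assumes "success" is not used in any connection
```

If the "success" relationship were routed to another component through a connection instead, it would be left out of the auto-terminated relationships list.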