Adding Functionality to Apache NiFi

Reporting Processor Activity

Processors are responsible for reporting their activity so that users can understand what happens to their data. Processors should log events via the ComponentLog, which is accessible via the ProcessorInitializationContext or by calling the getLogger method of AbstractProcessor.

Additionally, Processors should use the ProvenanceReporter interface, obtained via the ProcessSession's getProvenanceReporter method. The ProvenanceReporter should be used whenever content is received from an external source or sent to an external location. It also has methods for reporting when a FlowFile is cloned, forked, or modified, for reporting when multiple FlowFiles are merged into a single FlowFile, and for associating a FlowFile with some other identifier. These functions are less critical to report, because the framework is able to detect these actions and emit appropriate events on the Processor's behalf. Even so, it is a best practice for the Processor developer to emit these events: doing so makes it explicit in the code that the events are being emitted, and it lets the developer attach additional details, such as the amount of time that the action took or pertinent information about the action that was taken.

If the Processor emits an event, the framework will not emit a duplicate event. Instead, it always assumes that the Processor developer knows better than the framework what is happening in the context of the Processor. The framework may, however, emit a different event. For example, if a Processor modifies both the content of a FlowFile and its attributes and then emits only an ATTRIBUTES_MODIFIED event, the framework will emit a CONTENT_MODIFIED event.

The framework will not emit an ATTRIBUTES_MODIFIED event if any other event is emitted for that FlowFile (either by the Processor or the framework). Every Provenance Event records the attributes of the FlowFile as they were before the event occurred as well as the attributes that resulted from processing that FlowFile, so a separate ATTRIBUTES_MODIFIED event is generally considered redundant and would make a rendering of the FlowFile lineage very verbose. It is, however, acceptable for a Processor to emit this event along with others if the event is considered pertinent from the perspective of the Processor.
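
The rules above for which events the framework adds on the Processor's behalf can be illustrated with a small toy model. This is only a sketch of the behavior as described in this section, not NiFi's actual implementation: the class ProvenanceRuleSketch, its EventType enum, and the frameworkEvents method are hypothetical names invented for the illustration and do not exist in the NiFi API.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model only: none of these types are part of the NiFi API.
class ProvenanceRuleSketch {

    // A small subset of event names, mirroring those mentioned in the text.
    enum EventType { CONTENT_MODIFIED, ATTRIBUTES_MODIFIED, SEND, RECEIVE }

    /**
     * Returns the events that the framework would add for a single FlowFile,
     * given what the Processor emitted and what it actually changed.
     */
    static Set<EventType> frameworkEvents(Set<EventType> emittedByProcessor,
                                          boolean contentChanged,
                                          boolean attributesChanged) {
        Set<EventType> added = EnumSet.noneOf(EventType.class);

        // The framework never duplicates an event the Processor already emitted.
        if (contentChanged && !emittedByProcessor.contains(EventType.CONTENT_MODIFIED)) {
            added.add(EventType.CONTENT_MODIFIED);
        }

        // ATTRIBUTES_MODIFIED is added only when no other event exists for the
        // FlowFile, whether emitted by the Processor or by the framework.
        if (attributesChanged && emittedByProcessor.isEmpty() && added.isEmpty()) {
            added.add(EventType.ATTRIBUTES_MODIFIED);
        }
        return added;
    }

    public static void main(String[] args) {
        // The example from the text: content and attributes both change,
        // but the Processor reports only ATTRIBUTES_MODIFIED.
        Set<EventType> added = frameworkEvents(
                EnumSet.of(EventType.ATTRIBUTES_MODIFIED), true, true);
        System.out.println(added); // prints [CONTENT_MODIFIED]
    }
}
```

In this model, a Processor that changes only attributes and emits nothing gets an ATTRIBUTES_MODIFIED event from the framework, while a Processor that has already emitted some other event, such as SEND, gets none.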