Session Rollback
Thus far, when we have discussed the ProcessSession, we have typically referred to it
simply as a mechanism for accessing FlowFiles. However, it provides another very important
capability, which is transactionality. All methods that are called on a ProcessSession
happen as a transaction. When we decide to end the transaction, we can do so either by
calling commit() or by calling rollback(). Typically, this is handled by the
AbstractProcessor class: if the onTrigger method throws an Exception, the
AbstractProcessor will catch the Exception, call session.rollback(), and then re-throw the
Exception. Otherwise, the AbstractProcessor will call commit() on the ProcessSession.
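The commit-or-rollback pattern described above can be sketched in plain Java. This is a
minimal illustration, not the NiFi source: SketchSession, RecordingSession, and
CommitOrRollbackSketch are hypothetical names invented for this example, and the sketch
only handles RuntimeException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two ProcessSession methods discussed above;
// this is NOT the NiFi interface.
interface SketchSession {
    void commit();
    void rollback();
}

// Records which call ended the transaction so the outcome can be inspected.
class RecordingSession implements SketchSession {
    final List<String> calls = new ArrayList<>();
    @Override public void commit() { calls.add("commit"); }
    @Override public void rollback() { calls.add("rollback"); }
}

class CommitOrRollbackSketch {
    // Run the work, commit on success, roll back and re-throw on failure.
    static void trigger(SketchSession session, Runnable work) {
        try {
            work.run();
            session.commit();
        } catch (RuntimeException e) {
            session.rollback();
            throw e; // the caller still sees the original failure
        }
    }

    public static void main(String[] args) {
        RecordingSession ok = new RecordingSession();
        trigger(ok, () -> { /* work succeeds */ });
        System.out.println("success path: " + ok.calls);

        RecordingSession failed = new RecordingSession();
        try {
            trigger(failed, () -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException e) {
            System.out.println("failure path: " + failed.calls);
        }
    }
}
```

Because the wrapper owns the end of the transaction, the work itself never needs to call
commit() or rollback() on the common paths.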
There are times, however, when developers will want to roll back a session explicitly.
This can be accomplished at any time by calling the rollback() or rollback(boolean)
method. If using the latter, the boolean indicates whether or not those FlowFiles that
have been pulled from queues (via the ProcessSession get methods) should be penalized
before being added back to their queues.
When rollback is called, any modifications that have occurred to the FlowFiles in that
session are discarded, including both content modifications and attribute modifications.
Additionally, all Provenance Events are rolled back (with the exception of any SEND event
that was emitted by passing a value of true for the force argument). The FlowFiles that
were pulled from the input queues are then transferred back to the input queues (and
optionally penalized) so that they can be processed again.
On the other hand, when the commit method is called, the FlowFile's new state is
persisted in the FlowFile Repository, and any Provenance Events that occurred are
persisted in the Provenance Repository. The previous content is destroyed (unless another
FlowFile references the same piece of content), and the FlowFiles are transferred to the
outbound queues so that the next Processors can operate on the data.
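The difference between the two outcomes can be illustrated with a small toy model. This
is only a sketch under simplifying assumptions: ToyFlowFile, ToySession, and
RollbackVersusCommitDemo are hypothetical classes, not NiFi types, and the model tracks
only attributes and queue membership (content, repositories, and Provenance Events are
left out).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, heavily simplified FlowFile: an id, attributes, and a penalty flag.
class ToyFlowFile {
    final String id;
    final Map<String, String> attributes = new HashMap<>();
    boolean penalized = false;
    ToyFlowFile(String id) { this.id = id; }
}

// Hypothetical session that buffers changes until commit() or rollback(boolean).
class ToySession {
    private final Deque<ToyFlowFile> input;
    private final List<ToyFlowFile> output;
    private final List<ToyFlowFile> pulled = new ArrayList<>();
    private final Map<ToyFlowFile, Map<String, String>> pending = new HashMap<>();

    ToySession(Deque<ToyFlowFile> input, List<ToyFlowFile> output) {
        this.input = input;
        this.output = output;
    }

    // Pull the next FlowFile from the input queue, or null if the queue is empty.
    ToyFlowFile get() {
        ToyFlowFile ff = input.poll();
        if (ff != null) {
            pulled.add(ff);
            pending.put(ff, new HashMap<>());
        }
        return ff;
    }

    // Buffer an attribute change for a FlowFile obtained from get();
    // nothing is visible on the FlowFile until commit().
    void putAttribute(ToyFlowFile ff, String key, String value) {
        pending.get(ff).put(key, value);
    }

    // Apply buffered changes and hand the FlowFiles to the outbound queue.
    void commit() {
        for (ToyFlowFile ff : pulled) {
            ff.attributes.putAll(pending.get(ff));
            output.add(ff);
        }
        reset();
    }

    // Discard buffered changes and return the FlowFiles to the input queue.
    void rollback(boolean penalize) {
        for (ToyFlowFile ff : pulled) {
            if (penalize) {
                ff.penalized = true;
            }
            input.add(ff);
        }
        reset();
    }

    private void reset() {
        pulled.clear();
        pending.clear();
    }
}

class RollbackVersusCommitDemo {
    public static void main(String[] args) {
        Deque<ToyFlowFile> in = new ArrayDeque<>();
        List<ToyFlowFile> out = new ArrayList<>();
        in.add(new ToyFlowFile("a"));
        ToySession session = new ToySession(in, out);

        ToyFlowFile ff = session.get();
        session.putAttribute(ff, "status", "parsed");
        session.rollback(true); // change is dropped, FlowFile is requeued and penalized
        System.out.println(in.size() + " queued, attributes=" + ff.attributes);

        ff = session.get();
        session.putAttribute(ff, "status", "parsed");
        session.commit(); // change is applied, FlowFile moves downstream
        System.out.println(out.size() + " transferred, attributes=" + ff.attributes);
    }
}
```

In this sketch the same FlowFile is processed twice: the first attempt leaves no trace
other than the penalty flag, and only the committed attempt changes its attributes.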
It is also important to note how this behavior is affected by using the
org.apache.nifi.annotation.behavior.SupportsBatching annotation. If a Processor utilizes
this annotation, calls to ProcessSession.commit may not take effect immediately. Rather,
these commits may be batched together in order to provide higher throughput. However, if
at any point the Processor rolls back the ProcessSession, all changes since the last call
to commit will be discarded and all "batched" commits will take effect. These "batched"
commits are not rolled back.
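The interaction between batched commits and rollback can be modeled roughly as follows.
This is a hypothetical sketch of the behavior described above, not how NiFi implements
batching: BatchingSessionSketch, BatchingDemo, and the flush() method are names invented
for illustration, and changes are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a session whose commits may be deferred.
class BatchingSessionSketch {
    // Changes that have actually taken effect.
    final List<String> persisted = new ArrayList<>();
    // Changes that were committed but are still waiting in the batch.
    private final List<String> batched = new ArrayList<>();
    // Changes made since the last call to commit().
    private final List<String> uncommitted = new ArrayList<>();

    void change(String description) {
        uncommitted.add(description);
    }

    // With batching, commit() only queues the work; it takes effect on a later flush.
    void commit() {
        batched.addAll(uncommitted);
        uncommitted.clear();
    }

    // Rollback drops the uncommitted changes, and the batched commits take effect.
    void rollback() {
        uncommitted.clear();
        flush();
    }

    // Stand-in for the framework deciding that the batch should be persisted.
    void flush() {
        persisted.addAll(batched);
        batched.clear();
    }
}

class BatchingDemo {
    public static void main(String[] args) {
        BatchingSessionSketch session = new BatchingSessionSketch();
        session.change("route A");
        session.commit();
        System.out.println("after commit: " + session.persisted);

        session.change("route B");
        session.rollback();
        System.out.println("after rollback: " + session.persisted);
    }
}
```

In this model nothing is persisted right after the first commit, while the rollback
persists "route A" and drops "route B", which matches the rule that batched commits are
not rolled back.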