Session Rollback
Thus far, when we have discussed the ProcessSession, we have typically referred to it simply
as a mechanism for accessing FlowFiles. However, it provides another very important
capability: transactionality. All methods that are called on a ProcessSession happen as part
of a transaction. When we decide to end the transaction, we can do so either by calling
commit() or by calling rollback(). Typically, this is handled by the AbstractProcessor class:
if the onTrigger method throws an Exception, the AbstractProcessor will catch the Exception,
call session.rollback(), and then re-throw the Exception. Otherwise, the AbstractProcessor
will call commit() on the ProcessSession.
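The catch, roll back, and re-throw pattern described above can be sketched in plain Java. This
is a minimal illustration, not the AbstractProcessor source: RecordingSession, SketchProcessor,
and trigger are hypothetical names invented for this sketch, and the session only records
which call ended the transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a session; it only records how the transaction ended.
class RecordingSession {
    final List<String> calls = new ArrayList<>();

    void commit() { calls.add("commit"); }

    void rollback() { calls.add("rollback"); }
}

// Hypothetical base class that mirrors the behavior described in the text.
abstract class SketchProcessor {
    // Subclasses do their work here, as they would in onTrigger.
    abstract void onTrigger(RecordingSession session);

    // Framework-style entry point: roll back and re-throw on failure, commit otherwise.
    final void trigger(RecordingSession session) {
        try {
            onTrigger(session);
        } catch (RuntimeException e) {
            session.rollback();
            throw e;
        }
        session.commit();
    }
}
```

Because the base class ends the transaction on every path, a subclass in this sketch never
needs to call commit() or rollback() itself unless it wants to end the transaction early.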
There are times, however, when developers will want to roll back a session explicitly.
This can be accomplished at any time by calling the rollback() or rollback(boolean) method.
If using the latter, the boolean indicates whether or not those FlowFiles that have been
pulled from queues (via the ProcessSession get methods) should be penalized before being
added back to their queues.
When rollback is called, any modifications that have occurred to the FlowFiles in that
session are discarded, including both content modifications and attribute modifications.
Additionally, all Provenance Events are rolled back (with the exception of any SEND event
that was emitted by passing a value of true for the force argument). The FlowFiles that were
pulled from the input queues are then transferred back to the input queues (and optionally
penalized) so that they can be processed again.
On the other hand, when the commit method is called, the FlowFile's new state is persisted
in the FlowFile Repository, and any Provenance Events that occurred are persisted in the
Provenance Repository. The previous content is destroyed (unless another FlowFile references
the same piece of content), and the FlowFiles are transferred to the outbound queues so that
the next Processors can operate on the data.
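The difference between the two outcomes can be modeled with a small, self-contained sketch.
It is a toy model under stated assumptions, not the framework's implementation: ToyFlowFile
and ToySession are hypothetical classes, only attribute changes are modeled (no content and
no Provenance Events), and FlowFiles that are rolled back are simply appended to the input
queue.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy FlowFile: an id, attributes, and a penalized flag. Not the NiFi FlowFile interface.
class ToyFlowFile {
    final String id;
    final Map<String, String> attributes = new HashMap<>();
    boolean penalized;

    ToyFlowFile(String id) { this.id = id; }
}

// Toy session: changes are staged and only reach the FlowFile or the queues
// when the transaction ends.
class ToySession {
    private final Deque<ToyFlowFile> input;
    private final Deque<ToyFlowFile> output;
    private final List<ToyFlowFile> pulled = new ArrayList<>();
    private final Map<ToyFlowFile, Map<String, String>> staged = new HashMap<>();

    ToySession(Deque<ToyFlowFile> input, Deque<ToyFlowFile> output) {
        this.input = input;
        this.output = output;
    }

    // Pull the next FlowFile from the input queue, or return null if it is empty.
    ToyFlowFile get() {
        ToyFlowFile ff = input.poll();
        if (ff != null) {
            pulled.add(ff);
            staged.put(ff, new HashMap<>());
        }
        return ff;
    }

    // Stage an attribute change; ff must have been obtained from get() on this session.
    void putAttribute(ToyFlowFile ff, String key, String value) {
        staged.get(ff).put(key, value);
    }

    // Apply the staged changes and hand the FlowFiles to the outbound queue.
    void commit() {
        for (ToyFlowFile ff : pulled) {
            ff.attributes.putAll(staged.get(ff));
            output.add(ff);
        }
        reset();
    }

    // Discard the staged changes and return the FlowFiles to the input queue,
    // optionally marking them as penalized.
    void rollback(boolean penalize) {
        for (ToyFlowFile ff : pulled) {
            if (penalize) {
                ff.penalized = true;
            }
            input.add(ff);
        }
        reset();
    }

    private void reset() {
        pulled.clear();
        staged.clear();
    }
}
```

In this model, a FlowFile that is pulled, modified, and rolled back reappears on the input
queue with its original attributes, while the same sequence ending in commit() moves it to
the output queue with the new attribute applied.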
It is also important to note how this behavior is affected by using the
org.apache.nifi.annotation.behavior.SupportsBatching annotation. If a Processor utilizes this
annotation, calls to ProcessSession.commit may not take effect immediately. Rather, these
commits may be batched together in order to provide higher throughput. However, if at any
point the Processor rolls back the ProcessSession, all changes since the last call to commit
will be discarded and all "batched" commits will take effect. These "batched" commits are
not rolled back.
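The interaction between batched commits and rollback can be illustrated with another toy
model. BatchingSketchSession and its methods are hypothetical and exist only for this sketch;
when the framework actually applies batched commits is not modeled here, apart from the rule
stated above that a rollback causes them to take effect.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a session whose commits are batched. Not a NiFi class.
class BatchingSketchSession {
    // Changes that have actually taken effect.
    final List<String> persisted = new ArrayList<>();
    // Changes the Processor has committed but that have not yet taken effect.
    private final List<String> batched = new ArrayList<>();
    // Changes made since the last call to commit.
    private final List<String> uncommitted = new ArrayList<>();

    void change(String description) {
        uncommitted.add(description);
    }

    // With batching, commit only queues the changes; they take effect on a later flush.
    void commit() {
        batched.addAll(uncommitted);
        uncommitted.clear();
    }

    // Rollback discards the changes since the last commit, then applies the batched commits.
    void rollback() {
        uncommitted.clear();
        flush();
    }

    // Make every batched commit take effect.
    void flush() {
        persisted.addAll(batched);
        batched.clear();
    }
}
```

For example, after change("A"), commit(), change("B"), and rollback(), the persisted list in
this model contains only "A": the batched commit took effect, and the uncommitted change was
discarded.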