Encrypted Provenance Repository
While OS-level access control can offer some protection for the provenance data written to disk in a repository, there are scenarios where the data is sensitive, compliance or regulatory requirements apply, or NiFi is running on hardware not under the direct control of the organization (cloud, etc.). In these cases, the provenance repository allows all data to be encrypted before it is persisted to disk.
The current implementation of the encrypted provenance repository intercepts the record writer and reader of WriteAheadProvenanceRepository, which offers significant performance improvements over the legacy PersistentProvenanceRepository, and uses the AES/GCM algorithm, which is fairly performant on commodity hardware. In most scenarios, the added cost will not be significant (unnoticeable on a flow with hundreds of provenance events per second, moderately noticeable on a flow with thousands to tens of thousands of events per second). However, administrators should perform their own risk assessment and performance analysis and decide how to move forward. Switching back and forth between encrypted and unencrypted implementations is not recommended at this time.
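
To make the AES/GCM step described above more concrete, the following is a minimal sketch of encrypting and decrypting a serialized record with the standard Java `javax.crypto` API. It is not NiFi's actual implementation: the class name `RecordEncryptionSketch`, the method names, the in-memory key, and the "IV followed by ciphertext" layout are all assumptions chosen for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical illustration only; not the NiFi encrypted repository code.
final class RecordEncryptionSketch {
    private static final int IV_LENGTH_BYTES = 12;   // commonly used IV size for GCM
    private static final int TAG_LENGTH_BITS = 128;  // authentication tag size
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a throwaway AES key; a real deployment would load a managed key instead.
    static SecretKey generateKey() throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    // Encrypts one serialized record and returns IV || ciphertext+tag (assumed layout).
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[IV_LENGTH_BYTES];
        RANDOM.nextBytes(iv); // a fresh IV per record; GCM IVs must never repeat for a key
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_LENGTH_BITS, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] stored = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, stored, 0, iv.length);
        System.arraycopy(ciphertext, 0, stored, iv.length, ciphertext.length);
        return stored;
    }

    // Splits off the IV, then decrypts; throws AEADBadTagException if the data was altered.
    static byte[] decrypt(SecretKey key, byte[] stored) throws Exception {
        byte[] iv = Arrays.copyOfRange(stored, 0, IV_LENGTH_BYTES);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_LENGTH_BITS, iv));
        return cipher.doFinal(stored, IV_LENGTH_BYTES, stored.length - IV_LENGTH_BYTES);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = generateKey();
        byte[] record = "example provenance event".getBytes(StandardCharsets.UTF_8);
        byte[] stored = encrypt(key, record);
        byte[] recovered = decrypt(key, stored);
        System.out.println(Arrays.equals(record, recovered));
    }
}
```

Because GCM is an authenticated mode, each stored record in this sketch grows by the IV and tag (28 bytes with the sizes assumed here), and tampering with the stored bytes causes decryption to fail rather than return corrupted data. Timing a loop around `encrypt` with record sizes typical of a given flow is one rough way to start the performance analysis mentioned above.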