Trace Data Lifecycle

Learn about using data provenance in Cloudera Flow Management.

Data provenance lets you trace the life of a piece of data: its origin, its transformations, and its destination. You can use this information to troubleshoot dataflows and to evaluate their compliance and optimization in real time.

NiFi keeps a very granular level of detail about each piece of data that it ingests. As data objects are processed through the system, NiFi records and indexes data provenance details and stores them in NiFi’s Provenance Repository. By default, NiFi updates this information every five minutes, but this interval is configurable.
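As a rough illustration of what such records make possible, the following Python sketch models provenance events as plain records and reconstructs the path of one data object. It is not NiFi code and does not call any NiFi API: the ProvenanceEvent class, the lineage and summarize helpers, the component names, and the identifiers are all hypothetical. Only the event type names (RECEIVE, CONTENT_MODIFIED, SEND) mirror event types that NiFi reports.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ProvenanceEvent:
    """Hypothetical, simplified stand-in for one provenance record."""
    event_id: int        # assumed to increase in the order events occurred
    event_type: str      # e.g. RECEIVE, CONTENT_MODIFIED, SEND
    flowfile_uuid: str   # identifies the data object the event belongs to
    component: str       # hypothetical name of the component that emitted it


def lineage(events: Iterable[ProvenanceEvent], flowfile_uuid: str) -> List[ProvenanceEvent]:
    """Return the events for one data object, ordered by event_id."""
    matching = [e for e in events if e.flowfile_uuid == flowfile_uuid]
    return sorted(matching, key=lambda e: e.event_id)


def summarize(events: Iterable[ProvenanceEvent]) -> str:
    """Render an ordered list of events as a one-line path."""
    return " -> ".join(f"{e.event_type}@{e.component}" for e in events)


# Illustrative records only, deliberately stored out of order.
EVENTS = [
    ProvenanceEvent(3, "SEND", "ff-1", "deliver"),
    ProvenanceEvent(1, "RECEIVE", "ff-1", "ingest"),
    ProvenanceEvent(2, "CONTENT_MODIFIED", "ff-1", "transform"),
    ProvenanceEvent(4, "RECEIVE", "ff-2", "ingest"),
]

if __name__ == "__main__":
    print(summarize(lineage(EVENTS, "ff-1")))
```

Filtering by the data object's identifier and ordering by event gives the origin (first event), the transformations (middle events), and the destination (last event) in a single pass, which is the same question the Data Provenance view answers interactively.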

To search and view this information, select Data Provenance from the Global Menu. From there you can see the most recent data provenance information, search it for specific items, and filter the search results. You can also open additional dialog windows to view event details, replay data at any point within the dataflow, and see a graphical representation of the data’s lineage, that is, its path through the flow.

For more information about data provenance, see the Apache NiFi User Guide.