What is Apache NiFi?

At its core, NiFi automates the transfer of data between systems. In this documentation, the term ‘data flow’ means the automated and controlled flow of information between systems. Managing such flows has been a persistent challenge in enterprise operations ever since organizations began running complex system landscapes, in which some systems generate data while others consume it.

NiFi components

NiFi provides a set of components to build data flows.

Processors

Processors represent the primary building blocks of data flows, designed to efficiently carry out specific tasks, such as data extraction, routing, transformation, filtering, and ingestion.
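The idea of chaining single-purpose steps can be illustrated with a small, purely conceptual Python sketch. It does not use any NiFi API: the `DataItem` class and the three functions are hypothetical stand-ins for the filtering, transformation, and routing tasks mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataItem:
    """Hypothetical unit of data moving through the flow (not a NiFi class)."""
    content: str
    attributes: Dict[str, str] = field(default_factory=dict)


def filter_blank(items: List[DataItem]) -> List[DataItem]:
    """Filtering step: drop items whose content is empty or whitespace."""
    return [item for item in items if item.content.strip()]


def to_upper(items: List[DataItem]) -> List[DataItem]:
    """Transformation step: upper-case the content, keep the attributes."""
    return [DataItem(item.content.upper(), dict(item.attributes)) for item in items]


def route_by_severity(items: List[DataItem]) -> Dict[str, List[DataItem]]:
    """Routing step: send items to 'error' or 'other' based on their prefix."""
    routes: Dict[str, List[DataItem]] = {"error": [], "other": []}
    for item in items:
        key = "error" if item.content.startswith("ERROR") else "other"
        routes[key].append(item)
    return routes


# Chain the steps in order, as a data flow would.
incoming = [DataItem("error: disk full"), DataItem("   "), DataItem("ok")]
routed = route_by_severity(to_upper(filter_blank(incoming)))
```

Each function does one job and hands its output to the next, which is the property that lets such steps be rearranged or reused in other flows.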

Controller services

Controller services are shared by multiple processors within a data flow, so common settings are configured once rather than in every processor. For example, a controller service can manage JDBC connections to a database on behalf of all the processors that use it.
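The sharing pattern can be sketched conceptually in Python. This is an illustration under stated assumptions, not NiFi code: `DbConnectionService`, `InsertStep`, and `QueryStep` are hypothetical names, and an in-memory SQLite database stands in for a JDBC connection.

```python
import sqlite3
from typing import List


class DbConnectionService:
    """Hypothetical shared service: connection settings live in one place."""

    def __init__(self, database: str) -> None:
        self._conn = sqlite3.connect(database)

    def connection(self) -> sqlite3.Connection:
        return self._conn


class InsertStep:
    """Hypothetical processor that writes rows using the shared service."""

    def __init__(self, service: DbConnectionService) -> None:
        self.service = service

    def run(self, messages: List[str]) -> None:
        conn = self.service.connection()
        conn.execute("CREATE TABLE IF NOT EXISTS events (msg TEXT)")
        conn.executemany("INSERT INTO events (msg) VALUES (?)", [(m,) for m in messages])
        conn.commit()


class QueryStep:
    """Hypothetical processor that reads rows using the same service."""

    def __init__(self, service: DbConnectionService) -> None:
        self.service = service

    def run(self) -> List[str]:
        cursor = self.service.connection().execute("SELECT msg FROM events ORDER BY rowid")
        return [row[0] for row in cursor]


# Both steps reference one service; neither holds its own connection settings.
shared = DbConnectionService(":memory:")
InsertStep(shared).run(["first", "second"])
stored = QueryStep(shared).run()
```

Changing the database location here means editing one constructor argument, not every step that touches the database.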

Parameter providers

Parameter providers enable NiFi to retrieve parameter values from external sources, streamlining parameter management. Parameters are grouped in parameter contexts, and they make flow configurations reusable across multiple environments, which makes it easier to create generic and portable data flows.
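How one flow configuration can serve several environments is sketched below in Python. NiFi references parameters with the `#{name}` syntax; everything else here is an assumption made for illustration, including the `resolve` and `apply_context` helpers, the property names, and the example values. The two dictionaries play the role of parameter contexts that a parameter provider might populate from an external source.

```python
import re
from typing import Dict

# Matches parameter references of the form #{name}.
PARAM_REF = re.compile(r"#\{([^}]+)\}")


def resolve(value: str, context: Dict[str, str]) -> str:
    """Replace every #{name} reference with its value from the context."""

    def lookup(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in context:
            raise KeyError(f"parameter not defined: {name}")
        return context[name]

    return PARAM_REF.sub(lookup, value)


def apply_context(config: Dict[str, str], context: Dict[str, str]) -> Dict[str, str]:
    """Resolve all property values of a flow configuration against one context."""
    return {key: resolve(value, context) for key, value in config.items()}


# One flow configuration (hypothetical property names)...
flow_config = {
    "Database URL": "jdbc:postgresql://#{db.host}:#{db.port}/orders",
    "Table": "#{db.table}",
}

# ...and one set of parameter values per environment (hypothetical values).
dev_context = {"db.host": "localhost", "db.port": "5432", "db.table": "orders_dev"}
prod_context = {"db.host": "db.example.internal", "db.port": "5432", "db.table": "orders"}

dev_config = apply_context(flow_config, dev_context)
prod_config = apply_context(flow_config, prod_context)
```

The flow definition itself never changes between environments; only the context supplied to it does.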

Reporting tasks

Reporting tasks assist in monitoring NiFi and the deployed data flows.

Cloudera Flow Management & Cloudera DataFlow or Apache NiFi?

Both Cloudera Flow Management (CFM) and Cloudera DataFlow for Public Cloud (CDF-PC) offer a comprehensive suite of additional capabilities built on top of Apache NiFi. Cloudera also provides an extensive set of NiFi components that are not included in the standard Apache NiFi distribution. This collection contains components drawn from the Apache NiFi project as well as Cloudera-exclusive components designed to cater for a wide range of industry-specific use cases. This website documents all components delivered as part of Cloudera Flow Management and Cloudera DataFlow.