Best Practices for Developing Flow Definitions

Before you can deploy a flow definition in Cloudera DataFlow (CDF), you need to develop your data flow logic in a development environment using Apache NiFi. To make sure that your NiFi data flow can be deployed in CDF, make sure to follow the best practices outlined in this section.

What is a flow definition?

A flow definition represents data flow logic which was developed in Apache NiFi and exported by using the Download Flow Definition action on a NiFi process group or the root canvas. Flow definitions typically leverage parameterization to make them portable between different environments such as development or production NiFi environments.

To run one of your existing NiFi data flows in CDF you have to export it as a flow definition and upload it to the CDF catalog.

What is flow isolation?

CDF is the first cloud service allowing NiFi users to easily isolate data flows from each other and guarantee a set of resources to each one without requiring administrators to create additional NiFi clusters.

Flow isolation describes the ability to treat NiFi process groups which typically run on a shared cluster on shared resources as independent, deployable artifacts which can be exported as flow definitions from NiFi.

Flow isolation is useful when

  • You want to guarantee a set of resources for a specific dataflow.

  • You want to isolate failure domains.