Developing a flow for DataFlow Functions

A flow definition represents the data flow logic that you can download from NiFi, import into the DataFlow Catalog, and run in serverless mode. You can develop this data flow for your function in any development environment using Apache NiFi and then deploy the function on a Function as a Service (FaaS) solution on AWS, Azure, or the Google Cloud Platform (GCP).

You have two main options for developing your flow:
  • Use Cloudera Data Platform (CDP) Public Cloud Data Hub with the Flow Management template, if you are a CDP customer who has a CDP Data Lake.

    For more information on how to set up a managed and secured Flow Management cluster in CDP Public Cloud, see Creating your first Flow Management cluster.

  • Develop the data flow in your local development environment using open source Apache NiFi.

Once you have developed and tested your NiFi flow, you can deploy it as a DataFlow function in serverless mode using one of three cloud providers' function services: AWS Lambda, Google Cloud Functions, or Azure Functions.

To make sure that your NiFi data flow can be deployed as a function in Cloudera DataFlow (CDF), review your traditional NiFi flow development process and follow the best practices outlined in the next sections.