Developing a NiFi flow for DataFlow Functions

A flow definition represents the data flow logic that you can download from NiFi, import to the DataFlow Catalog and run in serverless mode. You can develop this data flow for your function in any development environment using Apache NiFi and then deploy the function on a Function as a Service (FAAS) solution on AWS, Azure, or the Google Cloud Platform (GCP).

You have three main options for developing your flow:
  • Use CDF Flow Designer and publish your draft as a flow definition in the Catalog. For more information on using the Flow Designer, see Creating a new draft. In this case you can skip the downloading flow and importing flow to Catalog steps as you have already added your flow to the CDF Catalog.
  • Use Cloudera Data Platform (CDP) Public Cloud Data Hub with the Flow Management template, if you are a CDP customer who has a CDP Data Lake.

    For more information on how to set up a managed and secured Flow Management cluster in CDP Public Cloud, see Setting up your Flow Management cluster.

  • Develop the data flow in your local development environment using open source Apache NiFi.

Once you have developed and tested your NiFi flow, you can deploy it as a DataFlow function in serverless mode using one of the three cloud providers function services: AWS Lambda, Google Cloud Functions, and Azure Functions.

To make sure that your NiFi data flow can be deployed as a function in CDF, review your traditional NiFi flow development process and follow the best practices outlined in the next sections.