Downloading and viewing the predefined Stateless NiFi dataflows shipped in CDP

Learn how to download and view the predefined dataflows used by the Stateless NiFi-based connectors shipped in CDP. You download the flow definition files from your CDP cluster hosts and then import them into NiFi to examine each dataflow's structure.

A number of connectors shipped for use with CDP are based on Stateless NiFi and run predefined dataflows (dataflows developed by Cloudera). For example, the JDBC Source connector is a Stateless NiFi-based connector that runs a predefined dataflow.

To better understand how Stateless NiFi-based connectors like JDBC Source work, and how they move and manipulate Kafka data, it can be useful to view the dataflow that the connectors use.

These dataflows are installed on every cluster host that is running a Kafka Connect service role. You can fetch the dataflows from your cluster hosts, load them into NiFi, and look at their flow structure.

The flow definition JSON files of these dataflows are present on the cluster hosts at /etc/kafka_connect_ext/flow-definitions. However, these flow definition files cannot be loaded into NiFi as they are. They must be edited first.
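At a high level, each downloaded file wraps the actual dataflow in additional metadata: the steps that follow extract the "flow" element and update its nested "version" elements. A simplified, hypothetical sketch of the layout (all names other than "flow" and "version" are illustrative, and the ellipses stand for content omitted here):

```
{
  "flow": {
    "flowContents": {
      "name": "...",
      "version": "...",
      "processGroups": [ ... ]
    }
  },
  "otherMetadata": { ... }
}
```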

  • You have access to a running CDP cluster that has a Kafka service and Kafka Connect service roles deployed.
  • You have access to a running instance of NiFi. The NiFi instance does not need to run on a cluster. A standalone instance running on any machine, like your own computer, is sufficient.
  1. Fetch the flow definition of the chosen connector from the cluster.
    The flow definition files are located on the cluster hosts at /etc/kafka_connect_ext/flow-definitions. How you complete this step depends on your cluster environment and what utilities are available to you. For example, if you are using scp, you can run the following command.
    scp [***USER***]@[***CLUSTER HOST***]:/etc/kafka_connect_ext/flow-definitions/[***CONNECTOR NAME***]-[***VERSION***].json .
  2. Open the .json file in a text editor.
  3. Copy the contents of the "flow" JSON element to a separate .json file.
  4. In the .json file you created, replace the value of each "version" element with the version of NiFi that you will use to view the dataflow.
  5. In the NiFi UI, drag a new process group to the canvas.
  6. In the Add Process Group modal, click the browse icon and select your .json file.
  7. Click Add.
The dataflow is imported to the process group with all of its parameters.

After the dataflow is imported, you can view the structure of the flow to gain a better understanding of how the dataflow operates. You can also modify the dataflow and redeploy the modified version as a custom connector.