Upload and run Python scripts in flow deployments

If your data flow needs to execute a Python script, you must upload the script when you create the flow deployment through the Deployment Wizard or the CLI. Follow these steps to configure your NiFi processors and upload your Python script.

  1. Create your Python script and save it as a file.
    For example:
    print("Hello, World!")
  2. Open the flow definition which requires a Python script in NiFi.
  3. Add and configure an ExecuteStreamCommand processor to run your script.
    Set the following properties:
    Command Arguments: #{Script} (a reference to a parameter named Script)
    Command Path: python
    Leave all other properties at their default values.
  4. Add and configure a processor that allows uploading file-based resources as part of its deployment in CDF-PC.

    This procedure uses FetchHDFS as an example. Configure this processor as follows:

    Hadoop Configuration Resources: #{Script}

    Leave all other properties at their default values.

  5. Download the data flow as a flow definition from NiFi and import it into Cloudera DataFlow.
  6. Initiate a flow deployment from the Catalog. In the Parameters step of the Deployment Wizard, upload your Python script to the Script parameter. Complete the Wizard and submit your deployment request.
Your Python script is uploaded to the flow deployment and executed as part of the data flow.
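
The script in step 1 only prints a fixed string. As a minimal sketch of a script that works on the data passing through the flow, the following example assumes that ExecuteStreamCommand keeps its default settings, so the content of the incoming FlowFile is streamed to the script's standard input and whatever the script writes to standard output becomes the content of the outgoing FlowFile. The transform function and its upper-casing logic are hypothetical placeholders for your own processing.

```python
import sys


def transform(text: str) -> str:
    """Hypothetical placeholder: upper-case the incoming content."""
    return text.upper()


def main() -> None:
    # With default settings, ExecuteStreamCommand streams the FlowFile
    # content to the command's standard input.
    incoming = sys.stdin.read()
    # Anything written to standard output becomes the content of the
    # FlowFile that the processor emits.
    sys.stdout.write(transform(incoming))


if __name__ == "__main__":
    main()
```

Before uploading, you can check the script locally by piping sample input to it, for example echo "hello" | python my_script.py, where my_script.py is an illustrative file name.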