If your data flow needs to run a Python script, you must upload the script
when you create the flow deployment, either through the Deployment Wizard or the CLI.
Follow these steps to configure your NiFi processors and upload your Python
script.
Create your Python script and save it as a file.
For example:
#!/usr/bin/python3
print("Hello, World!")
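The example above ignores its input. With its default settings, ExecuteStreamCommand streams the content of the incoming FlowFile to the command's standard input and uses the command's standard output as the content of the outgoing FlowFile. A script that processes FlowFile content might therefore look like the following minimal sketch. The `transform` function and its upper-casing logic are hypothetical placeholders, not something this procedure requires.

```python
#!/usr/bin/python3
import sys


def transform(text: str) -> str:
    """Hypothetical transformation: upper-case the FlowFile content."""
    return text.upper()


def main() -> None:
    # Read the incoming FlowFile content from standard input.
    data = sys.stdin.read()
    # Whatever is written to standard output becomes the new content.
    sys.stdout.write(transform(data))


if __name__ == "__main__":
    main()
```

Keeping the logic in a function separate from the stdin/stdout handling makes the script easier to test outside NiFi.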
Open the flow definition which requires a Python script in NiFi.
Add and configure an ExecuteStreamCommand processor to run
your script.
Make the following property settings:
Command Arguments: #{Script}
Command Path: python
Leave all other properties with their default values.
Add and configure a processor that allows uploading file-based resources as
part of its deployment in CDF-PC.
This procedure uses FetchHDFS as an example. Configure
the processor as follows:
Hadoop Configuration Resources: #{Script}
Leave all other properties with their default values.
Download the data flow as a flow definition from NiFi and import it to
Cloudera DataFlow.
Initiate a flow deployment from the Catalog. In the Parameters step of the
Deployment Wizard, upload your Python script to the Script parameter. Complete
the Wizard and submit your deployment request.
Your Python script is uploaded to the flow deployment and
executed as part of the data flow.
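Before you deploy, it can help to check locally that the script runs when it is invoked the way the ExecuteStreamCommand settings above imply: a Python interpreter as the command, the script path as its only argument, and the FlowFile content on standard input. The following is a rough sketch under those assumptions. `run_like_processor` and `demo` are hypothetical helpers for local testing only. They do not use NiFi or CDF-PC, and they assume that #{Script} resolves to the path of the script at run time.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_like_processor(script_path: str, flowfile_content: bytes) -> bytes:
    """Approximate `python <script>` with FlowFile content on stdin.

    Hypothetical local helper; it only mimics the command line that the
    Command Path and Command Arguments settings describe.
    """
    result = subprocess.run(
        [sys.executable, script_path],  # stands in for "python" + #{Script}
        input=flowfile_content,
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


def demo() -> bytes:
    # Write the example script from this procedure to a temporary file
    # and run it the same way.
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "hello.py"
        script.write_text('print("Hello, World!")\n')
        return run_like_processor(str(script), b"sample FlowFile content")
```

Calling `demo()` returns the bytes that the example script writes to standard output. A script that fails locally under this kind of invocation is likely to fail in the deployment as well.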