Configuring the Python environment for SQL Stream Builder

Starting with Cloudera Streaming Analytics 1.16.0, Python User-Defined Functions (UDFs) are enabled by default. During the upgrade, Cloudera Manager and the related Cloudera Streaming Analytics scripts automatically configure the Python environment and install the required Python modules.

However, manual configuration is required in the following scenario:

  • Custom Python Environments: If you have a custom-configured Python environment, Cloudera Manager does not overwrite your modified configuration parameters during the upgrade. You must manually update the Python paths to point to the new environment.

Manual configuration steps

If your environment meets the criteria above, perform the following steps to ensure the Python runtime is correctly configured for Cloudera SQL Stream Builder.

1. Reset Python parameters in SSB and add Flink Gateway roles

To ensure Cloudera SQL Stream Builder (SSB) uses the correct Python version (Python 3.11), reset the executable paths in Cloudera Manager, then add the Flink Gateway role to the hosts that run PyFlink jobs.

  1. Log in to the Cloudera Manager Admin Console.
  2. Navigate to Clusters > SQL Stream Builder.
  3. Click the Configuration tab.
  4. Search for the following parameters and reset them to the default value of /usr/bin/python3.11:
    • Python Client Executable (ssb.python.client.executable)
    • Python Executable (ssb.python.executable)
  5. Click Save Changes.
  6. In Cloudera Manager, navigate to the Flink service.
  7. Click the Instances tab.
  8. Click Add Role Instances.
  9. Select the Flink Gateway role.
  10. Assign the Flink Gateway role to the Worker hostgroup (or all hosts that will run PyFlink jobs).
  11. Click Continue and follow the prompts to add the roles.
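
Before moving on, it can help to confirm that the path set in step 4 really points to a Python 3.11 interpreter on each host. The following is a minimal sketch, not part of the Cloudera tooling: the helper names python_version and is_expected_python are hypothetical, and the script assumes it is run locally on the host being checked.

```python
import re
import subprocess
from typing import Tuple


def python_version(executable: str) -> Tuple[int, int]:
    """Return the (major, minor) version reported by `<executable> --version`."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some interpreters print the version to stderr, so inspect both streams.
    output = result.stdout + result.stderr
    match = re.search(r"Python (\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unexpected version output: {output!r}")
    return int(match.group(1)), int(match.group(2))


def is_expected_python(executable: str, expected: Tuple[int, int] = (3, 11)) -> bool:
    """Return True if the executable runs and reports the expected version."""
    try:
        return python_version(executable) == expected
    except (OSError, subprocess.CalledProcessError, ValueError):
        # Missing file, non-zero exit code, or unparsable output.
        return False
```

For example, is_expected_python("/usr/bin/python3.11") returns False when the file is missing or reports a different version, which suggests that the host still needs attention before the client configuration is deployed.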

2. Deploy Client Configuration

After updating the roles, you must deploy the client configuration to trigger the installation of the PyFlink Python module.

  1. Navigate back to the Flink service in Cloudera Manager.
  2. Click the Actions button and select Deploy Client Configuration.
  3. Confirm the action and wait for the command to complete.
    This ensures the pyflink module is installed across all relevant hosts.
  4. Restart the SQL Stream Builder service to apply the changes, whether or not Cloudera Manager prompts you to do so.
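
After the deployment completes, one way to check that the pyflink module is visible to the configured interpreter is to look it up without importing it. This is a minimal sketch under stated assumptions: the helper name module_available is hypothetical, and the check only reflects the interpreter that runs it, so it should be run with the executable configured in SSB on each relevant host.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located by this interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for broken or partially initialized module entries.
        return False


def report(name: str = "pyflink") -> str:
    """Build a one-line status message for the given module name."""
    status = "found" if module_available(name) else "NOT found"
    return f"{name}: {status}"
```

Saved to a file (for example a hypothetical check_pyflink.py that ends with print(report())), the check could be run as /usr/bin/python3.11 check_pyflink.py on each host that received the Flink Gateway role.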