Creating a Python virtual environment resource

After you have created the requirements.txt file, you can create the Python virtual environment resource.

Before you begin

  • Download and configure the CDE CLI.
  • Create a requirements.txt file specifying the Python package and version dependencies required by your CDE job.
  • Ensure that the following hostnames are reachable from within the cluster. If no PyPI mirror is configured, the Python packages are installed from these hosts:
    • pypi.python.org
    • pypi.org
    • pythonhosted.org
    • files.pythonhosted.org
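
One way to check the hostname prerequisite is a small connectivity probe. The following is a minimal sketch, not part of CDE: the function name unreachable_hosts is hypothetical, the probe assumes the hosts are reached over HTTPS on port 443, and it is only meaningful when run from a machine or pod inside the cluster network.

```python
import socket

# Hostnames listed in the prerequisites above.
PYPI_HOSTS = [
    "pypi.python.org",
    "pypi.org",
    "pythonhosted.org",
    "files.pythonhosted.org",
]


def unreachable_hosts(hosts, port=443, timeout=5.0, connect=socket.create_connection):
    """Return the hosts for which a TCP connection could not be opened.

    Port 443 is an assumption (HTTPS). The `connect` parameter exists only
    so that the probe can be replaced with a fake in tests.
    """
    failed = []
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            failed.append(host)
        else:
            conn.close()
    return failed


# Example usage (performs real network calls):
#   print(unreachable_hosts(PYPI_HOSTS))
```

An empty list means that every host accepted a TCP connection. It does not prove that a proxy or firewall will allow the actual package downloads.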

Steps

  1. Run the cde resource create command as follows to create a Python virtual environment resource.
    cde resource create --name cde-python-env-resource --type python-env --python-version python3
  2. Upload the requirements.txt file to the resource.
    cde resource upload --name cde-python-env-resource --local-path ${HOME}/requirements.txt
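
The two commands above can also be driven from a script. The sketch below assumes the cde CLI is installed, configured, and on the PATH. The function names build_commands and create_python_env are hypothetical helpers, and only the flags shown in the steps above are used.

```python
import subprocess


def build_commands(name, requirements_path, python_version="python3"):
    """Return the create and upload invocations as argument lists."""
    create = [
        "cde", "resource", "create",
        "--name", name,
        "--type", "python-env",
        "--python-version", python_version,
    ]
    upload = [
        "cde", "resource", "upload",
        "--name", name,
        "--local-path", requirements_path,
    ]
    return [create, upload]


def create_python_env(name, requirements_path, run=subprocess.run):
    """Run both commands in order, stopping at the first failure."""
    for command in build_commands(name, requirements_path):
        # check=True raises CalledProcessError on a non-zero exit code.
        run(command, check=True)


# Example usage (calls the real CLI):
#   create_python_env("cde-python-env-resource", "/home/user/requirements.txt")
```

Passing argument lists instead of a single shell string avoids quoting problems when the resource name or file path contains spaces.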

Result

When you first create a Python virtual environment resource, CDE builds the environment according to the requirements.txt file. While the environment is building, you cannot run a job associated with it. To check the status of the environment, run cde resource list-events --name <resource_name>. For example:

cde resource list-events --name cde-python-env-resource

The environment is ready when you see a message similar to the following:

  {
    "id": 4,
    "message": "Job pp-84kgdgf6-resource-builder-cde-python-env-resource-1634911572 succeeded, marking resource with ready status",
    "created": "2021-10-22T14:09:13Z"
  }
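
Because jobs cannot run until the build finishes, a script may want to wait for the ready event. The following is a rough sketch under stated assumptions: it treats the phrase "marking resource with ready status" from the sample event as the readiness marker, although the real message is only described as similar to the sample, and it matches on the raw command output without assuming its exact structure. The names fetch_events and wait_until_ready, and the default polling values, are hypothetical.

```python
import subprocess
import time

# Assumption: taken from the sample event above; the real wording may differ.
READY_MARKER = "marking resource with ready status"


def fetch_events(resource_name):
    """Return the raw output of `cde resource list-events` for the resource."""
    result = subprocess.run(
        ["cde", "resource", "list-events", "--name", resource_name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def wait_until_ready(resource_name, fetch=fetch_events, attempts=30,
                     interval=20.0, sleep=time.sleep):
    """Poll the event list until the ready marker appears.

    Returns True if the marker was seen, or False after `attempts` polls.
    """
    for attempt in range(attempts):
        if READY_MARKER in fetch(resource_name):
            return True
        if attempt < attempts - 1:
            sleep(interval)  # wait before the next poll
    return False


# Example usage (calls the real CLI):
#   wait_until_ready("cde-python-env-resource")
```

The sketch does not detect a failed build. In that case it polls until the attempts run out and returns False.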

Before you begin

  • Create a requirements.txt file specifying the Python package and version dependencies required by your CDE job.
  • Ensure that the following hostnames are reachable from within the cluster. If no PyPI mirror is configured, the Python packages are installed from these hosts:
    • pypi.python.org
    • pypi.org
    • pythonhosted.org
    • files.pythonhosted.org
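
Since the requirements.txt file is expected to specify package versions, a quick pre-upload check can flag entries without an exact version. This is a simplified sketch and not a full requirements parser: the function name unpinned_requirements is hypothetical, treating only == as a pin is an assumption about what you want to enforce, and lines that start with - (pip options such as -r) are skipped.

```python
import re

# A comment starts at '#' when it is at the start of the line or follows whitespace.
_COMMENT = re.compile(r"(^|\s+)#.*$")


def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = _COMMENT.sub("", raw).strip()
        if not line or line.startswith("-"):
            # Skip blank lines, comment-only lines, and pip option lines.
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Example usage with a file on disk:
#   with open("requirements.txt") as handle:
#       print(unpinned_requirements(handle.read()))
```

An empty result means that every requirement line contains an exact pin. It does not confirm that the pinned versions exist or are mutually compatible.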

Steps

  1. In the Cloudera Data Platform (CDP) management console, click the Data Engineering tile and click Overview.
  2. In the CDE Services column, select the service containing the virtual cluster where you want to create the Python virtual environment.
  3. In the Virtual Clusters column on the right, click the View Jobs icon for the virtual cluster where you want to create the Python virtual environment.
  4. Click Resources in the left menu.
  5. Click Create Resource at the top right.
  6. Specify a resource name, and then select Python Environment from the Type drop-down menu.
  7. Choose the Python version for the environment and, optionally, specify the PyPi Mirror URL. The PyPI mirror must be accessible from the CDP environment.
  8. Click Create.
  9. Click Upload File and select the requirements.txt file from your local machine. Alternatively, drag and drop the file onto the outlined area on the page.

Result

The UI displays Building the resource... while the Python virtual environment is building. After the environment is built, the page displays the Python packages and versions included in the environment.