Creating a Python virtual environment resource

After you have created the requirements.txt file, you can create the Python virtual environment resource.

Before you begin

  • Download and configure the CDE CLI.
  • Create a requirements.txt file specifying the Python package and version dependencies required by your CDE job.
  • Ensure that the following hostnames are reachable from within the cluster so that the Python packages install successfully if no PyPI mirror is configured:
    • pypi.python.org
    • pypi.org
    • pythonhosted.org
    • files.pythonhosted.org
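
As an illustration of the requirements.txt prerequisite, the following sketch writes a small file with pinned versions. The helper name write_requirements and the listed packages and versions are placeholders chosen for this example, not dependencies that CDE requires; replace them with the packages your job actually needs.

```shell
# Write an example requirements.txt file with pinned versions.
# The packages and versions are illustrative assumptions only.
write_requirements() {
  target="${1:-requirements.txt}"
  cat > "$target" <<'EOF'
numpy==1.26.4
pandas==2.2.2
requests==2.32.3
EOF
}

# Note: this overwrites any existing requirements.txt in the current directory.
write_requirements requirements.txt
```

Pinning exact versions with == keeps the environment build reproducible each time the resource is rebuilt.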

Steps

  1. Run the cde resource create command as follows to create a Python virtual environment resource.
    cde resource create --name <***RESOURCE_NAME***> --type <***ENVIRONMENT_RESOURCE_TYPE***>
    For example:
    cde resource create --name cde-python-env-resource --type python-env
    • [Optional] You can specify a custom pip repository using the --pip-repository-url <***CUSTOM-PIP-REPOSITORY-URL***> and --pip-repository-cert <***PATH-TO-PEM-FILE***> options in the cde resource create command.
    • [Optional] You can specify one or more extra custom pip repositories using the --extra-pip-repository-<***NUMBER***>-url <***CUSTOM-PIP-REPOSITORY-URL***> and --extra-pip-repository-<***NUMBER***>-cert <***PATH-TO-PEM-FILE***> options in the cde resource create command. You can specify up to 10 extra pip repositories.
    Example of a command with a pip repository and one extra pip repository:
    cde resource create --name cde-python-env-resource --type python-env --pip-repository-url https://pypi.example.com/simple --pip-repository-cert cert.pem --extra-pip-repository-1-url https://extra-pypi.example.com/simple --extra-pip-repository-1-cert extra-cert.pem
  2. Upload the requirements.txt file to the resource.
    cde resource upload --name cde-python-env-resource --local-path ${HOME}/requirements.txt
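
The two steps above can be combined in a small wrapper, sketched below. The function name create_python_env is hypothetical; it uses only the cde flags shown in the steps (--name, --type, --local-path) and assumes the CDE CLI is already downloaded and configured.

```shell
# Create a python-env resource and upload a requirements file to it.
# create_python_env is a hypothetical helper, not part of the CDE CLI.
create_python_env() {
  name="$1"
  requirements="$2"

  # Fail early if the requirements file is missing.
  if [ ! -f "$requirements" ]; then
    echo "requirements file not found: $requirements" >&2
    return 1
  fi

  # Step 1: create the resource. Step 2: upload the file to it.
  cde resource create --name "$name" --type python-env || return 1
  cde resource upload --name "$name" --local-path "$requirements"
}

# Example call (uncomment to run against a configured CDE CLI):
# create_python_env cde-python-env-resource "${HOME}/requirements.txt"
```

Checking for the file before calling cde avoids creating an empty resource when the path is wrong.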

Result

When you first create a Python virtual environment resource, CDE builds the environment according to the requirements.txt file. While the build is in progress, you cannot run a job associated with the virtual environment. You can check the status of the environment by running cde resource list-events --name <***RESOURCE_NAME***>. For example:

cde resource list-events --name cde-python-env-resource

The environment is ready when you see a message similar to the following:

  {
    "id": 4,
    "message": "Job pp-84kgdgf6-resource-builder-cde-python-env-resource-1634911572 succeeded, marking resource with ready status",
    "created": "2021-10-22T14:09:13Z"
  }
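
If you want to wait for the build in a script, one possible approach is to poll list-events until the ready message appears, as sketched below. The function name wait_for_python_env, the default attempt count, and the default delay are hypothetical, and the check assumes that the success event contains the same "marking resource with ready status" text as the sample output above.

```shell
# Poll list-events until the ready message appears or the attempts run out.
# wait_for_python_env is a hypothetical helper, not part of the CDE CLI.
wait_for_python_env() {
  name="$1"
  attempts="${2:-30}"   # assumed default: 30 polls
  delay="${3:-20}"      # assumed default: 20 seconds between polls
  i=1

  while [ "$i" -le "$attempts" ]; do
    # Assumption: the success event text matches the sample message.
    if cde resource list-events --name "$name" | grep -q "marking resource with ready status"; then
      echo "Resource $name is ready"
      return 0
    fi
    # Sleep only when another attempt remains.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done

  echo "Timed out waiting for resource $name" >&2
  return 1
}

# Example call (uncomment to run against a configured CDE CLI):
# wait_for_python_env cde-python-env-resource
```

The function returns a non-zero status on timeout, so a calling script can stop before submitting a job that depends on the environment.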

Before you begin

  • Create a requirements.txt file specifying the Python package and version dependencies required by your CDE job.
  • Ensure that the following hostnames are reachable from within the cluster so that the Python packages install successfully if no PyPI mirror is configured:
    • pypi.python.org
    • pypi.org
    • pythonhosted.org
    • files.pythonhosted.org

Steps

  1. In the Cloudera Data Platform (CDP) console, click the Data Engineering tile. The CDE Home page displays.
  2. Click Resources in the left navigation menu and then click Create Resource.
  3. Specify a resource name, and then select Python Environment from the Type drop-down menu.
  4. Choose the Python version for the environment and optionally specify the PyPi Mirror URL. The PyPI mirror must be accessible from the CDP environment.
  5. Click Create.
  6. Click Upload File and select the requirements.txt file from your local machine. You can also drag and drop the file onto the outlined area on the page.

Result

The UI displays Building the resource... while the Python virtual environment is building. After the environment is built, the page displays the Python packages and versions included in the environment.