Creating a Python virtual environment resource
After you have created the requirements.txt file, you can create the Python virtual environment resource.
Before you begin
- Download and configure the CDE CLI.
- Create a requirements.txt file specifying the Python package and version dependencies required by your CDE job.
- Ensure that the following hostnames are reachable from within the cluster so that the Python packages can be installed when no PyPI mirror is configured:
- pypi.python.org
- pypi.org
- pythonhosted.org
- files.pythonhosted.org
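The reachability requirement above can be spot-checked with a short script. This is a minimal sketch, not a CDE feature: it assumes curl is available, that it is run from a host or pod inside the cluster network, and that a successful HTTPS connection is a good enough signal. The function names check_host and report_hosts are hypothetical.

```shell
# Hostnames taken from the prerequisite list above.
PYPI_HOSTS="pypi.python.org pypi.org pythonhosted.org files.pythonhosted.org"

# check_host HOST
# Succeeds if an HTTPS connection to HOST completes within 10 seconds.
# Any HTTP response counts; only connectivity is tested, not the status code.
check_host() {
  curl --silent --head --max-time 10 --output /dev/null "https://$1"
}

# report_hosts
# Prints one line per hostname: "<host> reachable" or "<host> UNREACHABLE".
report_hosts() {
  for host in $PYPI_HOSTS; do
    if check_host "$host"; then
      echo "$host reachable"
    else
      echo "$host UNREACHABLE"
    fi
  done
}
```

Call report_hosts after defining the functions. If a PyPI mirror is configured, these hostnames do not need to be reachable and the check can be skipped.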
Steps
- Run the cde resource create command as follows to create a Python virtual environment resource:
  cde resource create --name <***RESOURCE_NAME***> --type <***ENVIRONMENT_RESOURCE_TYPE***>
  For example:
  cde resource create --name cde-python-env-resource --type python-env
- [Optional] You can specify a custom pip repository using the --pip-repository-url <***CUSTOM-PIP-REPOSITORY-URL***> --pip-repository-cert <***PATH-TO-PEM-FILE***> options in the create resource command.
- [Optional] You can specify one or more extra custom pip repositories using the --extra-pip-repository-<***NUMBER***>-url <***CUSTOM-PIP-REPOSITORY-URL***> --extra-pip-repository-<***NUMBER***>-cert <***PATH-TO-PEM-FILE***> options in the create resource command. You can specify up to 10 extra pip repositories.
Example of a command with a pip repository and an extra pip repository:
  cde resource create --name cde-python-env-resource --type python-env --pip-repository-url https://pypi.example.com/simple --pip-repository-cert cert.pem --extra-pip-repository-1-url https://extra-pypi.example.com/simple --extra-pip-repository-1-cert extra-cert.pem
- Upload the requirements.txt file to the resource:
  cde resource upload --name cde-python-env-resource --local-path ${HOME}/requirements.txt
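The create and upload steps above can be combined into one helper. The following is a sketch that uses only the two commands shown in the steps; the function name create_python_env and the file-existence check are illustrative additions, not part of the CDE CLI.

```shell
# create_python_env NAME REQUIREMENTS_PATH
# Creates a python-env resource called NAME, then uploads the requirements
# file to it. Stops before calling the CLI if the file does not exist, and
# skips the upload if the create command fails.
create_python_env() {
  name="$1"
  req="$2"
  if [ ! -f "$req" ]; then
    echo "requirements file not found: $req" >&2
    return 1
  fi
  cde resource create --name "$name" --type python-env &&
    cde resource upload --name "$name" --local-path "$req"
}
```

For example, create_python_env cde-python-env-resource "${HOME}/requirements.txt" runs the same two commands as the steps above. The optional pip repository options are left out here and would need to be added to the create command by hand.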
Result
When you first create a Python virtual environment resource, CDE builds the environment according to the requirements.txt file. During this build time, you cannot run a job associated with the virtual environment. You can check the status of the environment by running cde resource list-events --name <resource_name>.
For example:
cde resource list-events --name cde-python-env-resource
The environment is ready when you see a message similar to the following:
{
"id": 4,
"message": "Job pp-84kgdgf6-resource-builder-cde-python-env-resource-1634911572 succeeded, marking resource with ready status",
"created": "2021-10-22T14:09:13Z"
}
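Rather than rerunning list-events by hand, the check can be wrapped in a polling loop. This is a rough sketch under one loud assumption: it treats the phrase "marking resource with ready status" from the sample message above as the readiness signal, and the text states only that the real message is similar, so the pattern may need adjusting. The function names, the polling interval, and the attempt limit are all hypothetical choices.

```shell
# is_ready
# Reads list-events output on stdin and succeeds if it contains the
# ready-status phrase from the sample event (assumed, see the lead-in).
is_ready() {
  grep -q "marking resource with ready status"
}

# wait_for_resource NAME [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Polls "cde resource list-events" until the ready message appears.
# Defaults: check every 30 seconds, give up after 40 checks.
wait_for_resource() {
  name="$1"
  interval="${2:-30}"
  attempts="${3:-40}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if cde resource list-events --name "$name" | is_ready; then
      echo "$name is ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "$name not ready after $attempts checks" >&2
  return 1
}
```

For example, wait_for_resource cde-python-env-resource blocks until the environment is built or the attempt limit is reached, which makes it usable as a gate before submitting a job that depends on the environment.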
Before you begin
- Create a requirements.txt file specifying the Python package and version dependencies required by your CDE job.
- Ensure that the following hostnames are reachable from within the cluster so that the Python packages can be installed when no PyPI mirror is configured:
- pypi.python.org
- pypi.org
- pythonhosted.org
- files.pythonhosted.org
Steps
- In the Cloudera Data Platform (CDP) console, click the Data Engineering tile. The CDE Home page displays.
- Click Resources in the left navigation menu and then click Create Resource.
- Specify a resource name, and then select Python Environment from the Type drop-down menu.
- Choose the Python version for the environment and optionally specify the PyPi Mirror URL. The PyPi mirror must be accessible from the CDP environment.
- Click Create.
- Click Upload File and select the requirements.txt file from your local machine. You can also drag and drop the file onto the outlined area on the page.
Result
The UI displays Building the resource... while the Python virtual environment is building. After the environment is built, the page displays the Python packages and versions included in the environment.