Creating an Extensible Engine with Conda

Cloudera Data Science Workbench recommends using pip for package management along with a requirements.txt file (as described in the previous section).

As an alternative, Cloudera Data Science Workbench lets you use Conda to extend its base engine image with packages of your choice. To create an extended engine:
  1. Add the following lines to a Dockerfile that extends the base engine, build the image and push it to your Docker registry, then add the new engine to the allowlist for your project. For more details on this step, see Extensible Engines.
    Python 2
    RUN mkdir -p /opt/conda/envs/python2.7
    RUN conda install -y nbconvert python=2.7.11 -n python2.7
    Python 3
    RUN mkdir -p /opt/conda/envs/python3.6
    RUN conda install -y nbconvert python=3.6.1 -n python3.6
  2. Set the PYTHONPATH environment variable as shown below. You can set it either globally in the site administrator dashboard, or for a specific project on the project's Settings > Engine page.
    Python 2
    PYTHONPATH=$PYTHONPATH:/opt/conda/envs/python2.7/lib/python2.7/site-packages
    Python 3
    PYTHONPATH=$PYTHONPATH:/opt/conda/envs/python3.6/lib/python3.6/site-packages
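
After both steps are in place, a quick check from inside a session can confirm that the Conda site-packages directory is visible to the interpreter. The following is a minimal sketch, not part of the product: the helper names conda_site_packages and is_on_path are hypothetical, and the script assumes the environment names and the /opt/conda/envs layout used in the steps above.

```python
import os
import sys


def conda_site_packages(env_name, version):
    """Build the site-packages path for a Conda env under /opt/conda/envs.

    Assumes the directory layout used in the steps above.
    """
    return "/opt/conda/envs/{0}/lib/python{1}/site-packages".format(
        env_name, version
    )


def is_on_path(directory, search_path=None):
    """Return True if directory appears in search_path (default: sys.path)."""
    if search_path is None:
        search_path = sys.path
    target = os.path.normpath(directory)
    # Skip empty entries, which stand for the current directory.
    return any(os.path.normpath(p) == target for p in search_path if p)


if __name__ == "__main__":
    # Derive "2.7" or "3.6" from the running interpreter.
    version = "{0}.{1}".format(*sys.version_info[:2])
    expected = conda_site_packages("python" + version, version)
    print("expected directory: {0}".format(expected))
    print("exists on disk:     {0}".format(os.path.isdir(expected)))
    print("on sys.path:        {0}".format(is_on_path(expected)))
```

If the last line prints False, the PYTHONPATH setting from step 2 has probably not reached the session. If the directory does not exist on disk, the engine image from step 1 is likely not the one in use.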