Create a Dockerfile for the New Custom Image
The first step in building a customized image is to create a Dockerfile that specifies which packages to install on top of the base image.
When creating the Dockerfile, you must delete the Cloudera repository definitions, which are inaccessible because of the paywall, by including the following instruction:
RUN rm /etc/apt/sources.list.d/*
For example, the following Dockerfile installs the beautifulsoup4 package on top of the base Ubuntu image that ships with Cloudera Data Science Workbench.
# Dockerfile
# Specify a Cloudera Data Science Workbench base image
FROM docker.repository.cloudera.com/cdsw/engine:8
RUN rm /etc/apt/sources.list.d/*
# Refresh the apt package index on the base image and install beautifulsoup4
RUN apt-get update
RUN pip install beautifulsoup4 && pip3 install beautifulsoup4
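The same pattern can be extended to system packages installed through apt. The following is a minimal sketch rather than a tested recipe: it assumes the default Ubuntu repositories remain reachable once the Cloudera repository files are removed, and the graphviz package is a hypothetical choice used only for illustration.

```dockerfile
# Dockerfile (illustrative variant; graphviz is a hypothetical example package)
FROM docker.repository.cloudera.com/cdsw/engine:8

# Remove the inaccessible Cloudera repository definitions before using apt
RUN rm /etc/apt/sources.list.d/*

# Refresh the package index and install a system package in a single layer,
# then clear the cached index files to keep the image smaller
RUN apt-get update && \
    apt-get install -y --no-install-recommends graphviz && \
    rm -rf /var/lib/apt/lists/*

# Install the Python package for both Python 2 and Python 3, as in the example above
RUN pip install beautifulsoup4 && pip3 install beautifulsoup4
```

Combining apt-get update and apt-get install in one RUN instruction prevents Docker from reusing a cached layer that holds a stale package index when the list of packages changes later.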