Add Cloudera Data Science Workbench as an Interpreter for PyCharm

In PyCharm, you can configure an SSH interpreter. Cloudera Data Science Workbench uses this method to connect to PyCharm and act as its interpreter.

Before you begin, ensure that the SSH endpoint for Cloudera Data Science Workbench is running on your local machine. These instructions were written for the Professional Edition of PyCharm Version 2019.1 and are meant as a starting point. If additional information is required, see the documentation for your version of PyCharm for specific instructions.
  1. Verify that the SSH endpoint for Cloudera Data Science Workbench is running with cdswctl. If the endpoint is not running, start it.
  2. Open PyCharm.
  3. Create a new project.
  4. Expand Project Interpreter and select Existing interpreter.
  5. Click ... and select SSH Interpreter.
  6. Select New server configuration and complete the fields:
    • Host: localhost
    • Port: <port_number>

      This is the port number provided by cdswctl.

    • Username: cdsw
  7. Select Key pair and complete the fields using the RSA private key that corresponds to the public key you added to the Remote Editing tab in the Cloudera Data Science Workbench web UI.
    On macOS, you must add your RSA private key to your keychain. In a terminal window, run the following command:
    ssh-add -K <path to your private key>/<private_key>
  8. Complete the wizard. Based on the Python version you want to use, enter one of the following interpreter paths:
    • For Python 2: /usr/local/bin/python
    • For Python 3: /usr/local/bin/python3
    You are returned to the New Project window. Existing interpreter is selected, and you should see the connection to Cloudera Data Science Workbench in the Interpreter field.
  9. In the Remote project location field, specify the following directory:
    /home/cdsw
  10. Create the project.
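
If PyCharm cannot connect in step 6, it can help to confirm that something is actually answering as an SSH server on the local port reported by cdswctl. The following is a minimal sketch, not part of Cloudera Data Science Workbench or PyCharm: the function name ssh_endpoint_is_up is hypothetical, and the check assumes the endpoint sends a standard SSH identification string (one beginning with "SSH-") as soon as a client connects.

```python
import socket


def ssh_endpoint_is_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a service on host:port greets the client like an SSH server.

    Hypothetical helper for illustration only. It opens a TCP connection,
    reads the first bytes the server sends, and checks for the "SSH-" prefix
    of an SSH identification string. It does not authenticate.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            banner = conn.recv(255)
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
    return banner.startswith(b"SSH-")
```

For example, ssh_endpoint_is_up("localhost", <port_number>) with the port number provided by cdswctl should return True while the endpoint is running. A False result suggests that the endpoint is not started or that the port number is wrong; it does not say anything about whether your key pair is accepted.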