Accessing Cloudera AI Workbenches via SOCKS Proxy

This topic describes how to configure a SOCKS proxy to access Cloudera AI Workbenches on non-publicly routable VPCs. A SOCKS proxy server allows your web browser to connect directly and securely to your Cloudera AI Workbenches without exposing their ports outside the subnet.

  1. In the non routable VPC, create an EC2 instance for your SOCKS server (for example, my-ec2-socks-server) with a public IP and an SSH key-pair (for example, my-key-file.pem).
    Use the AWS documentation to create the EC2 instance: Getting Started with Amazon EC2 Instances
    Depending on whether you want multiple users to share the SOCKS server or have everyone create their own server, pick the SSH key pair for the instance accordingly. More information is available in the AWS documentation: Amazon EC2 Key Pairs.
  2. Set up a SOCKS proxy server with SSH to access the EC2 instance, my-ec2-socks-server.
    nohup ssh -i
            "my-key-file.pem" -CND 8157
            ec2-user@<public ip for my-ec2-socks-server> &
    • nohup (optional) is a POSIX command to ignore the HUP (hangup) signal so that the proxy process is not terminated automatically if the terminal process is later terminated.
    • my-key-file.pem is the private key you used to create the EC2 instance where the SOCKS server is running.
    • C sets up compression.
    • N suppresses any command execution once established.
    • D 8157 sets up the SOCKS 5 proxy on the port. (The port number 8157 in this example is arbitrary, but must match the port number you specify in your browser configuration in the next step.)
    • ec2-user is the AMI username for the EC2 instance. The AMI username can be found in the details for the instance displayed in the AWS Management Console on the Instances page under the Usage Instructions tab.
    • <public ip for my-ec2-socks-server> is the public IP address of the EC2 instance running the SOCKS server.
    • & (optional) causes the SSH connection to run as an operating system background process, independent of the command shell. (Without the &, you leave your terminal open while the proxy server is running and use another terminal window to issue other commands.)
  3. Configure Your Browser to Use the Proxy. This example uses Google Chrome.

    By default, Google Chrome uses system-wide proxy settings on a per-profile basis. To get around that you can start Chrome using the command line and specify the following:

    • The SOCKS proxy port to use (must be the same value used in step 1)
    • The profile to use (this example creates a new profile)

    This creates a new profile and launches a new instance of Chrome that does not interfere with any currently running instance.

    • Linux
      /usr/bin/google-chrome \
      --user-data-dir="$HOME/chrome-with-proxy" \
      --proxy-server="socks5://localhost:8157"
    • MacOS
      "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
      --user-data-dir="$HOME/chrome-with-proxy" \
      --proxy-server="socks5://localhost:8157"
    • Windows
      "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" ^
      --user-data-dir="%USERPROFILE%\chrome-with-proxy" ^
      --proxy-server="socks5://localhost:8157"
You shall now be able to navigate to any Cloudera AI Workbench in the browser launched using SOCKS proxy.

When you connect to the Cloudera AI Workbench, the browser actually connects to the proxy server, which performs the required SSH tunneling.