Configuring a SOCKS Proxy for Amazon EC2

In AWS, the security group that you create and specify for your EC2 instances functions as a firewall to prevent unwanted access to your cluster and Cloudera Manager. For security purposes, Cloudera recommends that you do not configure security groups to allow internet access to your instances on their public IP addresses. Instead, Cloudera recommends that you connect to your cluster and to Cloudera Manager using a SOCKS proxy server. A SOCKS proxy server allows a client (such as your web browser) to connect directly and securely to a server (such as your Cloudera Director server web UI) and, from there, to the web UIs on other IP addresses and ports in the same subnet, including the Cloudera Manager and Hue web UIs. So, the SOCKS proxy provides access to the Cloudera Director UI, Cloudera Manager UI, Hue UI, and any other cluster web UIs without exposing their ports outside the subnet.

To set up a SOCKS proxy for your web browser, follow the steps below.

Step 1: Set Up a SOCKS Proxy Server with SSH

Set up a SOCKS proxy server with SSH to access the EC2 instance running Cloudera Director. For example, run the following command (with your instance information):

nohup ssh -i "your-key-file.pem" -CND 8157 ec2-user@instance_running_director_server &


  • nohup (optional) is a POSIX command to ignore the HUP (hangup) signal so that the proxy process is not terminated automatically if the terminal process is later terminated.
  • your-key-file.pem is the private key you used to create the EC2 instance where Cloudera Director is running.
  • C sets up compression.
  • N suppresses execution of any remote command once the connection is established, so the session is used only for forwarding.
  • D 8157 sets up the SOCKS 5 proxy on local port 8157. (The port number 8157 in this example is arbitrary, but must match the port number you specify in your browser configuration in the next step.)
  • ec2-user is the AMI username for the EC2 instance where Cloudera Director is running. The AMI username can be found in the details for the instance displayed in the AWS Management Console on the Instances page under the Usage Instructions tab.
  • instance_running_director_server is the private IP address of the EC2 instance running Cloudera Director server, if your networking configuration provides access to it, or its public IP address if not.
  • & (optional) causes the SSH connection to run as an operating system background process, independent of the command shell. (Without the &, you would leave your terminal open while the proxy server is running and use another terminal window to issue other commands.)
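
The options above can also be assembled in a small script. The following is a minimal sketch rather than part of the documented procedure: the function names tunnel_cmd and start_tunnel are hypothetical, every value is a placeholder, and start_tunnel assumes the host key of the instance has already been accepted, because a background process cannot answer an interactive prompt.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder to replace with your own.
KEY_FILE="${KEY_FILE:-your-key-file.pem}"
SOCKS_PORT="${SOCKS_PORT:-8157}"
REMOTE="${REMOTE:-ec2-user@instance_running_director_server}"

# Print the tunnel command with each option written out separately
# (-i key file, -C compression, -N no remote command, -D SOCKS port),
# so it can be reviewed before it is run.
tunnel_cmd() {
  printf 'ssh -i %s -C -N -D %s %s\n' "$1" "$2" "$3"
}

# Start the tunnel in the background and print its process ID,
# which can be passed to `kill` later to close the proxy.
# (Hypothetical helper; it is defined here but not called.)
start_tunnel() {
  nohup ssh -i "$1" -C -N -D "$2" "$3" >/dev/null 2>&1 &
  echo "$!"
}

# Dry run: show the command without connecting.
tunnel_cmd "$KEY_FILE" "$SOCKS_PORT" "$REMOTE"
```

Recording the process ID is one way to stop the proxy later without searching for the ssh process by name.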

Step 2: Configure Your Browser to Use the Proxy

Next, configure your browser settings to use the SOCKS proxy.

On Google Chrome

By default, Google Chrome uses system-wide proxy settings on a per-profile basis. To get around that, you can launch Chrome from the command line and specify the following:
  • The SOCKS proxy port to use (this must be the same value used above)
  • The profile to use (this example will create a new profile)

This will create a new profile and launch a new instance of Chrome that won’t interfere with your current running instance of Chrome.

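Linux
/usr/bin/google-chrome \
--user-data-dir="$HOME/chrome-with-proxy" \
--proxy-server="socks5://localhost:8157"
Mac OS X
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
--user-data-dir="$HOME/chrome-with-proxy" \
--proxy-server="socks5://localhost:8157"
Microsoft Windows
"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" ^
--user-data-dir="%USERPROFILE%\chrome-with-proxy" ^
--proxy-server="socks5://localhost:8157"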

In this Chrome session, you can now connect to any host that is accessible from the Cloudera Director instance, using the host's private IP address or internal FQDN. For example, when you ask the browser to connect to the Cloudera Director server, Cloudera Manager server, or Hue UI server, the browser actually connects to the proxy server, which takes care of the SSH tunneling.
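
If a page does not load, it can help to test the proxy outside the browser. The following is a rough sketch that assumes curl is installed and that the tunnel from Step 1 is listening on port 8157. TARGET_URL is a placeholder for a private address or internal FQDN of your choosing, and the helper names socks_endpoint and check_proxy are hypothetical.

```shell
#!/bin/sh
# Assumption: the SSH tunnel from Step 1 listens on this local port.
SOCKS_PORT="${SOCKS_PORT:-8157}"

# Proxy address in the host:port form that curl expects.
socks_endpoint() {
  printf 'localhost:%s\n' "$1"
}

# Request only the response headers through the SOCKS proxy.
# --socks5-hostname asks the proxy to resolve the host name, so internal
# FQDNs work even if the local machine cannot resolve them.
check_proxy() {
  curl --silent --head --max-time 10 \
    --socks5-hostname "$(socks_endpoint "$SOCKS_PORT")" "$1"
}

# Run the check only when a target URL has been provided.
if [ -n "${TARGET_URL:-}" ]; then
  check_proxy "$TARGET_URL"
else
  echo "TARGET_URL is not set; skipping the proxy check."
fi
```

If the request prints response headers, the tunnel and port are working, and any remaining problem is likely in the browser configuration.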

Setting Up SwitchyOmega on the Google Chrome Browser

If you are using Google Chrome, and especially if you use multiple proxies, the SwitchyOmega browser extension is a convenient tool for configuring and managing all of your proxies in one place and for switching from one proxy to another.

  1. Open Google Chrome and go to Chrome Extensions.
  2. Search for Proxy SwitchyOmega and add it to Chrome.
  3. In the Profiles menu of the SwitchyOmega Options screen, click New profile and do the following:
    1. In the Profile Name field, enter AWS-Cloudera.
    2. Select the type PAC Profile.
    3. The proxy autoconfig (PAC) script contains the rules required for Cloudera Director. Enter or copy the following into the PAC Script field:
      function regExpMatch(url, pattern) {
        try { return new RegExp(pattern).test(url); } catch(ex) { return false; }
      }
      function FindProxyForURL(url, host) {
          // Important: replace 172.31 below with the proper prefix for your VPC subnet
          if (shExpMatch(url, "*172.31.*")) return "SOCKS5 localhost:8157";
          if (shExpMatch(url, "*ec2**")) return 'SOCKS5 localhost:8157';
          if (shExpMatch(url, "*.compute.internal*") || shExpMatch(url, "*://compute.internal*")) return 'SOCKS5 localhost:8157';
          if (shExpMatch(url, "*ec2.internal*")) return 'SOCKS5 localhost:8157';
          return 'DIRECT';
      }
  4. In the Actions menu, click Apply Changes.
  5. On the Chrome toolbar, select the AWS-Cloudera profile for SwitchyOmega.

You are now ready to deploy Cloudera Manager and CDH.