Configuring a SOCKS Proxy for Amazon EC2

In AWS, the security group that you create and specify for your EC2 instances functions as a firewall to prevent unwanted access to your cluster and Cloudera Manager. For security, Cloudera recommends that you not configure security groups to allow internet access to your instances on the instances' public IP addresses. Instead, connect to your cluster and to Cloudera Manager using a SOCKS proxy server. A SOCKS proxy server allows a client (such as your web browser) to connect directly and securely to a server (such as your Cloudera Director server web UI) and, from there, to the web UIs on other IP addresses and ports in the same subnet, including the Cloudera Manager and Hue web UIs. The SOCKS proxy provides you with access to the Cloudera Director UI, Cloudera Manager UI, Hue UI, and any other cluster web UIs without exposing their ports outside the subnet.

To set up a SOCKS proxy for your web browser, follow the steps below.

Step 1: Set Up a SOCKS Proxy Server with SSH

Set up a SOCKS proxy server with SSH to access the EC2 instance running Cloudera Director. For example, run the following command (with your instance information):

nohup ssh -i "your-key-file.pem" -CND 8157 \
        ec2-user@instance_running_director_server &

where

  • nohup (optional) is a POSIX command to ignore the HUP (hangup) signal so that the proxy process is not terminated automatically if the terminal process is later terminated.
  • your-key-file.pem is the private key you used to create the EC2 instance where Cloudera Director is running.
  • -C enables compression.
  • -N prevents execution of a remote command once the connection is established, which suits a session used only for forwarding.
  • -D 8157 sets up a SOCKS5 proxy listening on local port 8157. (The port number 8157 in this example is arbitrary, but must match the port number you specify in your browser configuration in the next step.)
  • ec2-user is the AMI username for the EC2 instance where Cloudera Director is running. The AMI username can be found in the details for the instance displayed in the AWS Management Console on the Instances page under the Usage Instructions tab.
  • instance_running_director_server is the private IP address of the EC2 instance running Cloudera Director server, if your networking configuration provides access to it, or its public IP address if not.
  • & (optional) causes the SSH connection to run as an operating system background process, independent of the command shell. (Without the &, you leave your terminal open while the proxy server is running and use another terminal window to issue other commands.)
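
Before configuring a browser, you may want to confirm that the tunnel is actually forwarding traffic. The following is a minimal sketch and not part of the Cloudera procedure. It assumes that curl is installed, that the proxy listens on port 8157 as in the command above, and that the Cloudera Director web UI is served on port 7189 (an assumption; substitute the URL you actually use). The function name check_proxy is illustrative.

```shell
# check_proxy PORT URL
# Returns 0 if curl receives any HTTP response from URL through the
# SOCKS5 proxy on localhost:PORT, and a nonzero status otherwise.
check_proxy() {
    # --socks5-hostname asks the proxy side (the EC2 instance) to resolve
    # the hostname, so private DNS names inside the VPC can be used.
    curl --silent --output /dev/null --max-time 10 \
        --socks5-hostname "localhost:$1" "$2"
}

# Example usage (placeholder host, assumed port 7189):
#   check_proxy 8157 "http://instance_running_director_server:7189/" \
#       && echo "proxy OK" || echo "proxy not reachable"
```

A nonzero status usually means that the ssh process is not running, that the port does not match the one given to -D, or that the target host is not reachable from the instance.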

Step 2: Configure Your Browser to Use the Proxy

On Google Chrome

By default, Google Chrome uses the system-wide proxy settings for every profile. To route only one Chrome instance through the SOCKS proxy, start Chrome from the command line and specify the following:
  • The SOCKS proxy port to use (must be the same value used in step 1)
  • The profile to use (this example creates a new profile)

This creates a new profile and launches a new instance of Chrome that does not interfere with any currently running instance.

Linux

/usr/bin/google-chrome \
--user-data-dir="$HOME/chrome-with-proxy" \
--proxy-server="socks5://localhost:8157"

Mac OS X

"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
--user-data-dir="$HOME/chrome-with-proxy" \
--proxy-server="socks5://localhost:8157"

Microsoft Windows

"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" ^
--user-data-dir="%USERPROFILE%\chrome-with-proxy" ^
--proxy-server="socks5://localhost:8157"

Now in this Chrome session, you can connect to any Cloudera Director–accessible host using the private IP address or internal fully qualified domain name (FQDN). For example, when you connect to the Cloudera Director server, Cloudera Manager server, or Hue UI server, the browser actually connects to the proxy server, which performs the SSH tunneling.

Setting Up SwitchyOmega on the Google Chrome Browser

If you use Google Chrome, and especially if you use multiple proxies, the SwitchyOmega browser extension is a convenient tool to configure and manage all of your proxies in one place and switch from one proxy to another.

  1. Open Google Chrome and go to Chrome Extensions.
  2. Search for Proxy SwitchyOmega and add it to Chrome.
  3. In the Profiles menu of the SwitchyOmega Options screen, click New profile and do the following:
    1. In the Profile Name field, enter AWS-Cloudera.
    2. Select the type PAC Profile.
    3. The proxy autoconfig (PAC) script contains the rules required for Cloudera Director. Enter or copy the following into the PAC Script field:
      // Helper for regular-expression matching; not used by the rules below.
      function regExpMatch(url, pattern) {
        try { return new RegExp(pattern).test(url); } catch(ex) { return false; }
      }

      function FindProxyForURL(url, host) {
          // Important: replace 172.31 below with the proper prefix for your VPC subnet

          // Send VPC-internal addresses and EC2 host names through the SOCKS proxy.
          if (shExpMatch(url, "*172.31.*")) return "SOCKS5 localhost:8157";
          if (shExpMatch(url, "*ec2*.amazonaws.com*")) return "SOCKS5 localhost:8157";
          if (shExpMatch(url, "*.compute.internal*") || shExpMatch(url, "*://compute.internal*")) return "SOCKS5 localhost:8157";
          if (shExpMatch(url, "*ec2.internal*")) return "SOCKS5 localhost:8157";
          // Everything else bypasses the proxy.
          return "DIRECT";
      }
  4. In the Actions menu, click Apply Changes.
  5. On the Chrome toolbar, select the AWS-Cloudera profile for SwitchyOmega.
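
The shExpMatch function in the PAC script compares the URL against shell-style wildcard patterns, so for patterns that use only * the routing decisions can be previewed with an ordinary shell case statement. The sketch below is an approximation for checking patterns before you paste them into SwitchyOmega, not a replacement for testing in the browser. The function name route_for_url and the sample URLs are made up for illustration, and 172.31 must be replaced with your VPC prefix, just as in the PAC script.

```shell
# route_for_url URL
# Prints the proxy decision that the PAC rules would make for URL.
route_for_url() {
    case "$1" in
        *172.31.*)                                  echo "SOCKS5 localhost:8157" ;;
        *ec2*.amazonaws.com*)                       echo "SOCKS5 localhost:8157" ;;
        *.compute.internal*|*://compute.internal*)  echo "SOCKS5 localhost:8157" ;;
        *ec2.internal*)                             echo "SOCKS5 localhost:8157" ;;
        *)                                          echo "DIRECT" ;;
    esac
}

route_for_url "http://172.31.5.10:7180/"    # prints: SOCKS5 localhost:8157
route_for_url "https://www.example.com/"    # prints: DIRECT
```

Because each pattern begins and ends with *, it matches anywhere in the URL. A public URL whose path or query string happens to contain 172.31. would therefore also be sent through the proxy.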

You are now ready to deploy Cloudera Manager and CDH.