Use a non-transparent proxy with Cloudera Machine Learning on AWS environments

Cloudera Machine Learning can use non-transparent proxies if the environment is configured to use a network proxy in Management Console.

Enterprise customers frequently need to deploy Cloudera in a virtual network that does not have direct internet access. In these deployments, outbound traffic goes through a proxy server, which may be located in a different virtual network, so that traffic can be filtered by allowed domain or IP address.

Transparent and non-transparent network proxies differ in the following ways.

Transparent network proxy
  • The proxy is invisible to clients and requires no additional client configuration.
  • Connections through a transparent proxy are usually configured in the route tables of your AWS VPC.
Non-transparent proxy
  • Clients are aware of the proxy, and each client must be explicitly configured to connect through it.
  • Clients pass connection or security information (such as a username and password) along with each connection request.
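
To illustrate what "explicitly configured" means for a non-transparent proxy, the following is a minimal Python sketch of a generic client that routes its requests through a proxy and supplies credentials. It is not Cloudera's implementation. The host name, port, and credentials are hypothetical placeholders, and the helper names build_proxy_url and make_proxy_opener are invented for this example.

```python
import urllib.request
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None, scheme="http"):
    """Return a proxy URL, embedding percent-encoded credentials if given."""
    if username is None:
        return f"{scheme}://{host}:{port}"
    # Percent-encode credentials so characters such as '@' or ':' do not
    # break the URL structure.
    user = quote(username, safe="")
    pwd = quote(password or "", safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"


def make_proxy_opener(proxy_url):
    """Build an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical values, for illustration only.
proxy_url = build_proxy_url("proxy.example.internal", 3128, "svc-user", "p@ss:word")
opener = make_proxy_opener(proxy_url)
print(proxy_url)
```

A request made with opener.open(...) would then be sent to the proxy rather than directly to the destination. With a transparent proxy, none of this client-side setup is needed, because routing is handled at the network level.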

You can configure an AWS environment to use non-transparent proxy connections when activating environments for Cloudera Machine Learning.

Use a non-transparent proxy in a different VPC

If you want to use the hostname of the non-transparent proxy and the proxy is configured in a different VPC, Cloudera needs the CIDR of the proxy in order to allow inbound access. To configure this, in the Provision UI, select Use hostname for non-transparent proxy and enter the CIDR range in Inbound Proxy CIDR Ranges.
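
Before entering a value in Inbound Proxy CIDR Ranges, it can help to confirm that the range is well formed and actually covers the proxy's address. The following is an optional local sanity check, sketched in Python with the standard ipaddress module. It is not part of the Provision UI, and the function name and addresses are hypothetical.

```python
import ipaddress


def proxy_covered_by_cidrs(proxy_ip, cidr_ranges):
    """Return True if proxy_ip falls inside at least one of cidr_ranges.

    Raises ValueError if an address or CIDR range is malformed, including
    a range with host bits set (for example 10.20.1.0/16).
    """
    addr = ipaddress.ip_address(proxy_ip)
    networks = [ipaddress.ip_network(cidr, strict=True) for cidr in cidr_ranges]
    return any(addr in net for net in networks)


# Hypothetical proxy address and candidate CIDR range.
print(proxy_covered_by_cidrs("10.20.1.15", ["10.20.0.0/16"]))
```

If the check returns False, the range you plan to enter does not include the proxy's address, so inbound access from the proxy would likely not be allowed.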