Installing a non-transparent proxy in a CML environment

If Cloudera Machine Learning (CML) is used in an air-gapped environment, a proxy configuration is not mandatory. If a non-transparent proxy is used, then certain endpoints must be added to the list of allowed endpoints for the proxy.

Configure the No Proxy value with the Classless Inter-Domain Routing (CIDR) ranges for the nodes, the pods (pod CIDR), and the services (service CIDR). Also add any IP range used by internal services that have seamless internal network connectivity, so that traffic destined for these ranges bypasses the proxy. Separate the No Proxy entries with commas and do not put spaces between them.
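The comma-separated, space-free format described above can be sketched in Python as follows. This is only an illustration, not part of the product: the helper name build_no_proxy and the node range 192.168.10.0/24 are hypothetical, and the pod and service ranges reuse the ECS values listed in the No proxy field description.

```python
import ipaddress


def build_no_proxy(entries):
    """Join CIDR ranges and hostnames into one comma-separated value without spaces."""
    cleaned = []
    for entry in entries:
        value = entry.strip()
        if not value:
            continue
        if "/" in value:
            # Raises ValueError if the entry is not valid CIDR notation.
            ipaddress.ip_network(value, strict=False)
        if value not in cleaned:
            # Keep the first occurrence only, so duplicates are dropped.
            cleaned.append(value)
    return ",".join(cleaned)


# Hypothetical node range; pod and service ranges are the ECS examples.
node_cidr = "192.168.10.0/24"
pod_cidr = "10.42.0.0/16"
service_cidr = "10.43.0.0/16"

no_proxy = build_no_proxy([node_cidr, pod_cidr, " 10.43.0.0/16 ", service_cidr])
print(no_proxy)
```

The resulting string can be pasted into the No proxy field. Stray spaces and duplicate entries are removed before the values are joined.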

If your CDP Private Cloud deployment uses a non-transparent network proxy, configure proxy hosts that the workloads can use for connections with CML workspaces. You can set the proxy values from the Management Console.

The procedure for updating these settings might differ depending on the proxy server software used.

  1. Sign in to the CDP console.
  2. Click Management Console.
  3. On the Management Console home page, select Administration > Networks to view the Networks page.
  4. Configure the following options for the proxy values:
    Table 1. Proxy values

    Field: HTTPS proxy
    Description: The HTTP or HTTPS proxy connection string used with the CML workspaces. You must specify this connection string in the form: http(s)://[***USERNAME***]:[***PASSWORD***]@[***HOST***]:[***PORT***].
    The [***USERNAME***] and [***PASSWORD***] parameters are optional. You can specify the proxy connection string without these parameters.

    Field: HTTP proxy
    Description: The HTTP or HTTPS proxy connection string used with the CML workspaces. You must specify this connection string in the form: http(s)://[***USERNAME***]:[***PASSWORD***]@[***HOST***]:[***PORT***].
    The [***USERNAME***] and [***PASSWORD***] parameters are optional. You can specify the proxy connection string without these parameters.

    Field: No proxy
    Description: A comma-separated list of hostnames, IP addresses, or both that should not be accessed through the specified HTTPS or HTTP proxy URLs.
    For ECS deployments, you must include no-proxy entries for the following:

    • All the ECS hosts in your deployment
    • Any CDP Private Cloud Base cluster that you want to access
    • CIDR IP addresses for internal operations in the ECS cluster: 10.42.0.0/16 and 10.43.0.0/16
  5. Click Save.
  6. Ensure that the following endpoints are allowed:
    Table 2. Endpoint details

    Description: Accelerators for ML Projects (AMPs)
    CDP Service: Machine Learning
    Destination: https://raw.githubusercontent.com and https://github.com
    Protocol and authentication: HTTPS
    IP protocol/Port: TCP/443
    Comments: Files for AMPs are hosted on GitHub.

    Additionally, update the proxy server configuration so that its allowlist includes the following URLs:
    • CML workspace URL, for example: ml-samplexxxx.host-1.proxy.kcloud.cloudera.com
    • CDP console URL, for example: consoles.ml-samplexxxx.apps.host-1.proxy.kcloud.cloudera.com
    • External registry if used in CML
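
To reason about which destinations a No proxy value would exempt from the proxy, a minimal Python sketch is shown here. It assumes exact hostname matching and CIDR containment only. Real proxy clients interpret no-proxy lists differently (for example, some match domain suffixes and some ignore CIDR ranges), so treat this as an illustration rather than the behavior of CML. The function name bypasses_proxy and the host ecs-host-1.example.com are hypothetical.

```python
import ipaddress


def bypasses_proxy(host, no_proxy):
    """Return True if host matches an entry in a comma-separated no-proxy value."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        addr = None  # host is a name, not an IP address

    for entry in no_proxy.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if addr is not None:
            try:
                # A bare IP entry is treated as a single-address network.
                if addr in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                pass  # entry is a hostname, so it cannot match an IP address
        elif entry.lower() == host.lower():
            # Assumption: hostnames match only on exact, case-insensitive equality.
            return True
    return False


# Hypothetical ECS host plus the ECS internal CIDR ranges.
no_proxy_value = "ecs-host-1.example.com,10.42.0.0/16,10.43.0.0/16"
print(bypasses_proxy("10.42.3.7", no_proxy_value))
print(bypasses_proxy("github.com", no_proxy_value))
```

Under these assumptions, an address inside 10.42.0.0/16 is exempt from the proxy, while github.com is not and therefore has to be reachable through the proxy allowlist.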
Consider the following example:
Figure 1. NTP Proxy configuration example
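
The connection string form http(s)://[***USERNAME***]:[***PASSWORD***]@[***HOST***]:[***PORT***] from Table 1 can also be assembled programmatically. The following Python sketch is a hedged illustration: the helper name build_proxy_url, the host proxy.example.com, and port 3128 are hypothetical. Percent-encoding special characters in the credentials is an assumption based on general URL syntax, not a documented CML requirement.

```python
from urllib.parse import quote


def build_proxy_url(host, port, scheme="http", username=None, password=None):
    """Compose a proxy connection string; the credentials are optional."""
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    if not 1 <= int(port) <= 65535:
        raise ValueError("port must be between 1 and 65535")

    credentials = ""
    if username:
        # Assumption: reserved characters in credentials are percent-encoded.
        credentials = quote(username, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"{scheme}://{credentials}{host}:{port}"


# Hypothetical proxy host and port, without and with credentials.
print(build_proxy_url("proxy.example.com", 3128))
print(build_proxy_url("proxy.example.com", 3128, "https", "user", "p@ss"))
```

Omitting the username and password yields the shorter form without credentials, which the field descriptions allow.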