Cluster Connectivity Manager
Cloudera can communicate with Data Lake, Cloudera Data Hub clusters, Cloudera data services workload clusters, and on-prem classic clusters that are on private subnets. This communication occurs via the Cluster Connectivity Manager. Cluster Connectivity Manager is available for Cloudera deployments on AWS, Azure, and GCP.
Cluster Connectivity Manager enables the Cloudera Control Plane to communicate with workload clusters that do not expose public IPs. Communication takes place over private IPs without any inbound network access rules required, but Cloudera requires that clusters allow outbound connections to Cloudera Control Plane.
Cluster Connectivity Manager provides enhanced security for communication between customer workload clusters and the Cloudera Control Plane. A Cloudera environment supports public, private, and semi-private networks for running workloads. In a public network, Cloudera Control Plane initiates a connection to the nodes in a workload cluster; however, in a private or semi-private environment, this option is not available because the subnet and some of the hosts are private. In such cases, Cluster Connectivity Manager is required to simplify the network configuration in the customer’s subnet.
Cluster Connectivity Manager implements an inverted proxy that initiates communication from the secure, private workload subnet to Cloudera Control Plane. With Cluster Connectivity Manager enabled, the traffic direction is reversed so that the private workload subnet does not require inbound access from Cloudera's network. In this setup, configuring security groups is not as critical as in the public network setup. All communication via Cluster Connectivity Manager is encrypted via TLS v1.2.
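The reversed traffic direction described above can be sketched with plain sockets. This is a minimal illustration only, not the Cluster Connectivity Manager implementation: the names `run_agent` and `demo`, the line-based command format, and the loopback address are all hypothetical, and TLS is omitted to keep the example short. The point it shows is that only the control plane side listens, while the agent in the private subnet dials out and then answers requests over the connection it opened.

```python
import socket
import threading


def run_agent(control_plane_addr, handle_command):
    """Hypothetical agent: dials OUT, then serves commands on that same socket."""
    with socket.create_connection(control_plane_addr) as conn, \
            conn.makefile("r", encoding="utf-8", newline="\n") as reader:
        for line in reader:
            command = line.rstrip("\n")
            if command == "close":
                break
            # Replies travel back over the outbound connection.
            conn.sendall((handle_command(command) + "\n").encode("utf-8"))


def demo():
    """Stand-in control plane: the only side that accepts inbound connections."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # loopback and ephemeral port, for the demo only
    listener.listen(1)
    addr = listener.getsockname()

    agent = threading.Thread(target=run_agent, args=(addr, lambda c: "ack:" + c))
    agent.start()

    conn, _ = listener.accept()  # the agent connected to us, not the reverse
    replies = []
    with conn, conn.makefile("r", encoding="utf-8", newline="\n") as reader:
        for command in ("heartbeat", "status"):
            conn.sendall((command + "\n").encode("utf-8"))
            replies.append(reader.readline().rstrip("\n"))
        conn.sendall(b"close\n")
    agent.join(timeout=5)
    listener.close()
    return replies
```

Calling `demo()` runs both sides on the local machine. In a real deployment the agent side would need only an outbound rule toward the control plane, which is why no inbound rule on the workload subnet is involved.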
From a data security perspective, no data or metadata leaves the workload subnet. The Cluster Connectivity Manager connection is used to send control signals, logs and heartbeats, and communicate the health status of various components with the Cloudera Control Plane.
When deploying environments without public IPs, a mechanism for end users to connect to the Cloudera endpoints should already be established via a Direct Connection, VPN or some other network setup. In the background, the Cloudera Control Plane must also be able to communicate with the entities deployed in your private network.
The following diagram illustrates connectivity to a customer account without using Cluster Connectivity Manager:
Cluster Connectivity Manager v2
Cluster Connectivity Manager v2 agents deployed on FreeIPA nodes initiate an HTTPS connection to the Cloudera Control Plane. This connection is then used for all communication thereafter. Data Lake and Cloudera Data Hub instances receive connections from the Cloudera Control Plane via the agents deployed onto FreeIPA nodes. This is illustrated in the diagram below.
Cluster Connectivity Manager v2 also supports classic clusters. You can use Cloudera Replication Manager with your on-premises CDH, HDP, and Cloudera Base on premises clusters that are accessible via private IPs to assist with data migration and synchronization to cloud storage. To do so, first register your cluster using classic cluster registration.
When Cluster Connectivity Manager v2 is enabled, the traffic direction is reversed so the environment does not require inbound access from Cloudera’s network. Because inbound traffic in this setup is only allowed on the private subnets, configuring security groups is not as critical as in the public IP mode outlined in the previous diagram; however, in the case of bridged networks, it may be useful to restrict access to a certain range of private IPs.
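As an illustration of restricting access to a range of private IPs, the following sketch checks a source address against an allow list of CIDR blocks using Python's standard `ipaddress` module. The function name `is_allowed` and the example range `10.10.0.0/16` are hypothetical; in practice such a restriction would be expressed in the cloud provider's security group or firewall rules rather than in application code.

```python
import ipaddress


def is_allowed(source_ip, allowed_cidrs):
    """Return True if source_ip falls inside any of the allowed CIDR ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# Hypothetical private range for a bridged network.
ALLOWED_CIDRS = ["10.10.0.0/16"]

inside = is_allowed("10.10.4.7", ALLOWED_CIDRS)
outside = is_allowed("10.20.0.5", ALLOWED_CIDRS)
```

Here `10.10.4.7` lies inside the allowed range and `10.20.0.5` does not, even though both are private addresses.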
The following diagram illustrates connectivity to a customer account using Cluster Connectivity Manager v2:
Cluster Connectivity Manager v2 with Token Authentication
Currently, communication between the agent and Cloudera Control Plane services uses two-way SSL, that is, a client certificate based authentication mechanism.
To enable traffic inspection, which in turn can support traffic monitoring and anomaly detection, the communication between the agent and Cloudera Control Plane can optionally be configured to use a combination of TLS (to validate the server) and bespoke validation (to validate the client).
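For readers unfamiliar with two-way SSL, the sketch below shows how a client-side TLS context of this kind could be built with Python's standard `ssl` module. It is an assumption-laden illustration, not the agent's actual configuration: the function name `build_agent_tls_context` and its parameters are hypothetical, and the certificate paths are placeholders supplied by the caller. The context verifies the server certificate, refuses anything older than TLS 1.2, and presents a client certificate when one is given.

```python
import ssl


def build_agent_tls_context(client_cert=None, client_key=None):
    """Hypothetical helper: client-side context for two-way (mutual) TLS."""
    # Verifies the server certificate and hostname by default.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # Refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # With a client certificate loaded, the server can authenticate the client too.
    if client_cert is not None:
        ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return ctx


# Without certificate paths, the context only validates the server.
server_only_ctx = build_agent_tls_context()
```

Dropping the `load_cert_chain` call is what leaves the client unauthenticated at the TLS layer, so some other mechanism has to validate it.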
This approach does away with the client certificate based agent authentication on the Cloudera Control Plane side and instead uses request signing and authorization to validate incoming requests from the Cluster Connectivity Manager agent.
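The source does not describe how requests are signed, so the following is only a generic sketch of request signing, assuming a shared secret and HMAC-SHA256. The function names, the signed fields (method, path, timestamp, body hash), and the five-minute clock skew window are all hypothetical choices for illustration; the actual Cluster Connectivity Manager scheme may differ, for example by using asymmetric keys.

```python
import hashlib
import hmac


def sign_request(secret, method, path, body, timestamp):
    """Sign the request fields with HMAC-SHA256 (assumed scheme, not Cloudera's)."""
    body_hash = hashlib.sha256(body).hexdigest()
    message = "\n".join([method, path, str(timestamp), body_hash]).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret, method, path, body, timestamp, signature, now, max_skew=300):
    """Server-side check: reject stale timestamps, then compare signatures."""
    if abs(now - timestamp) > max_skew:
        return False
    expected = sign_request(secret, method, path, body, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


# Hypothetical values for illustration.
SECRET = b"example-shared-secret"
sig = sign_request(SECRET, "POST", "/heartbeat", b'{"ok": true}', 1_700_000_000)
accepted = verify_request(SECRET, "POST", "/heartbeat", b'{"ok": true}',
                          1_700_000_000, sig, now=1_700_000_010)
tampered = verify_request(SECRET, "POST", "/heartbeat", b'{"ok": false}',
                          1_700_000_000, sig, now=1_700_000_010)
```

Because the signature travels in the request rather than in the TLS handshake, a scheme like this leaves the TLS layer server-authenticated only while the client is still validated per request.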
Supported services
The following Cloudera services are supported by Cluster Connectivity Manager:
Cluster Connectivity Manager v2
Supports environments with Cloudera Runtime 7.2.6+
| Cloudera service | AWS | Azure | GCP |
|---|---|---|---|
| Data Lake | GA | GA | GA |
| FreeIPA | GA | GA | GA |
| Cloudera Data Engineering | GA | GA | |
| Cloudera Data Hub | GA | GA | GA |
| Cloudera Data Warehouse | GA | GA | |
| Cloudera DataFlow | GA | GA | |
| Cloudera AI | GA | GA | |
| Cloudera Operational Database | GA | GA | GA |
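For scripted checks, the matrix above can be encoded as a simple lookup. This is a convenience sketch with hypothetical names (`GA_PROVIDERS`, `is_listed_ga`, `runtime_supported`); it records only the cells marked GA in the table and treats blank cells as "not listed" rather than asserting anything further about them. The runtime check compares version components numerically, since a plain string comparison would rank `7.2.17` below `7.2.6`.

```python
# Providers listed as GA for each service, copied from the table above.
GA_PROVIDERS = {
    "Data Lake": {"AWS", "Azure", "GCP"},
    "FreeIPA": {"AWS", "Azure", "GCP"},
    "Cloudera Data Engineering": {"AWS", "Azure"},
    "Cloudera Data Hub": {"AWS", "Azure", "GCP"},
    "Cloudera Data Warehouse": {"AWS", "Azure"},
    "Cloudera DataFlow": {"AWS", "Azure"},
    "Cloudera AI": {"AWS", "Azure"},
    "Cloudera Operational Database": {"AWS", "Azure", "GCP"},
}

MIN_RUNTIME = (7, 2, 6)  # "Cloudera Runtime 7.2.6+"


def is_listed_ga(service, provider):
    """True if the table lists the service as GA on the given provider."""
    return provider in GA_PROVIDERS.get(service, set())


def runtime_supported(version):
    """Compare dotted numeric versions component by component."""
    return tuple(int(part) for part in version.split(".")) >= MIN_RUNTIME
```

For example, `is_listed_ga("Cloudera DataFlow", "GCP")` is `False` because that cell is blank in the table, and `runtime_supported("7.2.17")` is `True`.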
