Cluster Connectivity Manager
Cloudera can communicate with Data Lake, Cloudera Data Hub clusters, Cloudera data services workload clusters, and on-prem classic clusters that are on private subnets. This communication occurs via the Cluster Connectivity Manager. Cluster Connectivity Manager is available for Cloudera deployments on AWS, Azure, and GCP.
Cluster Connectivity Manager enables the Cloudera Control Plane to communicate with workload clusters that do not expose public IPs. Communication takes place over private IPs without any inbound network access rules required, but Cloudera requires that clusters allow outbound connections to Cloudera Control Plane.
Cluster Connectivity Manager provides enhanced security for communication between customer workload clusters and the Cloudera Control Plane. A Cloudera environment supports public, private, and semi-private networks for running workloads. In a public network, Cloudera Control Plane initiates a connection to the nodes in a workload cluster. In a private or semi-private environment, this option is not available, because the subnet and some of the hosts are private. In such cases, Cluster Connectivity Manager is required to simplify the network configuration in the customer’s subnet.
Cluster Connectivity Manager implements an inverted proxy that initiates communication from the secure, private workload subnet to Cloudera Control Plane. With Cluster Connectivity Manager enabled, the traffic direction is reversed so that the private workload subnet does not require inbound access from Cloudera's network. In this setup, configuring security groups is not as critical as in the public network setup. All communication via Cluster Connectivity Manager is encrypted via TLS v1.2.
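As an illustration of the outbound-only model described above, the following minimal Python sketch checks that a host in the private subnet can open an outbound TLS connection that negotiates at least TLS v1.2. It is not part of Cluster Connectivity Manager; the function names and the host `controlplane.example.com` are hypothetical placeholders, and you would substitute the Cloudera Control Plane endpoint that applies to your deployment.

```python
import socket
import ssl
from typing import Optional


def build_tls_context() -> ssl.SSLContext:
    """Create a client-side TLS context that refuses anything below TLS v1.2."""
    context = ssl.create_default_context()  # verifies server certificate and hostname
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def check_outbound_tls(host: str, port: int = 443, timeout: float = 5.0) -> Optional[str]:
    """Open an outbound TLS connection and return the negotiated protocol version.

    The connection is initiated from inside the subnet, so no inbound
    rule is needed; only outbound access to host:port must be allowed.
    """
    context = build_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.version()


if __name__ == "__main__":
    # Hypothetical placeholder host; replace with your actual endpoint.
    print(check_outbound_tls("controlplane.example.com"))
```

If the call raises a timeout or connection error, outbound access from the subnet to that endpoint is probably blocked by a security group, firewall, or proxy rule.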
From a data security perspective, no data or metadata leaves the workload subnet. The Cluster Connectivity Manager connection is used to send control signals, logs and heartbeats, and communicate the health status of various components with the Cloudera Control Plane.
When deploying environments without public IPs, a mechanism for end users to connect to the Cloudera endpoints should already be established via a Direct Connection, VPN or some other network setup. In the background, the Cloudera Control Plane must also be able to communicate with the entities deployed in your private network.
Cluster Connectivity Manager was initially released as Cluster Connectivity Manager v1 and later Cluster Connectivity Manager v2 was released to replace it. While Cluster Connectivity Manager v1 establishes and uses a tunnel based on the SSH protocol, with Cluster Connectivity Manager v2 the connection is via HTTPS. All new environments created with Cloudera Runtime 7.2.6 or newer use Cluster Connectivity Manager v2. Existing environments and new environments created with Runtime older than 7.2.6 continue to use Cluster Connectivity Manager v1. All newly registered classic clusters use Cluster Connectivity Manager v2, but previously registered classic clusters continue to use Cluster Connectivity Manager v1.
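The version rule for newly created environments can be sketched as a small helper. This is only an illustration of the rule stated above; the function and constant names are hypothetical and do not correspond to any Cloudera API. It covers new environments only, since existing environments keep the version they were created with.

```python
from typing import Tuple

# Runtime version at which new environments start using v2 (from the text above).
CCM_V2_MIN_RUNTIME = (7, 2, 6)


def parse_runtime(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '7.2.6' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def ccm_version_for_new_environment(runtime_version: str) -> str:
    """Return 'v2' for Runtime 7.2.6 or newer, otherwise 'v1'."""
    if parse_runtime(runtime_version) >= CCM_V2_MIN_RUNTIME:
        return "v2"
    return "v1"


if __name__ == "__main__":
    for runtime in ("7.2.2", "7.2.6", "7.2.18"):
        print(runtime, ccm_version_for_new_environment(runtime))
```

Comparing integer tuples rather than raw strings keeps the ordering correct for versions such as 7.2.18, which would sort before 7.2.6 in a plain string comparison.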
The following diagram illustrates connectivity to a customer account without using Cluster Connectivity Manager:
Cluster Connectivity Manager v2
Cluster Connectivity Manager v2 agents deployed on FreeIPA nodes initiate an HTTPS connection to the Cloudera Control Plane. This connection is then used for all communication thereafter. Data Lake and Cloudera Data Hub instances receive connections from the Cloudera Control Plane via the agents deployed onto FreeIPA nodes. This is illustrated in the diagram below.
Cluster Connectivity Manager v2 also supports classic clusters. You can use Cloudera Replication Manager with your on-premises CDH, HDP, and Cloudera Base on premises clusters that are accessible via private IPs to assist with data migration and synchronization to cloud storage. To do so, first register your cluster using classic cluster registration.
When Cluster Connectivity Manager v2 is enabled, the traffic direction is reversed so the environment does not require inbound access from Cloudera’s network. Because inbound traffic is only allowed on the private subnets in this setup, configuring security groups is not as critical as in the public IP mode outlined in the previous diagram. However, in the case of bridged networks, it may be useful to restrict access to a certain range of private IPs.
The following diagram illustrates connectivity to a customer account using Cluster Connectivity Manager v2:
Cluster Connectivity Manager v2 with Token Authentication
By default, communication between the agent and Cloudera Control Plane services uses two-way SSL, that is, client certificate based authentication.
To enable traffic inspection, which can in turn support traffic monitoring and anomaly detection, the communication between the agent and Cloudera Control Plane can optionally be configured to use a combination of TLS (to validate the server) and bespoke validation (to validate the client).
This approach does away with the client certificate based agent authentication on the Cloudera Control Plane side and instead uses request signing and authorization to validate incoming requests from the Cluster Connectivity Manager agent.
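The source does not describe the actual signing scheme, so the following Python sketch only illustrates the general idea of request signing as a replacement for client certificates. It swaps in a generic HMAC-SHA256 scheme with a shared secret and a timestamp check; the function names, the signed fields, and the five-minute skew window are all assumptions made for the example, not Cloudera's implementation.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, timestamp: int, body: bytes) -> str:
    """Compute a hex signature over the method, path, timestamp, and body hash."""
    message = b"\n".join([
        method.upper().encode(),
        path.encode(),
        str(timestamp).encode(),
        hashlib.sha256(body).hexdigest().encode(),
    ])
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, path: str, timestamp: int,
                   body: bytes, signature: str, now: int,
                   max_skew_seconds: int = 300) -> bool:
    """Server-side check: reject stale requests, then compare signatures."""
    if abs(now - timestamp) > max_skew_seconds:
        return False  # too old or too far in the future; limits replay
    expected = sign_request(secret, method, path, timestamp, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    secret = b"example-shared-secret"  # hypothetical value for the demo
    sig = sign_request(secret, "POST", "/heartbeat", 1000, b"{}")
    print(verify_request(secret, "POST", "/heartbeat", 1000, b"{}", sig, now=1010))
```

Because the client is authenticated by the signature rather than by a client certificate in the TLS handshake, a TLS-terminating device between the agent and the server can inspect the traffic without breaking client authentication, which is the motivation given above.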
Cluster Connectivity Manager v1
The diagram below illustrates Cloudera connectivity to a customer account with Cluster Connectivity Manager v1 enabled. Cluster Connectivity Manager v1 agents are deployed not only on the FreeIPA cluster (as in Cluster Connectivity Manager v2), but also on the Data Lake and Cloudera Data Hub. While Cluster Connectivity Manager v2 establishes a connection via HTTPS, Cluster Connectivity Manager v1 uses a tunnel based on the SSH protocol. Workload clusters initiate an SSH tunnel to the Cloudera Control Plane, which is then used for all communication thereafter.
Supported services
The following Cloudera services are supported by Cluster Connectivity Manager:
Cluster Connectivity Manager v2
Supports environments with Cloudera Runtime 7.2.6+
Cloudera service | AWS | Azure | GCP |
---|---|---|---|
Data Lake | GA | GA | GA |
FreeIPA | GA | GA | GA |
Cloudera Data Engineering | Preview | Preview | |
Cloudera Data Hub | GA | GA | GA |
Cloudera Data Warehouse | GA | GA | |
Cloudera DataFlow | GA | GA | |
Cloudera AI | GA | GA | |
Cloudera Operational Database | GA | GA | GA |
Cluster Connectivity Manager v1
Supports environments with Cloudera Runtime <7.2.6 and environments created prior to Cluster Connectivity Manager v2 GA.
Cloudera service | AWS | Azure | GCP |
---|---|---|---|
Data Lake | GA | GA | GA |
FreeIPA | GA | GA | GA |
Cloudera Data Engineering | | | |
Cloudera Data Hub | GA | GA | GA |
Cloudera Data Warehouse | | | |
Cloudera DataFlow | | | |
Cloudera AI | | | |
Cloudera Operational Database | | | |
To learn more about Cluster Connectivity Manager, refer to the following documentation: