CDP security FAQs

This topic covers frequently asked questions related to the communication between the CDP Control Plane and workload subnets.

Data security within workload subnets and virtual networks

CDP provides comprehensive security features to ensure that minimal network configuration is needed on workload clusters and that only metrics, logs, and control signals go in and out of the customer's network. No data from the workload clusters is ever accessed by the CDP Control Plane.

To operate at the highest level of security, Cloudera recommends running CDP workload clusters in private subnets. This means that nodes in the workload clusters do not have public IP addresses and all outbound traffic to the internet goes through a gateway or a firewall. This allows your security operations team to ensure that hosts that are allowed to communicate with the clusters are legitimate and pose no security threat to the assets. A common best practice is to avoid inbound connections of any sort directly into the workload subnet.

Cloudera recommends that no inbound connections be allowed into the private network and that you use your sec-ops approved method(s) to provide outbound access to the list of hosts specified in the AWS Azure GCP outbound network access destinations.

Communication between CDP and workload subnets

Customers use CDP Management Console to operate and manage workload clusters running in their own VPCs. For every operation performed through the Management Console on the workload clusters, a control signal is sent to the hosts in the private network. This is achieved through a feature called Cluster Connectivity Manager (CCM).

CCM eliminates the need to configure inbound connections into your secure private network. CCM, which is set up during cluster creation, configures a reverse tunnel from the customer’s private network to the CDP Control Plane. All control signals to create/delete clusters, stop/start environments and other management actions related to workload clusters go through the CCM tunnel. CCM allows customers to avoid configuring any inbound connections/routes to their private workload subnets, thus providing a better security posture for their public cloud assets.

More information on CCM is available in the Cluster Connectivity Manager documentation.

Egress network setup for better control and visibility on outbound traffic

Large security-conscious enterprises typically inspect all traffic going in and out of their private networks. This is generally achieved in the following way:

  1. Identify a single egress virtual network that is used to provide internet access to all other subnets and virtual networks. Route outbound (Internet) traffic from all other subnets to this egress network.

  2. Purpose-built technologies, such as web proxies, next-generation firewalls and cloud access security brokers are deployed in the egress VPC to monitor for anomalous outbound behavior. Network analyzers and forensic recorders can also be used.

Cloudera recommends a similar topology to configure outbound traffic rules that are needed for normal operation of the clusters in a private network. The alternative is to set up egress rules within the same VPC that hosts the private subnet.

Information exchanged between CDP and workload subnets

Irrespective of the security posture used by the customer to secure their assets in the workload subnet, a minimum set of communications is needed for normal operations of a CDP environment. The following set of points summarize the data exchanged between CDP Control Plane and workload subnets:

  1. All user actions on the CDP Control Plane that are related to interacting with the workload clusters in the workload subnets happen via CCM.

  2. Metering and billing information is sent at regular intervals from the customer’s workload subnet to the Cloudera Databus as specified in the AWS Azure GCP outbound network access destinations.

  3. If used, an optional component called Cloudera Observability also leverages the Cloudera Databus to communicate with the Control Plane.

  4. Diagnostic bundles for troubleshooting are generally sent by customers to engage in support cases. The diagnostic bundles can either be sent on demand or on a scheduled basis to Cloudera Customer Support. This feature also uses the Cloudera Databus channel. Refer to Sending Usage and Diagnostic Data to Cloudera page to read more about diagnostic bundles.

  5. Customers generally share logs with Customer Support in a self-service fashion through CDP Control Plane. No customer data is ever shared in the diagnostics bundle. To ensure sensitive data does not show up inadvertently in the logs, we recommend that customers follow the directions specified in Redaction of Sensitive Information from Diagnostic Bundles.