Cloudera Observability deployment architecture for CDP Public Cloud

Describes the components and architecture of a basic Cloudera Observability environment deployed in CDP Public Cloud.

A Cloudera Observability environment for CDP Public Cloud comprises the following:
  • Cloudera Environment, which is a secure and governed cloud service platform. The Cloudera Observability component services run in the Control Plane of the Cloudera Observability framework. Users access the Cloudera Observability web user interface from the web host server in this framework.
  • Working Environment, which contains your Workload Clusters in your Workload environments, such as Production, Development, and Staging.
  • Workload Cluster, which is one or more CDP clusters managed by Cloudera. Depending on your environment's Cloudera data service, Telemetry Publisher is associated with a cluster, virtual warehouse, or virtual cluster in a data service by either Cloudera Manager for a Data Hub service or by Databus WXM Client for CDE and CDW services.

The below Cloudera Observability Architecture for Public Cloud diagram shows the communication between Cloudera Observability and your workload clusters through either Telemetry Publisher or Databus WXM Client. Where, the Cloudera Observability service, including its main component services, run in the Cloudera Control Plane and the area on the right is your Working Environment that contains the clusters and services required to run your workload jobs and queries.

Cloudera Management Console (not shown) manages the clusters and services in each of your working environments.

Telemetry Publisher and Databus WXM Client collect and send diagnostic information about jobs and queries from your Workload Clusters to Cloudera Observability. To ensure that all data transfers are secure between your Workload Clusters and Cloudera Observability, Telemetry Publisher and Databus WXM Client communicate with Cloudera Observability’s endpoints using Transport Layer Security (TLS), as follows:

  1. When a job or query is completed, the Telemetry Publisher or Databus WXM Client connects to the Cloudera Observability service and asks to upload the workload diagnostic information. Once the request is authorized and verified, Cloudera Observability replies with a signed S3 URL that can then be used to upload the workload diagnostic information by Telemetry Publisher or Databus WXM Client.
  2. When the URL is received, Telemetry Publisher and Databus WXM Client perform a secure and direct protocol test using the Cloudera Observability S3 URL, before sending any diagnostic data.
Figure 1. Cloudera Observability Architecture for Public Cloud

The following diagram shows the services from which Telemetry Publisher collects diagnostic metrics in a Data Hub environment:
Figure 2. Cloudera Data Hub Environment

The following diagrams show the services from which Databus WXM Client (formally named Databus Producer) collects diagnostic metrics in a Cloudera Data Warehouse (CDW) and a Cloudera Data Engineering (CDE) working environment. Where, the Databus WXM Client continually communicates and checks for recently completed jobs and queries to see if there is any diagnostic data to transfer, such as task and event logs and job and query history files, with the Hive DDL History, LLAP History, and Impala History pollers (HiveDDLHistoryPoller, LlapHistoryPoller, and ImpalaHistoryPoller):
  • Figure 3. CDW Environment

  • Figure 4. CDE Environment