Disabling telemetry on existing Cloudera Data Hub clusters
This topic describes how to disable telemetry on Cloudera Data Hub
clusters that were created and registered with telemetry enabled.
Disabling telemetry for Cloudera Data Hub clusters stops the collection of diagnostic data
from the Cloudera Data Hub service in your Cloudera environments.
Verify that you have root SSH access to the Cloudera Manager Server node using a Cloudbreak SSH key pair.
Verify that you have a Cloudera user
account with the Cloudera Manager Full Administrator role.
Disabling workload analytics
If you have enabled workload analytics to send diagnostic information about job and
query execution to Cloudera Observability for Cloudera Data Hub
clusters created in any environment, then you must disable it before disabling
Telemetry Publisher.
For the whole tenant:
From the Cloudera web
interface, navigate to Cloudera Management Console > Global Settings > Telemetry, and turn off the Enable Workload
Analytics option.
For a specific environment only:
During environment creation from the Cloudera web interface, turn
off the Enable Workload Analytics option
under Logs Storage and Audits in the
environment creation wizard.
For an existing environment, from environment details > Telemetry, turn off the Enable Workload
Analytics option.
The environment-level setting overrides the tenant-level setting.
Log in to Cloudera Manager, and verify that you have the Full
Administrator role for the Cloudera Data Hub cluster that
requires disabling telemetry by performing the following actions:
In a terminal, access the Cloudera Manager Server
node with SSH using the Cloudbreak SSH key pair.
Grant the Workload user Full Administrator privileges by running the
following commands, replacing YOUR_CSSO_USER_HERE
with the name of the user who is performing the telemetry setup in Cloudera Manager, known as the Workload user:
# Switch to root
sudo -i
# Retrieve PostgreSQL credentials
export CM_SERVER_DB_FILE=/etc/cloudera-scm-server/db.properties
export CM_DB_HOST=$(awk -F"=" '/db.host/ {print $NF}' ${CM_SERVER_DB_FILE})
export CM_DB_NAME=$(awk -F"=" '/db.name/ {print $NF}' ${CM_SERVER_DB_FILE})
export CM_DB_USER=$(awk -F"=" '/db.user/ {print $NF}' ${CM_SERVER_DB_FILE})
export PGPASSWORD=$(awk -F"=" '/db.password/ {print $NF}' ${CM_SERVER_DB_FILE})
# Open psql
psql -h ${CM_DB_HOST} -U ${CM_DB_USER} -d ${CM_DB_NAME}
# Execute the following query
INSERT INTO user_auth_roles SELECT user_id, auth_role_id FROM users, auth_roles WHERE users.user_name='YOUR_CSSO_USER_HERE' AND auth_roles.name='ROLE_ADMIN';
# To quit psql, type \q
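The awk expressions in the commands above can be tried safely before touching the real database. The following is a minimal sketch that runs the same field-splitting logic against a throwaway file. The sample file, its com.cloudera.cmf.* key names, its values, and the read_db_prop helper are all assumptions made for illustration; they are not taken from this procedure.

```shell
#!/usr/bin/env bash
# Hypothetical sample standing in for /etc/cloudera-scm-server/db.properties.
# The key names and all values below are assumptions for illustration only.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
com.cloudera.cmf.db.type=postgresql
com.cloudera.cmf.db.host=db.example.internal
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm_user
com.cloudera.cmf.db.password=example-password
EOF

# read_db_prop KEY FILE (hypothetical helper): for each line matching KEY,
# split on "=" and print the last field, mirroring the awk calls in the procedure.
read_db_prop() {
  awk -F"=" -v key="$1" '$0 ~ key {print $NF}' "$2"
}

echo "host: $(read_db_prop db.host "$sample")"
echo "name: $(read_db_prop db.name "$sample")"
echo "user: $(read_db_prop db.user "$sample")"
```

Because only the last "="-separated field is printed, a value that itself contains "=" (for example, in a password) would be truncated by this pattern, so it is worth checking the extracted values before opening psql.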
To apply your changes, restart the Cloudera Manager
server by running the following commands:
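The restart commands are not reproduced in this topic. As an assumption, not the documented procedure: on a systemd-based host where the server runs as a service named cloudera-scm-server (the name suggested by the /etc/cloudera-scm-server path used earlier), a restart could be sketched as follows. The restart_cm_server function name is hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: systemd host with a service unit named cloudera-scm-server.
# restart_cm_server is a hypothetical wrapper; run it as root.
restart_cm_server() {
  # Restart the service, then print its status without opening a pager
  systemctl restart cloudera-scm-server &&
    systemctl status cloudera-scm-server --no-pager
}
```

Defining the function does not restart anything; call restart_cm_server from the root shell opened earlier, and confirm the actual service name on your node first.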
From the Cloudera Management Services page, remove the
Telemetry Publisher role on the Cloudera Manager Server node
by performing the following actions:
In Cloudera Manager, select
Clusters and then locate and select
Cloudera Management Service.
From the Status Summary section, select the
Telemetry Publisher role.
From the Actions menu, select Stop
this Telemetry Publisher.
A confirmation message is displayed. Click Stop this
Telemetry Publisher. After the role stops successfully,
click Close.
Telemetry Publisher stops sending the telemetry payload to the
backend.
Remove the Telemetry Publisher configuration by performing the following
actions:
In Cloudera Manager, select
Clusters, locate and select Cloudera Management Service, and then
select the Configuration tab.
Search for the Telemetry Publisher Advanced Configuration
Snippet (Safety Valve) for telemetrypublisher.conf
property and disable the following properties by commenting out each line with a leading #:
#telemetry.upload.job.logs=true
#databus.header.sdx.id=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
#databus.header.sdx.name=YOUR_DATALAKE_NAME
#cluster.type=DATALAKE
#databus.header.environment.crn=[***The Cloud Resource Name (CRN) of the environment.***]
#databus.header.environment.name=[***The name of your Cloudera environment.***]
#databus.header.datalake.crn=[***The Cloud Resource Name (CRN) of the Data Lake in your system.***]
#databus.header.datalake.name=[***The name assigned to the Data Lake. For example, PrimaryDataLake or AnalyticsLake.***]
#databus.header.datahub.crn=[***The Cloud Resource Name (CRN) of the Data Hub. Retrieve it from your Data Hub's setup or Cloudera Management Console.***]
#databus.header.datahub.name=[***The name of the Data Hub instance. For example, DefaultDataHub.***]
#databus.header.cloudprovider.name=[***The name of the cloud provider. This can be AWS or Azure.***]
#databus.header.cloudprovider.region=[***The region where the cloud resources are deployed, such as us-west-2 or us-east-1.***]
Click Save Changes.
Telemetry is disabled on the existing Cloudera Data Hub
cluster.
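The commenting-out step for the safety valve properties can be prepared outside the browser. The following is a minimal sketch, assuming you have copied the current snippet text out of the Cloudera Manager Configuration tab and want to paste a disabled version back. The comment_out_properties function is a hypothetical helper, not part of any Cloudera tooling, and the sample property lines are illustrative.

```shell
#!/usr/bin/env bash
# comment_out_properties (hypothetical helper): read snippet text on stdin and
# prefix "#" to every line that is neither blank nor already commented out.
comment_out_properties() {
  sed -E '/^[[:space:]]*(#|$)/!s/^/#/'
}

# Example: one active property and one that is already disabled
printf '%s\n' \
  'telemetry.upload.job.logs=true' \
  '#cluster.type=DATALAKE' | comment_out_properties
```

Lines that are already commented out pass through unchanged, so running the helper twice on the same text gives the same result.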