Cluster deployment logs

Cluster deployment logs gather the same diagnostic (system and service) logs that are collected into S3 or ABFS, but CDP sends the logs to Cloudera engineering and support for troubleshooting purposes. You can apply configurable redaction rules on any sensitive data.

Cluster deployment logs are disabled by default. You can change this setting during environment creation or after environment creation, though CDP will only collect logs for new deployments. You can also set a default behavior at the CDP account level, so that you won’t have to enable the setting every time in the wizard. The logs are only gathered during deployment, which is approximately the first 20 minutes of cluster creation.

You can configure anonymization rules for log collection at the CDP account level. The rules are a list of rule objects with two fields: a regex pattern (PCRE) as the value, and a replacement string. Configure these rules to hide any sensitive data. By default, the rules hide email addresses and card numbers. This default behavior is the same as Cloudera Manager diagnostics collection.

The anonymization options are available under the Shared Resources tab in the Environments interface:

Currently, cluster deployment logs are collected only for Data Lake and Data Hub cluster shapes.