
Defining anonymization rules for Cloudera logs

Cloudera includes a set of default anonymization rules and allows you to define custom anonymization rules in order to remove sensitive information from Cloudera logs.

Use PCRE convention for writing custom anonymization rule patterns.

Anonymization rules are applied to the following logs:

Cloudera includes a set of default anonymization rules that anonymize the following:

Anonymization rule (PCRE): \b([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-\._]*[A-Za-z0-9])@(([A-Za-z0-9]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])\.)+([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\b
Replacement: email@redacted.host
Description: Email addresses

Anonymization rule (PCRE): \d{4}[^\w]\d{4}[^\w]\d{4}[^\w]\d{4}
Replacement: XXXX-XXXX-XXXX-XXXX
Description: Credit card numbers

Anonymization rule (PCRE): \d{3}[^\w]\d{2}[^\w]\d{4}
Replacement: XXX-XX-XXXX
Description: SSN

Anonymization rule (PCRE): FPW\:\s+[\w|\W].*
Replacement: FPW: [REDACTED]
Description: FreeIPA (workload) password

Anonymization rule (PCRE): cdpHashedPassword=.*[']
Replacement: [CDP PWD ATTRS REDACTED]
Description: Hashed FreeIPA (workload) password
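
To see what the default rules do to a log line, the following minimal sketch applies them in the listed order. It uses Python's built-in `re` module as a stand-in for a PCRE engine, which is an assumption: the two engines differ in some advanced features, although these particular patterns only use syntax that both accept. The `anonymize` function and the sample strings are hypothetical and are not part of Cloudera.

```python
import re

# Default rules copied from the list above: (pattern, replacement, description).
# Assumption: Python's `re` engine approximates PCRE closely enough for these patterns.
DEFAULT_RULES = [
    (r"\b([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-\._]*[A-Za-z0-9])@"
     r"(([A-Za-z0-9]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])\.)+"
     r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])\b",
     "email@redacted.host", "Email addresses"),
    (r"\d{4}[^\w]\d{4}[^\w]\d{4}[^\w]\d{4}", "XXXX-XXXX-XXXX-XXXX", "Credit card numbers"),
    (r"\d{3}[^\w]\d{2}[^\w]\d{4}", "XXX-XX-XXXX", "SSN"),
    (r"FPW\:\s+[\w|\W].*", "FPW: [REDACTED]", "FreeIPA (workload) password"),
    (r"cdpHashedPassword=.*[']", "[CDP PWD ATTRS REDACTED]", "Hashed FreeIPA (workload) password"),
]


def anonymize(text, rules=DEFAULT_RULES):
    """Apply each rule in order and return the anonymized text (hypothetical helper)."""
    for pattern, replacement, _description in rules:
        # A function replacement keeps the replacement string literal,
        # so no backslash or group-reference processing is applied to it.
        text = re.sub(pattern, lambda m, r=replacement: r, text)
    return text


print(anonymize("contact jane.doe@example.com now"))  # contact email@redacted.host now
print(anonymize("card 4111 1111 1111 1111 ok"))       # card XXXX-XXXX-XXXX-XXXX ok
print(anonymize("ssn 123-45-6789"))                   # ssn XXX-XX-XXXX
print(anonymize("FPW: hunter2"))                      # FPW: [REDACTED]
```

Note that the `FPW` pattern ends in `.*`, so it redacts everything up to the end of the line, not only the password token.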

Write each anonymization rule pattern using PCRE syntax, and choose a replacement string for each pattern.

You can define custom anonymization rules in Cloudera. The anonymization rules are only applied to environments created after the rules were added in Cloudera.

Required role: PowerUser

Steps

  1. Once you have created the rules, navigate to Cloudera web interface > Cloudera Management Console > Global Settings > Telemetry > Anonymization rules.

  2. Default rules are pre-populated.

  3. Click on New rule and add a pattern and replacement string for your rule. Repeat for multiple rules.

  4. Test the rules from the same page on the UI under Test rules:
    1. Under Input text, paste sample text containing sensitive content that the rules you added should anonymize.
    2. Click Test all rules.
    3. Verify that the sensitive content is removed and replaced in the output shown in the Anonymized result text box.
  5. Click Save Changes.
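
Before entering custom rules in the UI, it can help to draft them locally. The sketch below loosely mirrors the Test rules step: it applies each rule in order and reports how many matches each pattern produced. Both rules (an IPv4 address pattern and an `api_key=` pattern), the `check_rules` helper, and the sample text are hypothetical examples, not Cloudera defaults. Python's `re` module is used as an approximation of PCRE, so the Test rules panel in the UI remains the authoritative check.

```python
import re

# Hypothetical custom rules as (pattern, replacement) pairs; not Cloudera defaults.
CUSTOM_RULES = [
    (r"\b\d{1,3}(\.\d{1,3}){3}\b", "[IP REDACTED]"),
    (r"(?i)api[_-]?key=\S+", "api_key=[REDACTED]"),
]


def check_rules(rules, text):
    """Apply each rule in order; return the anonymized text and per-rule match counts."""
    counts = []
    for pattern, replacement in rules:
        compiled = re.compile(pattern)  # raises re.error if the pattern is malformed
        # subn returns the new text and the number of substitutions made.
        text, n = compiled.subn(lambda m, r=replacement: r, text)
        counts.append((pattern, n))
    return text, counts


sample = "Connecting to 10.0.12.7 with api_key=abc123 for tenant 42"
result, counts = check_rules(CUSTOM_RULES, sample)
print(result)
for pattern, n in counts:
    print(f"{n} match(es): {pattern}")
```

A rule that reports zero matches on text it was written for is a sign that the pattern needs adjusting before you save it.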