Fields used for defining anonymization rules
To define anonymization rules, use the following fields:
Field | Description |
---|---|
name | Provides a descriptive name for data anonymized by the rule. It has to be unique across all rules. |
description | Provides a description for the rule. |
rule_id | Defines the class of rules the current rule belongs to. The supported rule IDs are: PATTERN, PROPERTY, XPATH, JSONPATH. This parameter is case-insensitive. |
patterns |
Defines a list of data patterns to be anonymized. It is applicable only to Pattern rules, where rule_id=PATTERN. These patterns are matched in a case-insensitive manner, which means that
the following pattern
|
extract |
Specifies a pattern to extract data matched through the list of patterns. The extract pattern is matched in a case-insensitive manner. For example, in order to anonymize the oozie.https.keystore.pass password, the following pattern and extract values are used:
This pattern is matched with values such as
The extract pattern is used to extract and anonymize only the values after
the If the extract pattern is not configured, the entire value matched with
the pattern is anonymized (which in this example is
|
properties | Specifies a list of property name patterns to anonymize; these are case-insensitively matched. It is applicable only to Property rules. |
parentNode | Applies to property anonymization in XML files. It defines the parent node of the property that you want to anonymize. By default, parentNode is set to property, which covers content such as <property> <name>fs.s3a.proxy.password</name> <value>Abc7j*4$aTh</value> <description>Password for authenticating with proxy server.</description> </property>. To anonymize a property under a different parent node, for example <param> <name>main.ldapRealm.contextFactory.systemPassword</name> <value>pass</value> </param>, configure that node as parentNode. The rule for the above content sets "parentNode": "param": { "name": "KNOX LDAP Password", "rule_id": "Property", "properties": ["main.ldapRealm.contextFactory.systemPassword"], "include_files": ["topologies/*.xml"], "action": "REPLACE", "parentNode": "param", "replace_value": "Hidden" } |
action | The supported actions are: ANONYMIZE, DELETE, REPLACE. The action value is case-insensitive, so Anonymize and delete are also accepted. ANONYMIZE encrypts the data using the key indicated by the shared flag, DELETE deletes the data, and REPLACE replaces the data with a predefined value, which can be customized using replace_value. |
replace_value | This field is used by the REPLACE action to specify a replacement for the data to anonymize. The default value is Hidden. |
shared | Indicates which key to use for anonymization (shared or private). This value is used when the action is set to ANONYMIZE. It is a boolean property (true/false). If set to true, the Hortonworks support team can unmask data if needed for diagnostic purposes; for example, host names and IP addresses when resolving issues on specific hosts or communication between hosts. Note that unmasked data is not stored in Hortonworks repositories; it is discarded as soon as the analysis finishes. The default value is true. Rules configured with |
include_files | Specifies a list of glob file patterns to which the rule applies. If not configured, the rule is applicable to all files. |
exclude_files | Specifies a list of glob file patterns which are excluded from anonymization. If not configured, no file is excluded from the rule application. |
enabled | A flag (true/false) that specifies whether the rule is enabled. By default, it is set to true. |
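
To make the interaction between these fields more concrete, here is a minimal Python sketch of how a Pattern rule might be evaluated. It is not the tool's implementation. The function names `applies_to` and `apply_pattern_rule`, the example rule, and the property name `example.secret` are all hypothetical. The sketch also assumes that glob matching behaves like Python's `fnmatch`, that the first capture group of `extract` marks the portion to anonymize, and it substitutes a placeholder for ANONYMIZE because the encryption itself is not described here.

```python
import fnmatch
import re

# Defaults as stated in the field table.
DEFAULTS = {"enabled": True, "replace_value": "Hidden", "shared": True}


def applies_to(rule, path):
    """Return True if an enabled rule covers `path`, based on
    include_files / exclude_files (assumed fnmatch-style globs)."""
    rule = {**DEFAULTS, **rule}
    if not rule["enabled"]:
        return False
    include = rule.get("include_files")
    exclude = rule.get("exclude_files") or []
    # No include_files configured: the rule applies to all files.
    included = not include or any(fnmatch.fnmatchcase(path, g) for g in include)
    excluded = any(fnmatch.fnmatchcase(path, g) for g in exclude)
    return included and not excluded


def apply_pattern_rule(rule, text):
    """Apply a Pattern rule to `text` and return the transformed text."""
    rule = {**DEFAULTS, **rule}
    action = rule["action"].upper()  # action is case-insensitive
    extract = rule.get("extract")

    def transform(value):
        if action == "DELETE":
            return ""
        if action == "REPLACE":
            return rule["replace_value"]
        # ANONYMIZE: placeholder only; the real tool encrypts with the
        # shared or private key selected by the `shared` flag.
        return "<anonymized>"

    def on_match(m):
        matched = m.group(0)
        if not extract:
            # Without extract, the entire matched value is anonymized.
            return transform(matched)
        e = re.search(extract, matched, flags=re.IGNORECASE)
        if e is None:
            return transform(matched)  # assumption: fall back to whole match
        # Assumption: group 1 of extract (or the whole extract match if it
        # has no groups) is the part to anonymize.
        start, end = e.span(1 if e.re.groups else 0)
        return matched[:start] + transform(matched[start:end]) + matched[end:]

    for pattern in rule.get("patterns", []):
        text = re.sub(pattern, on_match, text, flags=re.IGNORECASE)
    return text


# Hypothetical rule, for illustration only.
example_rule = {
    "name": "Example secret",
    "rule_id": "Pattern",
    "patterns": [r"example\.secret\s*=\s*\S+"],
    "extract": r"=\s*(\S+)",
    "include_files": ["*.properties"],
    "action": "replace",
}

if applies_to(example_rule, "conf/app.properties"):
    print(apply_pattern_rule(example_rule, "Example.Secret = hunter2"))
    # prints: Example.Secret = Hidden
```

Note that `fnmatch` lets `*` match path separators, which may differ from the glob semantics of the actual tool; treat the file-scoping part as an approximation.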