Configure Data Anonymization Rules
Anonymization rules define regular expressions used to anonymize sensitive data (such as IP addresses and domain names). Each rule uses JSON format to define what to match and the value to replace it with.
Note: Anonymization rule formats vary between SmartSense versions. Make sure that you consult the documentation that matches your SmartSense version.
To define regular expression based rules, refer to the following sample:
{ "name":"ip_address", "path":null, "pattern": "[ :\\/]?[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}[ :\\/]?", "extract": "[ :\\/]?([0-9\\.]+)[ :\\/]?", "shared": true }
Key reference:
name
- The rule name.
path
- An optional regular expression matching the paths of files to which this rule applies. The default is null, which means all files.
pattern
- A regular expression that defines the pattern to match within the file.
extract
- An optional regular expression that extracts the data from the matched pattern. Each extract is marked as a regular expression group.
shared
- Flag that indicates which key (shared or private) to use for masking. If the shared key is used, the Hortonworks support team can unmask the data if needed for diagnostic purposes, for example hostnames and IP addresses when resolving issues on specific hosts or communication between hosts. Unmasked data is not stored in Hortonworks repositories; it is discarded as soon as the analysis finishes.
value
- An optional constant replacement value. The value chosen should not be matchable by the pattern specified above. For example, if the pattern is '.*dfs.datanode.*', the value should not contain 'dfs.datanode'. If value is specified, the shared flag is ignored.
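As an illustrative sketch only, a pattern-based rule that uses a constant value instead of key-based masking might look like the following. The rule name, the pattern, and the replacement string are hypothetical and are not taken from the SmartSense default rules; only the keys come from the reference above.

```json
{
  "name": "email_address",
  "path": null,
  "pattern": "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}",
  "value": "EMAIL_REDACTED"
}
```

The replacement EMAIL_REDACTED contains no @ character, so the pattern cannot match it again. Because value is set, a shared flag on this rule would be ignored.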
To define property-based rules, refer to the following example:
{ "name":"delete_oozie_jdbc_password", "path":"oozie-site.xml", "property": "oozie.service.JPAService.jdbc.password", "operation":"DELETE", "shared": false }
name
- The rule name.
path
- A regular expression matching the paths of files to which this rule applies.
property
- The name of a specific property within the matching files.
operation
- Either DELETE or REPLACE. The default is REPLACE. If DELETE is specified, the property is removed from the config file; if REPLACE is specified, the property value is replaced by either the constant value or a masked value.
value
- An optional value for the REPLACE operation. If not specified, the private or shared key is used to mask the data being replaced.
enabled
- Flag to enable or disable the rule definition. The default is true.
excludes
- A set of path patterns to be excluded by the rule. For example: "excludes": ["oozie-site.xml", "core-site.xml"]
shared
- Flag to allow anonymized data to be reversed by Hortonworks. If shared is true, the anonymized data is reversible by Hortonworks; if false, the data cannot be reversed.
Note: Rules configured with shared = false cannot be unmasked by Hortonworks, which in some cases may become a roadblock for support case analysis.
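For comparison with the DELETE example, a property-based rule that uses the REPLACE operation with a constant value might be sketched as follows. The rule name, path pattern, property, and replacement value are assumptions chosen for illustration; only the keys come from the reference above, and the exact matching behavior of path and excludes should be checked against the documentation for your SmartSense version.

```json
{
  "name": "replace_connection_password",
  "path": ".*-site.xml",
  "property": "javax.jdo.option.ConnectionPassword",
  "operation": "REPLACE",
  "value": "Hidden",
  "enabled": true,
  "excludes": ["oozie-site.xml"]
}
```

Because a constant value is supplied, the property is replaced with that value rather than being masked with the private or shared key, so the original value cannot be recovered.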