Dynamic tag-based column masking in Hive with Ranger policies
A Ranger resource-based masking policy for Hive anonymizes data in a Hive column identified by its database, table, and column name. A tag-based masking policy instead anonymizes Hive column data based on the tags and tag attribute values associated with the column (usually specified as metadata classifications in Atlas).
The following conditions apply when using Ranger column masking policies to mask data returned in Hive query results:
- A variety of masking types are available, such as show last 4 characters, show first 4 characters, Hash, Nullify, and date masks (show only year).
- You can specify a masking type for specific users, groups, and conditions.
- Wildcard matching is not supported.
- If multiple tag masking policies apply to the same Hive column, the masking policy with the lexicographically smallest policy name is chosen for enforcement. For example, policy "a" takes precedence over policy "aa".
- Masks are evaluated in the order listed in the policy.
- An audit log entry is generated each time a masking policy is applied to a column.
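
The selection rules above can be illustrated with a small Python model. This is only a sketch of the behavior described in the list, not Ranger's implementation: the class names (`MaskCondition`, `TagMaskPolicy`), the function names, and the tag `PII` are hypothetical. The mask functions are simplified stand-ins (for example, hidden characters are all replaced with "x", SHA-256 is assumed for the hash, and the year-only mask is assumed to reset the date to January 1). What happens when no condition in the chosen policy matches the user is also an assumption.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional, Set


# Simplified stand-ins for the masking types named above (assumptions).
def show_last_4(value: str) -> str:
    return "x" * max(len(value) - 4, 0) + value[-4:]


def show_first_4(value: str) -> str:
    return value[:4] + "x" * max(len(value) - 4, 0)


def hash_value(value: str) -> str:
    # SHA-256 is assumed here purely for illustration.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def nullify(value):
    return None


def year_only(value: date) -> date:
    # Assumed behavior: keep the year, reset month and day.
    return value.replace(month=1, day=1)


@dataclass
class MaskCondition:  # hypothetical: one row of a policy's mask conditions
    users: Set[str]
    groups: Set[str]
    mask: Callable


@dataclass
class TagMaskPolicy:  # hypothetical: a tag-based masking policy
    name: str
    tag: str
    conditions: List[MaskCondition]


def choose_policy(policies, column_tags) -> Optional[TagMaskPolicy]:
    # Among policies whose tag is attached to the column, pick the one
    # with the lexicographically smallest name ("a" sorts before "aa").
    applicable = [p for p in policies if p.tag in column_tags]
    return min(applicable, key=lambda p: p.name, default=None)


def apply_mask(policies, column_tags, user, user_groups, value):
    policy = choose_policy(policies, column_tags)
    if policy is None:
        return value
    # Conditions are checked in the order listed; the first match wins.
    for cond in policy.conditions:
        if user in cond.users or cond.groups & user_groups:
            return cond.mask(value)
    # Assumption: with no matching condition, the value is left unmasked.
    return value


policies = [
    TagMaskPolicy("aa", "PII", [MaskCondition({"alice"}, set(), nullify)]),
    TagMaskPolicy("a", "PII", [
        MaskCondition(set(), {"analysts"}, show_last_4),
        MaskCondition({"alice"}, set(), hash_value),
    ]),
]

# Policy "a" is chosen over "aa", and its first matching condition applies.
print(apply_mask(policies, {"PII"}, "alice", {"analysts"}, "1234567890"))
# prints "xxxxxx7890"
```

In this model, `alice` matches both conditions of policy "a", but the group condition is listed first, so the show-last-4 mask is applied rather than the hash.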