Column masking with Ranger policies

You can use Apache Ranger dynamic column masking to protect sensitive data in Hive and Impala Virtual Warehouses in near real time. You can set policies that dynamically mask or anonymize sensitive columns (such as those containing PII, PCI, and PHI data) in Hive or Impala query output. For example, you can mask sensitive data within a column to show only the first or last four characters.
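
To make the "show only the last four characters" example concrete, the following Python sketch imitates the effect of such a mask on a single value. It is an illustration only: the function name `show_last_n` is hypothetical, and the replacement characters (`X` for uppercase letters, `x` for lowercase letters, `n` for digits) are an assumption modeled on the defaults of Hive's built-in masking functions. Ranger applies the real mask inside the query engine, not in client code.

```python
def show_last_n(value: str, n: int = 4) -> str:
    """Mask every letter and digit except the last n characters.

    Hypothetical helper for illustration; not part of Ranger, Hive, or Impala.
    """
    def mask_char(ch: str) -> str:
        if ch.isupper():
            return "X"
        if ch.islower():
            return "x"
        if ch.isdigit():
            return "n"
        return ch  # punctuation and spaces are left unchanged

    if n <= 0:
        # Nothing is shown in the clear, so mask the whole value.
        return "".join(mask_char(c) for c in value)

    head, tail = value[:-n], value[-n:]
    return "".join(mask_char(c) for c in head) + tail


print(show_last_n("4111-1111-1111-1234"))  # nnnn-nnnn-nnnn-1234
```

A user covered by such a policy would see only the masked form in query results, while the stored data stays unchanged.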

Dynamic column masking policies are similar to other Ranger access policies. You can set filters for specific users, groups, and conditions. With dynamic column-level masking, sensitive information never leaves Hive, and no changes are required in the consuming application or at the Hive layer. There is also no need to produce additional, protected duplicates of datasets.

  1. In Service Manager, select Policies.
  2. Select the Masking tab, then click Add New Policy.
  3. In Create Policy, add information for the column-masking filter as described in Ranger documentation, observing Impala limitations if you are using Impala instead of Hive to query Iceberg tables.
  4. To move a condition in the Mask Conditions list (and therefore change the order in which it is evaluated), click the dotted rows icon at the left of the condition row, then drag the condition to a new position in the list.
  5. Click Add to add the new column masking filter policy.
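
The policy built in the steps above can be pictured as structured data. The following Python sketch is illustrative, not a definitive Ranger representation: the field names (`policyType`, `resources`, `dataMaskPolicyItems`, `dataMaskInfo`) and the mask type names are assumed to follow the policy JSON model of Ranger's public REST API, and the service, database, table, column, and group names are invented. The helper `first_matching_mask` is hypothetical and assumes that mask conditions are checked from top to bottom and that the first match applies, which is why the order in step 4 matters.

```python
import json

# Hypothetical masking policy; every resource and group name is invented.
policy = {
    "service": "cm_hive",       # assumed service name
    "name": "mask_customer_ssn",
    "policyType": 1,            # assumed: 1 marks a masking policy
    "resources": {
        "database": {"values": ["sales"]},
        "table": {"values": ["customers"]},
        "column": {"values": ["ssn"]},
    },
    # Mask conditions, in the order they are assumed to be evaluated.
    "dataMaskPolicyItems": [
        {
            "groups": ["auditors"],
            "accesses": [{"type": "select", "isAllowed": True}],
            "dataMaskInfo": {"dataMaskType": "MASK_NONE"},
        },
        {
            "groups": ["public"],
            "accesses": [{"type": "select", "isAllowed": True}],
            "dataMaskInfo": {"dataMaskType": "MASK_SHOW_LAST_4"},
        },
    ],
}


def first_matching_mask(policy: dict, user_groups: list) -> "str | None":
    """Return the mask type of the first condition that matches the user's groups."""
    for item in policy["dataMaskPolicyItems"]:
        if set(item.get("groups", [])) & set(user_groups):
            return item["dataMaskInfo"]["dataMaskType"]
    return None  # no condition matched, so no mask from this policy


print(json.dumps(policy["resources"], indent=2))
print(first_matching_mask(policy, ["auditors", "public"]))  # MASK_NONE
print(first_matching_mask(policy, ["public"]))              # MASK_SHOW_LAST_4
```

Under these assumptions, swapping the two conditions would cause an auditor who also belongs to `public` to receive the masked value, because the broader condition would then match first.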