Managing Profilers

The Data Catalog profiler engine runs data profiling operations as a pipeline on data located in multiple data lakes. These profilers create metadata annotations that summarize the content and shape characteristics of the data assets.

Table 1. List of built-in profilers
Profiler Name Description
Cluster Sensitivity Profiler A sensitive data profiler- PII, PCI, HIPAA, etc.
Ranger Audit Profiler A Ranger audit log summarizer.
Hive Column Profiler Provides summary statistics like Maximum, Minimum, Mean, Unique, and Null values at the Hive column level.