Understanding the Cloudera Data Catalog Profilers

Profilers create metadata annotations that summarize the content and shape characteristics of data assets (such as the distribution of values shown in a box plot or histogram).

The Cloudera Data Catalog profiler uses Kubernetes-enabled job scheduling and runs profiler jobs on demand.

Profiler Name                   Description
Cluster Sensitivity Profiler    A sensitive data profiler for PII, PCI, HIPAA, and other sensitive data types.
Ranger Audit Profiler           A Ranger audit log summarizer.
Hive Column Profiler            Provides summary statistics such as maximum, minimum, mean, unique, and null values at the Hive column level.
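
To illustrate the kind of column-level statistics listed for the Hive Column Profiler, here is a minimal Python sketch. It is not the profiler's implementation: the function name `profile_column`, the use of `None` to represent a null value, and the sample data are all assumptions made for this example.

```python
from statistics import mean


def profile_column(values):
    """Compute simple summary statistics for one column.

    Hypothetical helper for illustration only; None stands in for a null value.
    """
    non_null = [v for v in values if v is not None]
    return {
        "max": max(non_null) if non_null else None,
        "min": min(non_null) if non_null else None,
        "mean": mean(non_null) if non_null else None,
        "unique": len(set(non_null)),          # distinct non-null values
        "nulls": len(values) - len(non_null),  # count of null entries
    }


# Made-up sample column with one null entry.
sample = [3, 7, None, 7, 10]
stats = profile_column(sample)
print(stats)
```

For the sample column, the sketch reports a maximum of 10, a minimum of 3, a mean of 6.75, three unique values, and one null.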