Overview

Cloudera Data Catalog helps you understand, manage, secure, and govern data assets across the enterprise, with a dashboard for quick navigation and shared terminology for consistent use. It covers datasets, search, and Atlas tags for organizing and discovering assets, plus profiler guidance for sensitive data detection, audit insights, and column statistics.

Cloudera Data Catalog Overview

About Cloudera Data Catalog

Cloudera Data Catalog is a service within Cloudera that enables you to understand, manage, secure, and govern data assets across the enterprise.

Cloudera Data Catalog Dashboard

The Dashboard presents vital service information at a glance, as visual, actionable navigation for multiple operations.

Cloudera Data Catalog terminology

An overview of terminology used in Cloudera Data Catalog.

Datasets overview

A dataset is a group of assets that fit a set of search criteria so that you can manage and administer them collectively for specific business purposes.
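The idea of a dataset as "assets that fit a set of search criteria" can be illustrated with a minimal sketch. The `Asset` class, the `build_dataset` function, and the sample asset names, types, and tags below are hypothetical and serve only to show the concept; they are not part of the Cloudera Data Catalog API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical stand-in for a catalog asset."""
    name: str
    asset_type: str
    tags: frozenset = frozenset()


def build_dataset(assets, asset_type=None, required_tag=None):
    """Return the assets that satisfy every criterion that was supplied."""
    selected = []
    for asset in assets:
        if asset_type is not None and asset.asset_type != asset_type:
            continue
        if required_tag is not None and required_tag not in asset.tags:
            continue
        selected.append(asset)
    return selected


# Illustrative data only.
catalog = [
    Asset("customers", "hive_table", frozenset({"PII"})),
    Asset("orders", "hive_table"),
    Asset("clickstream", "hdfs_path", frozenset({"PII"})),
]

# Group all Hive tables tagged "PII" so they can be managed together.
pii_tables = build_dataset(catalog, asset_type="hive_table", required_tag="PII")
print([asset.name for asset in pii_tables])
```

Only assets that meet all supplied criteria end up in the group, which is what allows them to be administered collectively.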

Search overview

On the Cloudera Data Catalog Search page, select a data lake and enter a search string to view all assets whose details contain that string.

Atlas tags overview

Atlas tags enhance searchability by serving as metadata labels, either created manually or applied automatically through profiling tools.

Cloudera Data Catalog Profilers before 1.5.5 SP2

Cloudera Data Catalog Profilers

Profilers create metadata annotations that summarize the content and shape characteristics of the data assets (such as distribution of values in a box plot or histogram).
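As a rough illustration of what "shape characteristics" means, the sketch below computes the values behind a box plot (a five-number summary) and the bin counts behind a histogram for one numeric column. It uses only the Python standard library. The function names and the sample values are assumptions made for this example and do not reflect how the profilers are implemented.

```python
import statistics


def five_number_summary(values):
    """Return the min, quartiles, and max that a box plot is drawn from."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points between the four quarters.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return {"min": ordered[0], "q1": q1, "median": median,
            "q3": q3, "max": ordered[-1]}


def histogram(values, bins):
    """Count how many values fall into each of `bins` equal-width buckets."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    if width == 0:
        width = 1  # all values identical: everything lands in the first bucket
    counts = [0] * bins
    for value in values:
        # The maximum value would index one past the end, so clamp it.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


column_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative column data
summary = five_number_summary(column_values)
bucket_counts = histogram(column_values, bins=4)
print(summary)
print(bucket_counts)
```

A summary like this is far smaller than the column itself, which is why it is practical to store as a metadata annotation next to the asset.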

The Cluster Sensitivity Profiler

Automatically inspects context and content to detect various types of sensitive data, and suggests suitable classifications or tags based on the type of sensitive content detected.
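The distinction between context inspection and content inspection can be sketched as follows. Here "context" is taken to mean the column name and "content" the sampled values, which is an assumption made for this example. The rule tables, tag names, threshold, and `suggest_tags` function are hypothetical and are not the profiler's actual rules or API.

```python
import re

# Hypothetical content rules: a value must match the whole pattern.
CONTENT_RULES = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "US_SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

# Hypothetical context hints: substrings looked for in the column name.
CONTEXT_HINTS = {
    "EMAIL": ("email", "e_mail"),
    "US_SSN": ("ssn", "social"),
}


def suggest_tags(column_name, sample_values, min_match_ratio=0.8):
    """Suggest tags when the column name or most sampled values look sensitive."""
    suggestions = set()

    # Context inspection: does the column name hint at sensitive data?
    lowered = column_name.lower()
    for tag, hints in CONTEXT_HINTS.items():
        if any(hint in lowered for hint in hints):
            suggestions.add(tag)

    # Content inspection: do enough non-empty sample values match a rule?
    non_empty = [value for value in sample_values if value]
    if non_empty:
        for tag, pattern in CONTENT_RULES.items():
            matches = sum(1 for value in non_empty if pattern.match(value))
            if matches / len(non_empty) >= min_match_ratio:
                suggestions.add(tag)

    return sorted(suggestions)


print(suggest_tags("contact", ["a@example.com", "b@example.org"]))
print(suggest_tags("ssn_number", ["n/a"]))
```

The function only returns suggestions. Whether a suggested tag is accepted and applied is left to a separate step.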

The Ranger Audit Profiler

Lets you see who has accessed which data from a forensic audit or compliance perspective, visualize access patterns, and identify anomalies in them.

The Hive Column Profiler

Lets you view the shape or distribution characteristics of the columnar data within a Hive table.

Cloudera Data Catalog Profilers from 1.5.5 SP2

Cloudera Data Catalog Profilers

Profilers create metadata annotations that summarize the content and shape characteristics of the data assets (such as distribution of values in a box plot or histogram).

The Data Compliance Profiler

Automatically inspects context and content to detect various types of sensitive data, and suggests suitable classifications or tags based on the type of sensitive content detected.

The Activity Profiler

Lets you see who has accessed which data from a forensic audit or compliance perspective, visualize access patterns, and identify anomalies in them.

The Statistics Collector Profiler

Lets you view the shape or distribution characteristics of the columnar data within a Hive table.