Overview

Cloudera Data Catalog helps you understand, manage, secure, and govern data assets across the enterprise. It provides a dashboard for quick navigation and shared terminology for consistent use. This documentation covers datasets, data sharing, search, and Atlas tags for organizing, sharing, and discovering assets. It also includes profiler guidance for VM-based and Compute Cluster enabled environments, covering sensitive data detection, audit insights, and column statistics.

Cloudera Data Catalog Overview

About Cloudera Data Catalog

Cloudera Data Catalog is a service within Cloudera that enables you to understand, manage, secure, and govern data assets across the enterprise.

Cloudera Data Catalog Dashboard

The Dashboard provides at-a-glance access to vital service information through visual, actionable navigation for multiple operations.

Cloudera Data Catalog terminology

An overview of terminology used in Cloudera Data Catalog.

Search overview

On the Cloudera Data Catalog Search page, select a data lake and enter a search string to view all assets that match the search string, along with their details.

Datasets overview

A dataset is a group of assets that fit a set of search criteria so that you can manage and administer them collectively for specific business purposes.

Data sharing overview

Data sharing enables secure, self-service access to Iceberg tables for external users using the Iceberg REST Catalog.

Data sharing user interface

The data sharing interface enables Data Providers to manage logical data shares and external user access.

Bookmarks overview

Collaborate on datasets with ratings, comments, likes, and bookmarks to share insights across the enterprise.

Atlas tags overview

Atlas tags enhance searchability by serving as metadata labels, either created manually or applied automatically through profiling tools.

Cloudera Data Catalog Profilers in VM-based environments

Cloudera Data Catalog Profilers

Profilers create metadata annotations that summarize the content and shape characteristics of the data assets (such as distribution of values in a box plot or histogram).

The Cluster Sensitivity Profiler

Automatically performs context and content inspection to detect various types of sensitive data and suggests suitable classifications or tags based on the type of sensitive content detected.

The Ranger Audit Profiler

Lets you see who has accessed which data from a forensic audit or compliance perspective, visualize access patterns, and identify anomalies in access patterns.

The Hive Column Profiler

Lets you view the shape or distribution characteristics of the columnar data within a Hive table.

Cloudera Data Catalog Profilers in Compute Cluster enabled environments

Cloudera Data Catalog Profilers

Profilers create metadata annotations that summarize the content and shape characteristics of the data assets (such as distribution of values in a box plot or histogram).

The Data Compliance Profiler

Automatically performs context and content inspection to detect various types of sensitive data and suggests suitable classifications or tags based on the type of sensitive content detected.

The Activity Profiler

Lets you see who has accessed which data from a forensic audit or compliance perspective, visualize access patterns, and identify anomalies in access patterns.

The Statistics Collector Profiler

Lets you view the shape or distribution characteristics of the columnar data within a Hive table.