What's new

This section lists major features and updates for the Data Catalog service.

Cloudera Data Catalog is now also supported in the EU and APAC regional Control Planes. You can manage, secure, collaborate on, and govern data assets across multiple clusters and environments within the EU or APAC region where your organisation operates, in line with data protection regulatory requirements.

December 18, 2024

This release (2.0.28) of the service introduces the following changes:

Column name based tagging

You can override sampling when profiling a column, so that the column is tagged when its name matches a preset regular expression pattern, instead of when a certain percentage of the column's values match. This is useful for assets with skewed proportions, where relying on sampling would not result in correct tagging.

For more information, see Cluster Sensitivity Profiler configuration and Setting up column name based tagging.
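The difference between value sampling and column name based tagging can be illustrated with a minimal sketch. This is not the profiler's implementation: the patterns, the 0.5 threshold, and the `should_tag` function are all hypothetical names and values chosen only for illustration.

```python
import re

# All names and values below are illustrative assumptions, not Data Catalog settings.
NAME_PATTERN = re.compile(r"(?i)^(ssn|social_security.*)$")  # assumed column-name pattern
VALUE_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")           # assumed value pattern
MATCH_THRESHOLD = 0.5  # assumed fraction of sampled values that must match


def should_tag(column_name, sampled_values, use_column_name=True):
    """Decide whether a column gets the tag (hypothetical helper)."""
    # A column-name match overrides the value sampling entirely.
    if use_column_name and NAME_PATTERN.search(column_name):
        return True
    if not sampled_values:
        return False
    # Fallback: tag only if enough sampled values match the value pattern.
    hits = sum(1 for v in sampled_values if VALUE_PATTERN.match(str(v)))
    return hits / len(sampled_values) >= MATCH_THRESHOLD


# A skewed sample: only 1 of 8 values looks like the sensitive pattern.
skewed_sample = ["123-45-6789", None, None, None, "n/a", "", "unknown", None]

print(should_tag("ssn", skewed_sample))                         # name match wins
print(should_tag("ssn", skewed_sample, use_column_name=False))  # sampling alone
```

In this sketch the skewed sample falls below the threshold, so value sampling alone leaves the column untagged, while the name match tags it regardless of the sampled values.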

June 03, 2024

Data Catalog is a service within Cloudera Data Platform that enables you to understand, manage, secure, and govern data assets across enterprise data lakes, spanning multiple clusters and environments.

This release of the Data Catalog service introduces the following changes:

This release only contains fixes and updates to prepare Data Catalog for the changes in the upcoming 3.0.0 release.

May 16, 2024

This release of the Data Catalog service introduces the following additions:

Iceberg tables are now supported by the Data Catalog service:

  • You can filter for Iceberg tables on the Search page.
  • Iceberg tables can be viewed on the Asset Details page.
  • Iceberg tables can be added to a dataset.

All subcomponents of Data Catalog support JDK 17.

April 3, 2024

This release of the Data Catalog service includes a notable behavior change that you must be aware of and act on.

When you upgrade your cluster from Cloudera Runtime version 7.2.17 to 7.2.18, the cluster enters a failed state during the OS upgrade step. The following message is displayed:

_NODE_FAILURE:

New node(s) could not be added to the cluster. Reason Please find more details on Cloudera Manager UI. Failed command(s): Start(id=1546339088): Failed to start role profc6cf3856-PROFILER_SCHEDULER_AGENT-484032cb8f17cacf9e684efe50 of service profiler_scheduler in cluster cdp-dc-profilers-258395ef.

Impact on Data Catalog profilers:

If the Data Hub is not created, the Data Catalog profilers are not created in Cloudera Runtime version 7.2.18.

To work around this issue, use the following process to bring up the Data Catalog profilers in Cloudera Runtime version 7.2.18.

First, delete your existing 7.2.17 clusters. For more information, see Deleting profiler cluster.

Next, after you upgrade to the 7.2.18 Data Lake, launch the Data Catalog profilers. For more information, see Launch profiler cluster.