Release Summaries

CDP Public Cloud: April 2021 Release Summary

Data Catalog now supports Business Glossary and the use of “terms” to associate datasets.

CDE users can now use API keys, managed through the CDP user management service (UMS), to interact with the CDE Jobs API from the command line.

Users can now submit jobs with raw Scala code, without compiling. These jobs run spark-shell to process the application file.

Admins can now access summary diagnostic logs directly on their local machine. A new Diagnostic page has been added to the CDE Service details to generate and download the bundle.

CDE services older than 90 days will have expired TLS certificates. A new action has been added to the CDE service hamburger menu to renew the certificates and avoid access issues for DE users.

GANG is a new resource scheduling policy that overcomes scale-up challenges in situations where high rates of job submission lead to queuing. The new scheduling policy moves jobs off the queue in batches. This clears up the queue, forces scale-up of nodes to process the burst of incoming jobs, and reduces wait and startup time of jobs. By default, GANG scheduling is disabled. It can be turned on for specific jobs by adding a new job-level configuration option.

docker.repository.cloudera.com/cloudera/cdv/cmldataviz:6.2.5-b25

[Preview] Use natural language search to find relevant information in your datasets and render it in graphical format.

You can enable lockout after too many failed login attempts by adding AXES_ENABLED to the advanced site settings in CML or CDSW. This feature is disabled by default.

Improved Solr aggregation and faceting.

Fixed bugs for CSV and Excel downloads.

Error message popovers now close when clicking elsewhere on the page as expected.

Expansion on crosstab visuals now works as expected.

Changing values on a custom picklist filter in dashboard mode now works as expected.

Fixed a bug where filters on an Arcengine connection did not reset after unselecting.

The Enable “Download as Image/PDF” switch in dashboard settings now works as expected.

Dashboards created from Arcengine connections can now be accelerated as expected.

Changing colors in a scatter visual now redraws the visual as expected.

When you restart a Virtual Warehouse, it takes a few minutes to start and change to the “Running” state. In earlier CDW versions, you had to wait until the Virtual Warehouse was up and running before launching Hue. Now you can open Hue and run queries even while the Virtual Warehouse is in the “Starting” state.

This is the first release of Cloudera DataFlow.

  • Business User Experience - A new user role, ML BusinessUser, provides restricted access to view Applications created in CML.

  • ML Runtimes - Runtimes provide a lightweight alternative to Engines.

  • View All Applications - Admins can view all applications on the application list page.

  • Jobs improvements - Job Scheduling UI now supports cron expressions, job notification email subject line now includes the project name, and the Job History page now shows detailed start and end timestamps.

  • New AWS region - CML now supports the AWS region eu-south-1.
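As an illustration of the cron support mentioned under Jobs improvements, the sketch below splits a schedule into its named fields. It assumes the standard five-field cron layout (minute, hour, day of month, month, day of week); whether the Job Scheduling UI accepts exactly this dialect is an assumption, and the function name is hypothetical.

```python
# Field names of a standard five-field cron expression (assumed dialect).
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression):
    """Split a five-field cron expression into a dict keyed by field name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# Example schedule: 02:30 every Monday in standard cron notation.
print(split_cron("30 2 * * 1"))
```

A check like this only validates the field count; it does not validate ranges or step values within each field.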

DSE-15059 - Fixed an issue where project creators might not be able to authenticate to models created by project collaborators using a model API key.

Runtime 7.2.9 is now available and can be used for registering an environment with a 7.2.9 Data Lake and creating Data Hub clusters. See Cloudera Runtime.

GCP quick start is now available, allowing you to quickly set up a CDP environment. See GCP quick start.

See What’s New Post.

The Cluster Definitions page that used to be available in the Shared Resources section was removed. Instead, you can access all cluster definitions related to a specific environment from the Cluster Definitions tab available in the environment’s details. You can save new cluster definitions using the Save As New Definition option available from the Create Data Hub wizard or from CDP CLI using the cdp datahub create-cluster-definition command.

The option to specify the Ranger Audit role (AWS), managed identity (Azure), or service account (GCP) during environment registration was moved from the Logs - Storage and Audit section to the Data Access section. Consequently, these sections were renamed to Logs and Data Access and Audit.

From the Hardware tab of the Data Lake details, you can click the Repair icon to select specific nodes within a host group to repair.

See What’s New Post.

The IAM policy for the provisioning credential has been updated to include new permissions related to load balancers. The following permissions are now required:

cloudformation:UpdateStack
cloudformation:ListStackResources
elasticloadbalancing:DescribeLoadBalancers
elasticloadbalancing:DescribeTargetHealth
elasticloadbalancing:RegisterTargets
elasticloadbalancing:DeregisterTargets

If you are using a restricted IAM policy for your provisioning credential, you must add these additional permissions.
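If you maintain a restricted policy, a quick local check can show which of the permissions listed above are still missing. The following is a minimal sketch, not an official tool: the function names and the example policy are hypothetical, and the check only looks for actions named explicitly in Allow statements, so wildcard grants such as `elasticloadbalancing:*` are not recognized.

```python
import json

# The six actions listed in this release note.
REQUIRED_ACTIONS = [
    "cloudformation:UpdateStack",
    "cloudformation:ListStackResources",
    "elasticloadbalancing:DescribeLoadBalancers",
    "elasticloadbalancing:DescribeTargetHealth",
    "elasticloadbalancing:RegisterTargets",
    "elasticloadbalancing:DeregisterTargets",
]


def allowed_actions(policy):
    """Collect every action named explicitly in the policy's Allow statements."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a policy may hold a single statement object
        statements = [statements]
    actions = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        action = statement.get("Action", [])
        actions.update([action] if isinstance(action, str) else action)
    return actions


def missing_actions(policy):
    """Return the required actions the policy does not list, in release-note order."""
    present = allowed_actions(policy)
    return [action for action in REQUIRED_ACTIONS if action not in present]


# Hypothetical restricted policy that predates this release.
example_policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudformation:UpdateStack", "ec2:DescribeInstances"],
      "Resource": "*"
    }
  ]
}
""")

for action in missing_actions(example_policy):
    print("missing:", action)
```

An empty result means every listed action appears explicitly; it does not confirm that resource or condition constraints in the policy permit the calls.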

The following publications were moved to the CDP Public Cloud library:

  • Getting Started in CDP Public Cloud
  • AWS/Azure/GCP Requirements
  • AWS/Azure/GCP Quick Starts
  • CDP Public Cloud Security Overview

This CDP Public Cloud library is accessible via the Get Started link on the docs homepage or via the CDP landing page.

The documentation that was moved is available from the following links:

URL redirects were added temporarily. They will eventually be removed, so make sure to update your bookmarks.

This change was made in an effort to make CDP Public Cloud onboarding documentation easier to find. The previous location of this content (in the Management Console library) was unintuitive to many users.

The AWS/Azure/GCP requirements content was consolidated in one place in the CDP Public Cloud library mentioned above. If the content that you have bookmarked throws a 404 error, it is most likely in one of the following publications:

To fix the error, you have three options:

  • Update the URL by replacing /management-console/cloud/environments-/ with /cdp-public-cloud/cloud/requirements-/, keeping the cloud provider name ("aws", "azure", or "gcp") that follows the hyphen in each path. This works for the content that was moved, but not for topics that were consolidated into other documentation and removed.

  • On the docs homepage, search the website for the content that moved. Search results will direct you to the correct location.

  • Navigate to one of the libraries linked above and find the content that you are looking for.
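The first option can be applied to a list of bookmarks with a small rewrite function. This is a sketch under stated assumptions: it assumes the provider name ("aws", "azure", or "gcp") directly follows the hyphen in both the old and new paths, and the host name and topic path in the example are placeholders, not real documentation URLs.

```python
import re

# Old path prefix from the first option; the provider suffix is captured so it
# can be carried over unchanged (assumed to follow the hyphen directly).
OLD_PATH = re.compile(r"/management-console/cloud/environments-(aws|azure|gcp)/")


def rewrite_bookmark(url):
    """Return the URL with the old requirements path replaced, if present."""
    return OLD_PATH.sub(r"/cdp-public-cloud/cloud/requirements-\1/", url, count=1)


# Hypothetical bookmark; host and trailing path are placeholders.
old_url = (
    "https://docs.example.com/management-console/cloud/"
    "environments-aws/topics/example.html"
)
print(rewrite_bookmark(old_url))
```

URLs that do not contain the old path are returned unchanged, and, as noted above, a rewritten URL still fails for topics that were consolidated into other documentation and removed.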

This change was made in an effort to consolidate all documentation related to cloud provider requirements in one place. Previously, the documentation was scattered and users had to click through many links to find content.

You can now register CDP Private Cloud Base clusters as classic clusters in CDP.

  • The CDP Private Cloud Base clusters can be registered via Cloudera Manager for use in Replication Manager.

  • Additionally, you can register CDP Private Cloud Base clusters via Knox for use in Data Catalog. This is a technical preview feature that should not be used in a production environment.

For documentation, see Adding a CDP Private Cloud Base cluster.

See What’s New Post.

No new features

You can register a CDP Private Cloud Base cluster as a classic cluster using the Cloudera Manager option in CDP. After registration, you can replicate the data in HDFS and Hive external tables in the classic cluster to CDP Public Cloud.

No new features