CDP Private Cloud Base 7.1.6 Release Summary

March 2021

Cloudera is pleased to announce the release of Cloudera Data Platform (CDP) Private Cloud (PvC) Base version 7.1.6 and Cloudera Manager version 7.3.1. These releases introduce a direct upgrade path from HDP 3 to CDP Private Cloud Base, add numerous enhancements to simplify the upgrade and migration paths from CDH 5 and HDP 2, and roll up all the prior maintenance enhancements from previous releases.

The full list of new features is as follows:

Upgrade Enhancements

  • HDP 3 Upgrades

    • HDP 3 customers can now upgrade their HDP 3.1.5 clusters directly to CDP Private Cloud Base. Full details of the upgrade procedure are provided here.

    • A new version of the AM2CM tool (1.2.0) has been released to support this upgrade.

  • Documented Rollback Procedure

    • Rollback procedures are available to support upgrades from CDH 5 to CDP 7.1.6

    • Rollback procedures are available to support upgrades from HDP 2 to CDP 7.1.6.

  • Accumulo Support

    • CDP Private Cloud Base now supports Operational Database (OpDB) powered by Apache Accumulo, based on Accumulo 2.0.

    • Accumulo 2.0 is the first version that supports semantic versioning for management and consistency, bulk import API that improves the time of data upload, simplified scripts for easier management and user experience improvements like table summaries and dedicated scan support.

    • This now enables CDH 5, HDP 2, and HDP 3 customers that utilize Accumulo to upgrade to CDP Private Cloud Base. More details here.

  • CDH 5 Upgrade Enhancements

    • YARN FS2CS Tool Enhancements for scheduler migration provides better scheduler transition for customers who are upgrading or migrating from CDH 5.13 - 5.16 with improved placement rules, a new placement rule evaluation engine, and a new weight mode for capacity allocation with flexible auto-queue creation for Capacity Scheduler.
  • YARN Upgrade Enhancements

    • YARN and YARN Queue Manager now support dynamic and auto-child-queue creation.

    • YARN Queue Manager now supports partitions and node labels - customers can now partition a cluster into sub-clusters and classify nodes using labels. This allows for jobs to be deployed to run on nodes with specific characteristics. In addition the Queue Manager UI can now be used to manage YARN partitions.

    • Enhanced Placement Rules for YARN queues - To resolve previous limitations, a new placement rules evaluation engine is introduced that supports the new JSON-based placement rules format.

    • Placement rules can now be created easily using new Queue Manager UI enhancements.

    • A new feature called Weight Mode has been introduced for YARN resource allocation, giving additional flexibility and easier migration from fair scheduler configurations.

Platform Support Enhancements

  • New OS Versions

    • CDP Private Cloud Base now supports RHEL/CENTOS 7.9 for Intel x86 and IBM PPC hardware.
  • New DB Versions

    • CDP Private Cloud Base now supports MySQL8 and Postgres12.

General Feature Enhancements

  • Cloudera Manager Enhancements (release 7.3.1)

    • Ranger audit can now be configured to use a local file system rather than HDFS for storage, enabling a broader range of cluster types including Kafka and NiFi to operate with full security and governance capabilities but without the additional resource/administrative overhead of HDFS.

    • Custom Kerberos principal support for streaming components: SRM, SMM, Cruise Control, Kafka Connect and Schema Registry. This enables flexible, externally-managed kerberos identities for a broader range of cluster types.

    • (De)commission steps can be defined as part of CSD services, enabling more seamless cluster up/down scaling and maintenance workflows when using services like Kafka, Ozone and any 3rd party software.

    • Service and role metrics collection supports gathering enumerated text values.

  • Transaction Support

    • Support for complex distributed transactions across rows and tables is now available using ANSI SQL semantics, providing familiarity to MySQL or PostgreSQL users. See this blog for details and benchmarking results.
  • Data Warehouse Enhancements

    • Implement and re-enable ROLE-related statements in Impala, allowing administrators to grant privileges to ROLES, and assign ROLES to GROUPS, delivering powerful permission control. See document for details.

    • Hive Warehouse Connector Simplification delivers a common configuration to specify the mode of operation (Spark Direct Reader or JDBC). Its use is fully transparent through spark.sql(“<query>”). For backward compatibility, the configurations used in earlier releases are still supported, but will be deprecated eventually. For details, see reading data through HWC.

    • Added support for the Impyla client, which lets developers submit SQL queries to Impala within Python programs. See document for details.

    • Kudu support for INSERT_IGNORE, UPDATE_IGNORE, and DELETE_IGNORE operations, simplifying client applications and improving ingest performance.

    • Faster cluster restart and rebalance for Kudu.

  • Object Store Enhancements

    • Ozone enhancements to support Kafka Connect, Atlas and Nifi sinks. Customers can now use Kafka connectors to write to Ozone without any modifications. Nifi sinks allow Nifi to use Ozone as the storage in a secure CDP Cluster. Atlas integrations provide lineage and data governance capability for data store in Ozone.

    • Trash support with Ozone now provides the capability to recover keys that may have been accidentally deleted. So using this feature customers can recover data that may have been accidentally deleted.

    • Ozone Multiraft protocol support enhances the speed of writes into the data pipelines thereby increasing write performance upto 30%.

  • Authorization & Audit Enhancements

    • Ranger Audit Filter (Tech Preview) - Using JSON defined filters in ranger repo config, administrators can limit which audit events are captured when accessed. This is especially useful for keeping audit logs relevant for services users that already have audits at a higher level. For example, an audit filter can be created to exclude service users’ activities (such as METADATA_OPERATION from hive) to reduce the audit volume and make the relevant end user audit events easily manageable.

    • Ranger Audit Access Improvements - Makes columns resizable and allows users to select columns that they want to see.

    • Improved performance for Hive-HDFS ACL sync.

Product Documentation Enhancements

The docs website now hosts a feedback tab on the lower-right edge of most pages for reader comments. Readers are asked “How can we improve?” and are invited to tell us what they like, how we can improve our content and content delivery, and what problems they are encountering. Feedback is routed directly to the content development team for fast action.

Additional Resources