CDP Private Cloud Base 7.1.9 Release Summary
Cloudera is pleased to announce the release of Cloudera Data Platform (CDP) Private Cloud (PvC) Base version 7.1.9 and Cloudera Manager version 7.11.3. This release introduces Open Data Lakehouse analytics capabilities on CDP PvC Base for the first time. As a long-term support (LTS) release, it also raises enterprise readiness through enhanced reliability, security, and performance.
Main highlights of this release
- This release introduces Open Data Lakehouse, a modern architecture powered by Apache Iceberg that delivers data reliability and ease of data management. Iceberg is an open-source, standards-based table format for large-scale data management. With built-in features such as time travel, schema evolution, and streamlined data discovery, Iceberg helps data teams improve data lake management while upholding data integrity.
- This release offers high availability through Zero Downtime Upgrades (ZDU), which minimize workflow disruptions by eliminating or reducing lengthy downtimes. Customers can further improve productivity through one-stage upgrades and auto upgrades of large clusters, for better resiliency and efficiency of their business operations.
- Apache Ozone storage lets customer deployments scale to handle massive datasets. This release adds new Ozone capabilities such as snapshots, improved replication, and quotas for volumes and buckets to facilitate adoption of modern cloud-native architectures.
Key Features for this release
- Apache Iceberg combines transactional reliability with columnar storage efficiency for data lake analytics, modernizing data lakes into the data lakehouse architecture.
- This release integrates the Impala, Spark, Flink, and NiFi compute engines for concurrently accessing and processing Iceberg datasets. It delivers time travel capabilities, improved query performance, data governance, and simplified data pipelines and data operations for enhanced agility in customer deployments. Additionally, federated data access across all these engines makes it easy to deploy multiple diverse use cases on a single copy of data.
- The release also simplifies adherence to the General Data Protection Regulation (GDPR) and similar regulations by facilitating in-place schema evolution and ACID transactions on lake data.
- Iceberg Replication enables customers to replicate table-level data between Iceberg tables in the private cloud. It applies to Iceberg V2 tables created using Spark running on HDFS storage.
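To make the time travel capability mentioned above more concrete, here is a minimal sketch of how a time travel query against an Iceberg table might be composed. It assumes Spark 3.3 or later with the Iceberg runtime configured, where Spark SQL accepts `TIMESTAMP AS OF` and `VERSION AS OF` clauses; the table name `db.events`, the snapshot ID, and the helper `time_travel_query` are hypothetical and not part of the release.

```python
from datetime import datetime
from typing import Optional


def time_travel_query(table: str,
                      as_of: Optional[datetime] = None,
                      snapshot_id: Optional[int] = None) -> str:
    """Build a Spark SQL time travel query for an Iceberg table.

    Hypothetical helper for illustration only. Exactly one of `as_of`
    (a point in time) or `snapshot_id` (an Iceberg snapshot ID) must be given.
    """
    if (as_of is None) == (snapshot_id is None):
        raise ValueError("pass exactly one of as_of or snapshot_id")
    if as_of is not None:
        # Read the table as it existed at the given timestamp.
        ts = as_of.strftime("%Y-%m-%d %H:%M:%S")
        return f"SELECT * FROM {table} TIMESTAMP AS OF '{ts}'"
    # Read the table as of a specific snapshot.
    return f"SELECT * FROM {table} VERSION AS OF {int(snapshot_id)}"


# Example usage with a hypothetical table and snapshot ID.
print(time_travel_query("db.events", as_of=datetime(2023, 9, 1, 12, 0, 0)))
print(time_travel_query("db.events", snapshot_id=10963874102873))
```

The resulting string could be passed to `spark.sql(...)` in a PySpark session. Other engines use their own syntax for the same idea, so the clauses above should be treated as Spark-specific.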
- High Availability features: Enterprise readiness requires customer deployments and workloads to be reliable, highly available, and agile. This release improves availability for customer growth and business operations through the following features:
  - Zero Downtime Upgrades (ZDU) address customer demand for high availability during cluster upgrades. Zero downtime upgrades are now supported for HDFS, HBase, Hive, Kudu, Kafka, Ranger, Ranger KMS, and YARN. Reduced upgrade downtimes are now supported for Impala, Knox, JobHistory Server, Oozie, Phoenix, Solr, Zeppelin, and ZooKeeper.
  - High availability for Livy and Spark History Server allows multiple server instances to run in a cluster to maintain uninterrupted services in production deployments.
- Apache Ozone features and integrations: To continue unlocking the potential of data-intensive applications, this release delivers the following new Ozone features and capabilities:
  - Ozone snapshots are now supported at the volume and bucket level. Ozone Recon improvements provide insights on key data management metrics and reports, such as a heatmap on filters, size, or access. Additional insights include actionable information such as the number of times data was accessed or deleted.
  - Ozone-Knox integration allows users to access Ozone data using Knox through standard protocols such as HDFS and S3.
  - Ozone-Hue integration allows Hue to connect to and browse Ozone objects using a file browser. Additionally, it provides Hue support to run stored procedures and access query history with data on Ozone.
  - Ranger Resource Mapping Service (RMS) will support authorization for Ozone storage locations. RMS for Ozone co-exists with Hive-HDFS ACL sync and provides authorization for both HDFS and Ozone file systems.
- Platform features: CDP Private Cloud Base offers privacy protection and data integrity between applications communicating over a network. The following platform features add new security measures and advanced encryption capabilities to create a more secure customer environment:
  - TLS 1.2 encryption secures connections between hosts in a cluster, so that TLS 1.2 is used between CDP services and backend databases.
  - Oracle TCP/IP with SSL (TCPS) support provides secured communication between PvC Base components and the Oracle backend database.
  - Support for the “noexec” option on the “/tmp” filesystem allows the Cloudera stack to function with noexec enabled on /tmp, which mitigates security risk by preventing users from running executable binaries from /tmp.
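As an illustration of the “noexec” item above, the following is a minimal sketch of how an administrator might check whether /tmp is currently mounted with `noexec` on a Linux host. It reads the standard `/proc/mounts` table; the helper names `mount_options` and `is_noexec` are hypothetical, and the check is a simplification: if /tmp is not a separate mount point, it reports `False` rather than inspecting the parent filesystem.

```python
from typing import Iterable, Optional, Set


def mount_options(mount_point: str, mounts_lines: Iterable[str]) -> Optional[Set[str]]:
    """Return the mount options for mount_point, or None if it is not a mount point."""
    options = None
    for line in mounts_lines:
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # Later entries shadow earlier ones, so the last match wins.
            options = set(fields[3].split(","))
    return options


def is_noexec(mount_point: str, mounts_lines: Iterable[str]) -> bool:
    """True if mount_point is a mount point whose options include noexec."""
    options = mount_options(mount_point, mounts_lines)
    return options is not None and "noexec" in options


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as mounts:
            print("/tmp mounted noexec:", is_noexec("/tmp", mounts))
    except FileNotFoundError:
        print("/proc/mounts is not available on this system")
```

Because the functions take any iterable of lines, they can be exercised with sample mount entries without touching the live system.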
- Operational Databases: Kafka has been rebased to version 3.4, ZooKeeper to 3.8, and Curator to 5.4.0.
- Security Features and Enhancements: The following security features and enhancements help manage comprehensive data security and policy enforcement for CDP PvC Base:
  - Ranger KMS will now provide unified key management services for encryption in lieu of Key Trustee Server (KTS). Import of keys from KTS and NavEncrypt, and automated migration of NavEncrypt nodes from the old KTS server to the Ranger KMS server, will be supported so that data encryption and security are not compromised.
  - Knox HttpFS enhances perimeter security for accessing and transferring HDFS data.
  - Knox token authentication provides efficient and scalable user authentication using tokens that can be easily rolled, renewed, and revoked.
- SDX features: So that users and administrators continue to benefit from a shared data experience (SDX) and from improved platform and data governance, the following features are delivered in 7.1.9:
  - Ranger now supports high availability for Ranger Tag Sync and User Sync, so that if the default host fails, an additional host can take over.
  - Ranger User Sync now provides an option for customers to treat users/groups from multiple sync sources the same as updating group memberships.
  - The Ranger import/export enhancement provides an API for roles, which eliminates the need to manually create roles and associate them with import/export policies.
  - Atlas **audit aging** reduces existing audit data in the Atlas system based on end-user criteria and configuration settings that users can manage.
- Compliance and reporting features: Supporting CGI standards and FIPS certification is essential for regulated industries and government organizations. This release adds FIPS certification for the RHEL 8.8 (with JDK 8) platform to help customers protect their sensitive information and comply with their security requirements. Furthermore, FIPS support is extended to the Apache Phoenix, NavEncrypt, and Ranger KMS components.
- Upgrade Support Matrix: This release provides an enhanced user experience by reducing upgrade complexity. The one-step upgrade support matrix for CDP PvC Base 7.1.9 is shown in Figure 1 below.
Figure 1: CDP PvC Base 7.1.9 Release Upgrade Support Matrix
New Platform and Database Support
- Operating System Support: RHEL 9.1, Oracle UEK, SLES 15 SP4
- Python: Versions 3.8 and 3.9 are supported
Deprecation of Platforms, Databases and Features
- Operating System: Ubuntu 18.04
- Databases: PostgreSQL 10, MariaDB 10.3, MariaDB 10.2, Oracle 12, MySQL 5.6
- Other Features: Zeppelin, DAS, KTS