CDP Private Cloud Base 7.1.8 Release Summary
Cloudera is pleased to announce the release of Cloudera Data Platform (CDP) Private Cloud (PvC) Base version 7.1.8 and Cloudera Manager version 7.7.1. This release introduces several new features focused on enterprise readiness, improved storage efficiency with Ozone, new features for Impala, Hive performance improvements, Operations Database rebase, Governance enhancements with HDFS/Schema Registry lineage, Ranger RMS improvements and comprehensive enhancements to our streaming platform.
The full list of new features is as follows:
Platform Support Enhancements
-
New OS Support
- CDP Private Cloud Base now supports RHEL/OEL 8.6 for x86
-
New DB Support
- CDP Private Cloud Base now supports MariaDB 10.6
General Feature Enhancements
-
Cloudera Manager
-
Better support for enterprise readiness with active-passive High Availability
-
Cloudera Manager now supports HDFS-less clusters
-
Strong security with support for credential management using secure Embedded/Vault storage, with the option to enable or disable it
-
Easier install and upgrade as user/group creation is decoupled from applying permissions on parcels
-
Improved alerting capability with the ability to send alerts to different users based on severity
-
Easier operations by eliminating SMON restarts when a metric filter is changed
-
-
SDX Enhancements
-
Support for mapping Sentry Privileges to Ranger Policies from CDH 5.x/6.x to CDP for Solr
-
Support for coarse URI checking in the Hive Ranger Plugin provides significant performance gains for Hive URL locations with a large number of folders and files
-
New usersync config in Ranger provides a simpler way to maintain LDAP groups
-
Ranger Ozone integration plugin now supports recursive ACL checks for subpaths and provides multi-tenant support
-
Atlas now supports tracking of HDFS lineage when data moves from one directory in HDFS to another
-
Ranger RMS now fully supports grants at the database level. Grants made at the Hive database level, along with the corresponding HDFS permissions, are now propagated to the database directory and to all tables and partitions under it.
-
-
Data Warehouse Enhancements
-
Impala can now read full ACID tables and can identify and invalidate the catalog cache when an ACID table gets compacted. This allows customers to use Hive ACID tables with Impala and provides better support for BI use cases.
-
Impala now supports complex types with multiple UNNEST()s in the select list, as well as arrays in the select list and in views. This allows workloads to be transitioned from other similar systems.
-
Impala now supports fine-grained table refreshing on partition-level events for transactional tables, which improves performance. A new config, incremental_refresh_acid, switches the fine-grained table refreshing on or off.
-
Impala now supports improved self-event detection for the Events Processor, thereby providing better consistency and reliability for processing catalog changes
-
Impala now extends support for non-ASCII UTF-8 characters, so string functions such as length return results that are consistent with Hive.
-
Impala now supports Kudu “ignore” operations such as INSERT_IGNORE, UPDATE_IGNORE, and DELETE_IGNORE to handle cases where users want to ignore primary key errors, thereby providing a better user experience.
-
Hive now provides better operations management with rolling restart support for HMS during rolling upgrades
-
Hive now provides ACID improvements that include Hive ACID compaction observability, Hive3/Acid MERGE INSERT MAPPING on Hive 3, and additional fixes to improve operations, such as:
- Faster CREATE TABLE (no-rename CTAS)
- Split update always & skip sorting insert rows
- Speed up Drop Table / Drop Partition
- Reduce HMS workload for ACID with improved read locks
- Faster Sequence numbers
- Internal Table naming convention (TBL_ID or name)
- Cleaner operations that eliminate single transaction locks
-
Kudu now fully supports transparent data encryption through integration with Ranger KMS
-
Kudu now allows the number of hash buckets per range partition to be changed, both at table creation and at alteration time, thereby increasing write throughput and performance
-
-
Self Service Analytics
-
Hue now supports Spark SQL and combined with the autocomplete feature this now provides an easy interface for executing Spark SQL
-
Hue provides a SQL interface to Phoenix making it easy to query for operational insights
-
Hue now provides a native Query Processor that can index and read Hive query history, which provides DAS features in Hue
-
-
Operational Database
-
HBase is rebased to 2.4.6 with support for replication of Meta region replicas, enhanced region normalizers and storefile tracking.
-
HBase now supports MCC (multi-cluster client support), which allows switching between a single HBase cluster client and the Multi-HBase Client with limited code changes
-
-
Platform Enhancements
-
Platform now supports Oracle 19c RAC for backend HA
-
Ozone now supports a balancer that allows data to be distributed evenly across all nodes
-
Ozone now supports atomic operations that allow easy handling of nested directories and subdirectories, thereby greatly increasing the performance of deletes, moves and renames.
-
Ozone now supports erasure coding, which provides better TCO and a storage space reduction of at least 1.3x
-
HDFS now adds support for running multiple standby NameNodes, which provides additional fault-tolerance
-
Ozone now integrates with CDP Replication Manager, thereby providing replication and backup of data between Ozone clusters.
-
Ozone now supports S3 Multi-tenancy that allows isolation of buckets and volumes for S3 use cases
-
Replication Manager now supports Hive ACID table replication capability for disaster recovery.
-
YARN compatibility enhancements
- Dynamic Queue Scheduling feature enables customers to dynamically change queue resource allocations
-
-
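The storage saving from erasure coding mentioned above can be estimated with simple arithmetic. The sketch below compares the raw storage needed by a Reed-Solomon layout with that of plain replication. The specific layouts (3+2, 6+3, 10+4) and the 3x replication baseline are assumptions chosen for illustration, not figures taken from this release summary, and the function names are hypothetical.

```python
from fractions import Fraction


def storage_overhead(data_blocks: int, parity_blocks: int) -> Fraction:
    """Raw bytes stored per byte of user data for a (data, parity) layout."""
    if data_blocks <= 0 or parity_blocks < 0:
        raise ValueError("data_blocks must be > 0 and parity_blocks >= 0")
    return Fraction(data_blocks + parity_blocks, data_blocks)


def reduction_vs_replication(
    data_blocks: int, parity_blocks: int, replicas: int = 3
) -> Fraction:
    """How many times less raw storage the layout needs than N-way replication."""
    return Fraction(replicas) / storage_overhead(data_blocks, parity_blocks)


if __name__ == "__main__":
    # Assumed example layouts; check the Ozone documentation for supported ones.
    for data, parity in [(3, 2), (6, 3), (10, 4)]:
        overhead = storage_overhead(data, parity)
        saving = reduction_vs_replication(data, parity)
        print(
            f"RS({data},{parity}): overhead {float(overhead):.2f}x, "
            f"{float(saving):.2f}x less raw storage than 3x replication"
        )
```

Under these assumptions, every layout shown needs less raw storage than 3x replication while still tolerating the loss of as many blocks as it has parity blocks.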
Streaming Enhancements
-
Kafka
- Kafka upgraded to 3.X
- Authentication against Kafka Brokers can now be done using OAuth workflows that support backends such as OpenID Connect
- Kafka rolling restarts can be done in multiple ways designed to provide options to ensure continued cluster operations by checking the health of the Broker after restarting it: Forceful restart, Serving Requests, Min-ISR meet, All Partitions Healthy
-
Streams Messaging Manager (SMM)
- Consumer Group Offsets can be changed in the SMM UI without needing to go to the CLI
- Multiple Replication Targets can now be seen in the SMM Replications tab
- The amount of data returned in the Data Explorer has been optimized to improve performance with large messages
- Improved UI performance by optimizing partition information and not auto-loading metrics in all cases
-
Streams Replication Manager (SRM)
- SRM can now collect metrics from multiple targets, such as bi-directional replications, to be displayed on the SMM Replications tab
- SRM can perform a remote query to collect replication metrics from disjoint clusters which can be displayed on the SMM Replications tab
- Additional Metrics have been added to the SRM REST API
-
Schema Registry
- JSON-based schemas are now available
- Native Import/Export capabilities over the REST API have been added, allowing for backup/restore actions and for Schema Registries to be synced between environments utilizing different backends
- The Schema Registry default compatibility can now be changed to an option other than Backward
- Authentication to the Schema Registry can now be done using OAuth workflows that support authentication backends such as OpenID Connect
- Topic Schemas can now be viewed directly in Atlas
-
Cruise Control
- Cruise Control upgraded to version 2.5
-
KConnect
- Stateless NiFi KConnector allowing for NiFi flows to be run in KConnect
- KConnect Enterprise Security enhancements for Authorization, Authentication, Secure Secrets Storage, and Ranger integration
- New KConnectors
- CDC Debezium KConnectors for PostgreSQL, MySQL, SQL Server, DB2, and Oracle
-
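The Kafka rolling restart options listed earlier (Serving Requests, Min-ISR met, All Partitions Healthy) differ in how strict the post-restart health check is. The sketch below illustrates one plausible reading of the two partition-based checks. The class, field, and function names are hypothetical, and the interpretation of each check is an assumption; this is not how Cloudera Manager implements them.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PartitionState:
    """Hypothetical snapshot of one partition after a broker restart."""
    topic: str
    partition: int
    replica_count: int   # replicas assigned to the partition
    in_sync_count: int   # replicas currently in the ISR
    min_isr: int         # the topic's min.insync.replicas setting


def min_isr_met(partitions: Iterable[PartitionState]) -> bool:
    """Assumed meaning: every partition has at least min.insync.replicas in sync."""
    return all(p.in_sync_count >= p.min_isr for p in partitions)


def all_partitions_healthy(partitions: Iterable[PartitionState]) -> bool:
    """Assumed meaning: every replica of every partition is back in sync."""
    return all(p.in_sync_count == p.replica_count for p in partitions)


def safe_to_continue(partitions: Iterable[PartitionState], strict: bool) -> bool:
    """Decide whether the next broker may be restarted under the chosen check."""
    parts = list(partitions)
    return all_partitions_healthy(parts) if strict else min_isr_met(parts)


if __name__ == "__main__":
    state = [
        PartitionState("orders", 0, replica_count=3, in_sync_count=2, min_isr=2),
        PartitionState("orders", 1, replica_count=3, in_sync_count=3, min_isr=2),
    ]
    print("min ISR met:", min_isr_met(state))
    print("all healthy:", all_partitions_healthy(state))
```

The stricter check makes a rolling restart slower, because each broker must fully catch up before the next one is restarted, but it leaves less room for under-replicated partitions during the operation.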