CDP PVC Base new features
If you are a CDH or an HDP user, review the new features available in CDP Private Cloud Base, in addition to the features carried over to CDP from the CDH and HDP releases.
CDH to CDP
- Ranger 2.0
  - Dynamic row filtering and column masking
  - Attribute-based access control and SparkSQL fine-grained access control
  - Sentry to Ranger migration tools
  - New Ranger Resource Mapping Server (RMS) provides HDFS ACL sync
- Atlas 2.0
  - Business Metadata support by providing entity model extensions
  - Bulk import of Business Metadata attribute associations and glossary terms
  - Enhanced basic search and filtered search
  - Multi-tenancy support and easier administration with enhanced UI
  - Data lineage and chain of custody
  - Advanced data discovery and business glossary
  - Navigator to Atlas migration
  - Improved performance and scalability
  - Integration of Ozone with Apache Atlas
- Hive 3
  - Hive-on-Tez for better ETL performance
  - Support for Atomicity, Consistency, Isolation, and Durability (ACID) transactions
  - Comprehensive ANSI 2016 SQL coverage
  - Major performance improvements
  - Query result cache
  - Surrogate keys
  - Materialized views
  - Scheduled queries, including automatic rebuilding of materialized views using SQL
  - Auto-translation for Spark-Hive reads, with no HWC session needed
  - Hive Warehouse Connector Spark direct reads
  - Authorization of external file writes from Spark
  - Improved CBO and vectorization coverage
- Ozone
  - 10x scalability of HDFS
  - Support for billions of objects and native S3 support
  - Support for dense data nodes
  - Fast restarts and easy maintenance
- HBase
  - HBase-Spark connector
  - Re-architected Medium Size Objects (MOB) for better compaction and performance
- Hue
  - Gateway-based SSO with Knox
  - Support for Ranger KMS-Key Trustee Integration
- Kudu
  - Fine-grained authorization with Ranger
  - Support for Knox
  - Operational enhancements with rolling restarts and automatic rebalancing
  - Numerous usability improvements
  - New data types such as DATE and VARCHAR, and support for HybridClock timestamps
- YARN
  - New YARN Queue Manager
  - Placement Rules enable you to submit jobs without specifying the queue name
  - Capacity Scheduler leverages Delay Scheduling to honor task locality constraints
  - Preemption allows applications of higher priority to preempt applications of lower priority
  - Same queue name under different hierarchies
  - Moving applications between queues
  - YARN Absolute Mode support
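
The Placement Rules item above can be illustrated with a small, purely conceptual sketch. It is not the YARN implementation or its configuration syntax: the `Job` type, the rule functions, the `place` helper, and the queue names are all hypothetical and chosen only to show the idea of trying rules in order until one yields an existing queue.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set


@dataclass(frozen=True)
class Job:
    """Hypothetical job submission; queue is optional on purpose."""
    user: str
    group: str
    queue: Optional[str] = None


# A rule maps a job to a candidate queue name, or None if it does not apply.
Rule = Callable[[Job], Optional[str]]


def specified_queue(job: Job) -> Optional[str]:
    # Honor an explicitly requested queue, if the submitter gave one.
    return job.queue


def user_queue(job: Job) -> Optional[str]:
    # Assumed naming scheme: one queue per user under root.users.
    return f"root.users.{job.user}"


def group_queue(job: Job) -> Optional[str]:
    # Assumed naming scheme: one queue per group under root.groups.
    return f"root.groups.{job.group}"


def place(job: Job, rules: Iterable[Rule], known_queues: Set[str],
          default: str = "root.default") -> str:
    """Return the first candidate queue that exists, else the default."""
    for rule in rules:
        candidate = rule(job)
        if candidate is not None and candidate in known_queues:
            return candidate
    return default


QUEUES = {"root.default", "root.users.alice", "root.groups.analysts"}
RULES = [specified_queue, user_queue, group_queue]

if __name__ == "__main__":
    for job in (Job("alice", "analysts"), Job("bob", "analysts"), Job("carol", "ops")):
        print(job.user, "->", place(job, RULES, QUEUES))
```

In this sketch the order of `RULES` decides precedence, so a submitter who names no queue still lands in a user queue, a group queue, or the default queue.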


- Flume to Cloudera DataFlow
- Navigator to Ranger/Atlas
- Sentry to Ranger
- Key Trustee KMS to Ranger KMS
- HSM KMS to Key HSM
- Hive-on-Spark/MR to Hive-on-Tez
- YARN Fair Scheduler to YARN Capacity Scheduler
- Spark 1.6 to Spark 2.4
- NavOpt to Workload XM
- Pig to Hive or Spark

HDP to CDP
- Cloudera Manager
  - Virtual private clusters
  - Automated wire encryption setup
  - Fine-grained role-based access control (RBAC) for administrators
  - Streamlined maintenance workflows
- Solr 8.4
  - Relevance-based text search over unstructured data (such as text, PDF, and JPG files)
- Impala
  - Better fit for data mart migration use cases (interactive, BI-style queries)
  - Ability to query high volumes of data ("big data") in large clusters
  - Distributed queries in a cluster environment, for convenient scaling
  - Integration with Kudu for fast data and Ranger for authorization policies
  - Fast BI queries enable using a single system for big data processing and analytics, so customers avoid costly modeling and ETL to add analytics onto the data lake
- Hue
  - Built-in SQL editor with intelligent query autocomplete
  - Share queries and chart results, and download results for any database
  - Easily search, glance at, and import datasets or jobs
- Kudu
  - Better ingest and query performance for fast-changing or updatable data
  - Reporting with update support through Kudu and Impala
  - Real-time and streaming applications with Kudu and Spark
  - Time-series analytics, event analytics, and real-time data warehousing
- YARN
  - Tools to transition to Capacity Scheduler
  - New YARN Queue Manager
  - Capacity Scheduler leverages Delay Scheduling to honor task locality constraints
  - Preemption allows applications of higher priority to preempt applications of lower priority
  - Same queue name under different hierarchies
  - Moving applications between queues
  - YARN Absolute Mode support
- Encryption
  - Auto-TLS automates all the steps required to enable TLS encryption
  - Ranger KMS integration with Key Trustee Server to provide additional key provider storage
  - Encryption at rest with Navigator Encrypt