New Features and Changes in Cloudera Manager 6.2.0
The following sections describe new and changed features for Cloudera Manager 6.2.0:
- Virtual Private Clusters - Separation of Compute and Storage services
- Ubuntu 18 Support
- Backup and Disaster Recovery (BDR)
- Hosts
- Installation
- Licensing
- Cloudera Manager API
- Kafka Configuration and Monitoring
- Hive Server 2
- delegation.token.master.key Generation
- New Warning for Hue Advanced Configuration Snippet
- Increased Default Value for dfs.client.block.write.locateFollowingBlock.retries configuration
- Support GPU Scheduling and Isolation for YARN
- Health Test for Erasure Coding Policies
- Disk Caching Configurations in Spark Service
- Decimal Support for Sqoop Clients
- TLS
- Apply Auto-TLS Configuration to Existing Services
- HTTP Strict Transport Security
- Support for TLS proto/ciphers in Custom Service Descriptors (CSD)
- Expose the configurations to use TLS encryption to the Hive Metastore Database on the Hive Metastore (Hive) Configurations Page
- Enable Auto-TLS Globally
- Kafka/Flume Auto-TLS enhancements
- License Enforcement - Auto TLS
- Custom certificates for Cloudera Manager Certificate Authority (CMCA)
Virtual Private Clusters - Separation of Compute and Storage services
A Virtual Private Cluster uses the Cloudera Shared Data Experience (SDX) to simplify deployment of both on-premise and cloud-based applications and enable workloads running in different clusters to securely and flexibly share data.
A new type of cluster, called a Compute cluster, is available in CDH 6.2. A Compute cluster runs computational services such as Impala, Spark, or YARN, but you configure these services to access data hosted in another Regular CDH cluster, called the Base cluster. With this architecture, you can separate compute and storage resources in a variety of ways to flexibly maximize resources.
Ubuntu 18 Support
Support for Ubuntu 18.04 has been added for Cloudera Manager and CDH 6.2 and higher.
Cloudera Issue: OPSAPS-48410
Backup and Disaster Recovery (BDR)
Hive Direct Replication to S3/ADLS Backed Cluster
BDR now supports Hive direct replication from on-premise to S3/ADLS clusters and metadata replication to the Hive Metastore.
Using a single replication process, BDR can pull Hive data from HDFS to S3/ADLS clusters and use the "Hive-on-cloud" mode, in which the target Hive Metastore updates the table locations to point to S3/ADLS clusters. This process simplifies data migration and synchronization between cloud and on-premise clusters.
For more information, see Hive/Impala Replication.
Replication to and from ADLS Gen2
You can now replicate HDFS files and Hive data to and from Microsoft ADLS Gen2. To use ADLS Gen2 as the source or destination, you must add Azure credentials to Cloudera Manager. Note that the URI format for ADLS Gen2 is not the same as ADLS Gen1. For ADLS Gen2 use the following URI format: abfs[s]://<file_system>@<account_name>.dfs.core.windows.net/<path>/.
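As an illustration of the URI format above, the following minimal Python sketch assembles an ADLS Gen2 URI from its parts. The function name, its parameters, and the sample values are hypothetical and are not part of Cloudera Manager.

```python
def adls_gen2_uri(file_system: str, account_name: str, path: str = "", secure: bool = False) -> str:
    """Build a URI of the form abfs[s]://<file_system>@<account_name>.dfs.core.windows.net/<path>/."""
    # "abfss" is the TLS variant of the scheme; "abfs" is the plain variant.
    scheme = "abfss" if secure else "abfs"
    # Normalize the path so the result has exactly one leading and one trailing slash.
    clean_path = path.strip("/")
    suffix = f"/{clean_path}/" if clean_path else "/"
    return f"{scheme}://{file_system}@{account_name}.dfs.core.windows.net{suffix}"


# Hypothetical file system, account, and path names.
print(adls_gen2_uri("myfs", "myaccount", "warehouse/sales", secure=True))
```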
Hosts
Duplicate Host Detection and Hostname Migration
Cloudera Manager now detects duplicate hosts and rejects their attempts to join a cluster. It also gracefully tolerates changes in hostnames for managed hosts, which better supports automated deployments.
Installation
Accumulo Initialization
An Initialize Accumulo checkbox now displays in the Installation wizard.
Cloudera Issue: OPSAPS-48619
JDBC URL for the Hive Metastore Database Connection
You can now specify a JDBC URL when establishing a connection from the Hive service to a supported backend database (MySQL, PostgreSQL, or OracleDB). Enter the JDBC URL on the Setup Database page in the Create Cluster and Create Service wizards in Cloudera Manager.
Cloudera Issue: OPSAPS-48668
Licensing
Start and Deactivation Dates for Cloudera Enterprise Licenses
Cloudera Enterprise licenses now include a start date and a deactivation date. Enterprise-only features are enabled on the start date and will be disabled after the deactivation date. If you install the license before the start date, a banner displays in the Cloudera Manager Admin console showing the number of days until the license becomes effective.
Cloudera Issue: OPSAPS-47500
Enhanced License Enforcement - Node Limit
When an Enterprise license expires, Cloudera Manager reverts to the Express version. This includes enforcing a maximum of 100 nodes across all CDH 6 clusters.
Cloudera Issue: OPSAPS-48611
Enhanced License Enforcement - Feature Availability
Features only available with a Cloudera Enterprise license are turned off after the deactivation date has passed. For legacy licenses that do not have a deactivation date, the features are turned off on the expiration date.
Cloudera Issue: OPSAPS-46864
Enhanced License Enforcement - KMS Configuration
Cloudera Manager does not allow KMS configuration changes after the deactivation date specified in the new license file, although the KMS remains functional. For legacy licenses, the deactivation date defaults to the expiration date specified in the license.
Cloudera Issue: OPSAPS-48501
Cloudera Manager API
Cross-Cluster Network Bandwidth Test
Cloudera Manager now has an API to test network bandwidth between clusters, helping determine if the infrastructure is suitable for separating storage and compute services.
API for Managing Expiring Cloudera Manager Sessions
A new Cloudera Manager API endpoint, /users/expireSessions/{UserName}, expires all of a particular user's active Cloudera Manager Admin Console sessions, whether local or external. The endpoint can be invoked by a user with the Full administrator or User administrator role. Please refer to the Cloudera Manager REST API documentation for more information.
Cloudera Issue: OPSAPS-43756
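The following Python sketch shows how a client might prepare a call to the session-expiry endpoint. Only the /users/expireSessions/{UserName} path comes from the text above. The host name, port, API version (v32), and the use of POST are assumptions, and authentication is omitted; check the Cloudera Manager REST API documentation before relying on any of them. The sketch builds the request but does not send it.

```python
import urllib.request
from urllib.parse import quote


def expire_sessions_url(base_url: str, api_version: str, user_name: str) -> str:
    """Return the URL of the session-expiry endpoint for one user."""
    # Percent-encode the user name so that spaces or slashes cannot alter the path.
    encoded_user = quote(user_name, safe="")
    return f"{base_url.rstrip('/')}/api/{api_version}/users/expireSessions/{encoded_user}"


def build_expire_sessions_request(base_url: str, api_version: str, user_name: str) -> urllib.request.Request:
    """Prepare (but do not send) the request. POST is an assumption, not confirmed by the release notes."""
    url = expire_sessions_url(base_url, api_version, user_name)
    return urllib.request.Request(url, data=b"", method="POST")


# Hypothetical host, port, API version, and user name.
request = build_expire_sessions_request("https://cm.example.com:7183", "v32", "jsmith")
print(request.get_method(), request.full_url)
```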
Service Type Information in the ApiServiceRef
The Cloudera Manager API endpoint ApiServiceRef now returns the service type. Please refer to the Cloudera Manager REST API documentation for more information.
Cloudera Issue: OPSAPS-48369
API to Emit All Features Available
{
  "owner" : "John Smith",
  "uuid" : "12c8052f-d78f-4a8e-bba4-a55a2d141fcc",
  "features" : [
    { "name" : "PEERS", "description" : "Peers" },
    { "name" : "BDR", "description" : "BDR" },
    { "name" : "KERBEROS", "description" : "Kerberos" },
    . . .
Please refer to the Cloudera Manager REST API documentation for more information.
Cloudera Issue: OPSAPS-49060
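A client could read the feature list from a response shaped like the fragment above. The Python sketch below parses a sample payload and extracts the feature names. The sample is a closed-off copy of the fragment that keeps only the three entries shown; the real response lists more features, and the function name is hypothetical.

```python
import json
from typing import List

# Sample payload limited to the three features visible in the fragment above.
SAMPLE_RESPONSE = """
{
  "owner": "John Smith",
  "uuid": "12c8052f-d78f-4a8e-bba4-a55a2d141fcc",
  "features": [
    {"name": "PEERS", "description": "Peers"},
    {"name": "BDR", "description": "BDR"},
    {"name": "KERBEROS", "description": "Kerberos"}
  ]
}
"""


def feature_names(payload: str) -> List[str]:
    """Return the "name" of every entry in the "features" array."""
    document = json.loads(payload)
    # Treat a missing "features" key as an empty list.
    return [feature["name"] for feature in document.get("features", [])]


print(feature_names(SAMPLE_RESPONSE))
```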
New Name Attribute for ApiAuthRole
ApiAuthRole entities can now be specified and looked up with a name string for the role, as specified in the API documentation. Please refer to the Cloudera Manager REST API documentation for more information.
Cloudera Issue: OPSAPS-46780
Kafka Configuration and Monitoring
New Kafka Metrics
- kafka_topic_unclean_leader_election_enable_rate_and_time_ms
- kafka_incremental_fetch_session_evictions_rate
- kafka_num_incremental_fetch_partitions_cached
- kafka_num_incremental_fetch_sessions
- kafka_groups_completing_rebalance
- kafka_groups_dead
- kafka_groups_empty
- kafka_groups_preparing_rebalance
- kafka_groups_stable
- kafka_zookeeper_request_latency
- kafka_zookeeper_auth_failures
- kafka_zookeeper_disconnects
- kafka_zookeeper_expires
- kafka_zookeeper_read_only_connects
- kafka_zookeeper_sasl_authentications
- kafka_zookeeper_sync_connects
The following metric is deprecated: kafka_responses_being_sent
Cloudera Issue: OPSAPS-48911, OPSAPS-48798, OPSAPS-48311, OPSAPS-48656
Kafka Broker ID Display
Kafka Broker IDs are now displayed on the Kafka Instances page in Cloudera Manager.
Cloudera Issue: OPSAPS-44331
Kafka Topics in the diagnostic bundle
- kafka-topics --describe
- kafka-topics --list
Cloudera Issue: OPSAPS-36755
Kafka Configuration Properties for Delegation Tokens
- delegation.token.max.lifetime.ms
The maximum lifetime of the token, after which it can no longer be renewed. Default value 7 days.
- delegation.token.expiry.time.ms
The token validity time in milliseconds before the token needs to be renewed. Default value 1 day.
Cloudera Issue: OPSAPS-47051
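Both property names end in .ms, so their values are presumably expressed in milliseconds. The short Python sketch below converts the documented defaults (7 days and 1 day) into millisecond values. The helper and constant names are hypothetical.

```python
from datetime import timedelta


def to_millis(delta: timedelta) -> int:
    """Convert a timedelta to whole milliseconds."""
    return int(delta.total_seconds() * 1000)


# Documented defaults: 7 days maximum lifetime, 1 day before renewal is needed.
MAX_LIFETIME_MS = to_millis(timedelta(days=7))
EXPIRY_TIME_MS = to_millis(timedelta(days=1))

print(f"delegation.token.max.lifetime.ms={MAX_LIFETIME_MS}")  # 604800000
print(f"delegation.token.expiry.time.ms={EXPIRY_TIME_MS}")    # 86400000
```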
Enhanced Security for Kafka in Zookeeper with ACLs
A new script, zookeeper-security-migration.sh, is now available to lock down Kafka data in Zookeeper. See Kafka Security Hardening with Zookeeper ACLs.
Cloudera Issue: OPSAPS-47988
Hive Server 2
New Graph for the Compilation Metrics
A new graph, Operations Awaiting Compilation, has been added for HiveServer2 compilation metrics.
Cloudera Issue: OPSAPS-47506
Secured ADLS Credentials for HS2
ADLS credentials are now stored securely via Cloudera Manager for use with HS2. This enables multi-user Hive-with-ADLS clusters.
Learn more at Configuring ADLS Access Using Cloudera Manager.
Cloudera Issue: OPSAPS-49076
Secured S3 Credentials for HS2 on S3
S3 credentials are now stored securely by Cloudera Manager for use with Hive. This enables multi-user Hive-on-S3 clusters.
Learn more at Configuring the Amazon S3 Connector.
The following sub-tasks are related to this feature:
- Distribute the path of the HDFS credential store file and decryption password to HS2
Adds job credstore path and decryption password propagation for HS2.
Cloudera Issue: OPSAPS-48662
- Manage an encrypted credential store in HDFS for HS2
Adds a job-specific credstore for HS2.
Cloudera Issue: OPSAPS-48661
- Rotate the password and the encrypted credential file in HDFS on every HS2 restart
Adds password and credstore file rotation on every HS2 role restart.
Cloudera Issue: OPSAPS-48663
delegation.token.master.key Generation
delegation.token.master.key is now automatically generated by Cloudera Manager.
Cloudera Issue: OPSAPS-48525
New Warning for Hue Advanced Configuration Snippet
Warnings are emitted if the values for Hue Service Advanced Configuration Snippet or Hue Server Advanced Configuration Snippet are not formatted properly, for example, if a value does not contain a configuration section like [desktop].
Cloudera Issue: OPSAPS-27606
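To illustrate the kind of formatting problem described above, the following Python sketch checks whether a snippet contains at least one section header such as [desktop]. This is not the validation that Cloudera Manager performs; it is a simplified stand-in, and the function name and sample snippets are hypothetical.

```python
import re

# Matches a line consisting only of a section header, such as [desktop] or [[database]].
SECTION_HEADER = re.compile(r"^[ \t]*\[+[^\[\]\n]+\]+[ \t]*$", re.MULTILINE)


def has_section_header(snippet: str) -> bool:
    """Return True if the snippet contains at least one section header line."""
    return SECTION_HEADER.search(snippet) is not None


# Hypothetical snippets: the first has a section, the second is a bare key=value pair.
well_formed = "[desktop]\napp_blacklist=search\n"
missing_section = "app_blacklist=search\n"

print(has_section_header(well_formed))      # True
print(has_section_header(missing_section))  # False
```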
Increased Default Value for dfs.client.block.write.locateFollowingBlock.retries configuration
The default value of the HDFS configuration property dfs.client.block.write.locateFollowingBlock.retries has been changed from 5 to 7.
Cloudera Issue: OPSAPS-48170
Support GPU Scheduling and Isolation for YARN
Added support for using GPUs in YARN applications and for custom YARN resource types.
Cloudera Issue: OPSAPS-48685
Health Test for Erasure Coding Policies
A new Verify Erasure Coding Policies For Cluster Topology health test has been introduced. The health test fails with a yellow status if there are not enough data nodes or racks to support all enabled erasure coding policies.
Cloudera Issue: OPSAPS-48526
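The DataNode part of such a check can be sketched in Python as follows. The sketch assumes that a policy needs at least as many DataNodes as its data units plus parity units, and that policy names follow the HDFS pattern codec-data-parity-cellsize (for example, RS-6-3-1024k). It does not model the rack requirement and is not the logic Cloudera Manager actually uses; the function names are hypothetical.

```python
from typing import Iterable, List


def min_datanodes(policy_name: str) -> int:
    """Return data units + parity units parsed from a name such as RS-6-3-1024k."""
    parts = policy_name.split("-")
    # The last field is the cell size; the two fields before it are data and parity units.
    data_units, parity_units = int(parts[-3]), int(parts[-2])
    return data_units + parity_units


def unsupported_policies(enabled: Iterable[str], datanode_count: int) -> List[str]:
    """Return the enabled policies that need more DataNodes than the cluster has."""
    return [policy for policy in enabled if min_datanodes(policy) > datanode_count]


# Hypothetical cluster with 9 DataNodes and three enabled policies.
print(unsupported_policies(["RS-3-2-1024k", "RS-6-3-1024k", "RS-10-4-1024k"], 9))
```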
Disk Caching Configurations in Spark Service
Disk caching for the Spark History Server can now be enabled from Cloudera Manager.
Cloudera Issue: OPSAPS-48385
Decimal Support for Sqoop Clients
- Setting the following property to enable decimal support in Avro: sqoop.avro.logical_types.decimal.enable=true
- Setting the following properties to enable decimal support in Parquet:
sqoop.parquet.logical_types.decimal.enable=true
parquetjob.configurator.implementation=hadoop
Please note that changing any of these properties might break existing Sqoop jobs, or alter their output in a way that disrupts consumers further down the chain.
Cloudera Issue: OPSAPS-48938
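If these properties are passed on the Sqoop command line rather than through Cloudera Manager, they are typically supplied as -D generic arguments. The Python sketch below renders the properties listed above in that form. The helper name is hypothetical, and how the arguments are combined with a particular Sqoop command is left to the reader.

```python
from typing import Dict, List

# Properties taken from the list above.
AVRO_DECIMAL = {"sqoop.avro.logical_types.decimal.enable": "true"}
PARQUET_DECIMAL = {
    "sqoop.parquet.logical_types.decimal.enable": "true",
    "parquetjob.configurator.implementation": "hadoop",
}


def generic_args(properties: Dict[str, str]) -> List[str]:
    """Render each property as a -Dkey=value argument, preserving insertion order."""
    return [f"-D{key}={value}" for key, value in properties.items()]


print(" ".join(generic_args(AVRO_DECIMAL)))
print(" ".join(generic_args(PARQUET_DECIMAL)))
```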
TLS
Apply Auto-TLS Configuration to Existing Services
You can now use Auto-TLS to add TLS to an existing cluster. This functionality is available both in the Cloudera Manager Admin Console and through the API. See Configuring TLS Encryption for Cloudera Manager and CDH Using Auto-TLS.
A new cluster-level Cloudera Manager API command, ConfigureAutoTlsServices, enables Auto-TLS for services in a single cluster. Please refer to the Cloudera Manager REST API documentation for more information.
Cloudera Issue: OPSAPS-47349
HTTP Strict Transport Security
When TLS is enabled for the Cloudera Manager Admin Console, responses to web requests now include the HTTP Strict-Transport-Security header. For more details about this header, see Strict-Transport-Security (Mozilla).
Cloudera Issue: OPSAPS-282290
Support for TLS proto/ciphers in Custom Service Descriptors (CSD)
Added the ability to specify the TLS protocol and the TLS cipher suites in CSDs.
Cloudera Issue: OPSAPS-48214
Expose the configurations to use TLS encryption to the Hive Metastore Database on the Hive Metastore (Hive) Configurations Page
Exposes properties that can be used to configure TLS from the Hive Metastore Server to the Hive Metastore Database. As a minimum configuration requirement, the Enable TLS/SSL to the Hive Metastore Database checkbox must be selected. (It is cleared by default.) If the Hive Metastore TLS/SSL Client Truststore properties are provided, they are used. Otherwise, the default list of well-known certificate authorities is used. Additionally, you can now provide a JDBC URL override to use when connecting to the database. This override replaces all other values used to create the JDBC URL. It is an advanced configuration option and should only be used as a safety valve.
Cloudera Issue: OPSAPS-48666
Enable Auto-TLS Globally
There is now a Cloudera Manager API command, GenerateCmcaCommand, which will enable Auto-TLS for an existing Cloudera Manager deployment. This command creates an internal Cloudera Manager Certificate Authority (CMCA) and certificates for all existing hosts. Please refer to the Cloudera Manager REST API documentation for more information.
Cloudera Issue: OPSAPS-43102
Kafka/Flume Auto-TLS enhancements
Flume now supports Auto-TLS when used with Kafka.
Cloudera Issue: OPSAPS-46339
License Enforcement - Auto TLS
Auto-TLS is not available when using a Trial license. To enable Auto-TLS, you must have an Enterprise license.
Cloudera Issue: OPSAPS-48981
Custom certificates for Cloudera Manager Certificate Authority (CMCA)
When using Auto-TLS with custom certificates, you can use the new AddCustomCerts command to add certificates associated with a hostname to the Auto-TLS certificate database. Please refer to the Cloudera Manager REST API documentation for more information.
Cloudera Issue: OPSAPS-48678