New Features and Changes in Cloudera Manager 5
The following sections describe what's new and changed in each Cloudera Manager 5 release.
What's New in Cloudera Manager 5
The following sections describe what is new in each Cloudera Manager 5 release.
- What's New in Cloudera Manager 5.16.2
- What's New in Cloudera Manager 5.16.1
- What's New in Cloudera Manager 5.15.2
- What's New in Cloudera Manager 5.15.1
- What's New in Cloudera Manager 5.15.0
- What's New in Cloudera Manager 5.14.4
- What's New in Cloudera Manager 5.14.3
- What's New in Cloudera Manager 5.14.2
- What's New in Cloudera Manager 5.14.1
- What's New in Cloudera Manager 5.14.0
- What's New in Cloudera Manager 5.13.3
- What's New in Cloudera Manager 5.13.2
- What's New in Cloudera Manager 5.13.1
- What's New in Cloudera Manager 5.13.0
- What's New in Cloudera Manager 5.12.2
- What's New in Cloudera Manager 5.12.1
- What's New in Cloudera Manager 5.12.0
- What's New in Cloudera Manager 5.11.2
- What's New in Cloudera Manager 5.11.1
- What's New in Cloudera Manager 5.11.0
- What's New in Cloudera Manager 5.10.2
- What's New in Cloudera Manager 5.10.1
- What's New in Cloudera Manager 5.10.0
- What's New in Cloudera Manager 5.9.3
- What's New in Cloudera Manager 5.9.2
- What's New in Cloudera Manager 5.9.1
- What's New in Cloudera Manager 5.9.0
- What's New in Cloudera Manager 5.8.5
- What's New in Cloudera Manager 5.8.4
- What's New in Cloudera Manager 5.8.3
- What's New in Cloudera Manager 5.8.2
- What's New in Cloudera Manager 5.8.1
- What's New in Cloudera Manager 5.8.0
- What's New in Cloudera Manager 5.7.6
- What's New in Cloudera Manager 5.7.2
- What's New in Cloudera Manager 5.7.1
- What's New in Cloudera Manager 5.7.0
- What's New in Cloudera Manager 5.6.1
- What's New in Cloudera Manager 5.6.0
- What's New in Cloudera Manager 5.5.4
- What's New in Cloudera Manager 5.5.3
- What's New in Cloudera Manager 5.5.2
- What's New in Cloudera Manager 5.5.1
- What's New in Cloudera Manager 5.5.0
- What's New in Cloudera Manager 5.4.10
- What's New in Cloudera Manager 5.4.9
- What's New in Cloudera Manager 5.4.8
- What's New in Cloudera Manager 5.4.7
- What's New in Cloudera Manager 5.4.6
- What's New in Cloudera Manager 5.4.5
- What's New in Cloudera Manager 5.4.3
- What's New in Cloudera Manager 5.4.1
- What's New in Cloudera Manager 5.4.0
- What's New in Cloudera Manager 5.3.10
- What's New in Cloudera Manager 5.3.9
- What's New in Cloudera Manager 5.3.8
- What's New in Cloudera Manager 5.3.7
- What's New in Cloudera Manager 5.3.6
- What's New in Cloudera Manager 5.3.4
- What's New in Cloudera Manager 5.3.3
- What's New in Cloudera Manager 5.3.2
- What's New in Cloudera Manager 5.3.1
- What's New in Cloudera Manager 5.3.0
- What's New in Cloudera Manager 5.2.7
- What's New in Cloudera Manager 5.2.6
- What's New in Cloudera Manager 5.2.5
- What's New in Cloudera Manager 5.2.4
- What's New in Cloudera Manager 5.2.2
- What's New in Cloudera Manager 5.2.1
- What's New in Cloudera Manager 5.2.0
- What's New in Cloudera Manager 5.1.6
- What's New in Cloudera Manager 5.1.5
- What's New in Cloudera Manager 5.1.4
- What's New in Cloudera Manager 5.1.3
- What's New in Cloudera Manager 5.1.2
- What's New in Cloudera Manager 5.1.1
- What's New in Cloudera Manager 5.1.0
- What's New in Cloudera Manager 5.0.7
- What's New in Cloudera Manager 5.0.6
- What's New in Cloudera Manager 5.0.5
- What's New in Cloudera Manager 5.0.2
- What's New in Cloudera Manager 5.0.1
- What's New in Cloudera Manager 5.0.0
- What's New in Cloudera Manager 5.0.0 Beta 2
- What's New in Cloudera Manager 5.0.0 Beta 1
What's New in Cloudera Manager 5.16.2
Cloudera Manager 5.16.2 is a maintenance release with several fixed issues. See Issues Fixed in Cloudera Manager 5.16.2.
What's New in Cloudera Manager 5.16.1
Cloudera Manager 5.16.1 is the first release of Cloudera Manager 5.16 and also has several fixed issues. See Issues Fixed in Cloudera Manager 5.16.1.
OpenJDK Support
OpenJDK is now supported with Cloudera Manager and CDH 5.16. For information about migrating your Oracle JDK to OpenJDK, see Upgrading the JDK. To use OpenJDK in a new deployment, see Install Java Development Kit.
Cloudera Bug: CDH-60334
Backup and Disaster Recovery (BDR) Log retention
You can now configure the number of days Cloudera Manager retains BDR logs with the Backup and Disaster Log Retention property. For more information, see BDR Log Retention.
Impala
Revised Impala charts library
The Impala chart library has been updated to include more meaningful metrics and remove rarely used plots.
Cloudera Bug: OPSAPS-46020
Impala Daemon's JVM heap size can now be configured
The Impala Daemon's JVM heap size can now be configured using the Java Heap Size of Impala Daemon in Bytes configuration property. The property defaults to 4 GB and, like all memory parameters, may require tuning.
Cloudera Bug: OPSAPS-41238
New Impala Alert when reaching maximum client connections
A new health check has been added to report and alert when the Impala Daemon is reaching the maximum capacity of concurrent client connections. The Impala Daemon Max Client Connections configuration parameter has been added to configure this value.
Cloudera Bug: OPSAPS-46025
New Impala Idle Query Timeout and Idle Session Timeout configuration properties
The following configuration properties have been added for Impala: Idle Query Timeout and Idle Session Timeout.
Cloudera Bug: OPSAPS-38917
Impala Assignment Locality health test removed from Cloudera Manager
The Impala Assignment Locality health test has been removed.
Cloudera Bug: OPSAPS-46807
The Solr collection statistics page was not shown for Auditor or Dashboard user roles
A Cloudera Manager Dashboard or Audit user can now see the Solr Collection Statistics and the HBase Table Statistics page.
Cloudera Bug: OPSAPS-45238
New Kafka Health Tests
Two new Kafka Broker Health Tests have been added to Cloudera Manager: Kafka Broker Swap Memory Usage and Kafka Broker Unexpected Exits. These health tests are enabled by default when Kafka is running as a service in Cloudera Manager version 5.14 and later. See Kafka Broker Health Tests.
Cloudera Bug: OPSAPS-45002
Sentry Object Ownership configurations in Cloudera Manager
A new configuration parameter has been added to the Sentry Configuration page that enables Sentry OWNER privileges (this parameter is disabled by default). For more information, see Setting Object Owner Privileges in Cloudera Manager.
Cloudera Bug: OPSAPS-47434
Downgrade to Cloudera Express disabled when using HDFS encryption at rest
The Downgrade License dialog box now displays a warning if the installation uses HDFS encryption with a KMS Key Trustee Server, and the downgrade is disabled in this case. Users are instructed to convert their data to an unencrypted format before proceeding.
Cloudera Bug: OPSAPS-44373
ZooKeeper configuration changes
The Enable Kerberos Authentication and Enable Server to Server SASL Authentication settings in ZooKeeper are now linked together. If either parameter is changed to on or off, the other parameter will automatically change to the same value. A warning message has been added to indicate the effect.
Cloudera Bug: OPSAPS-46628
What's New in Cloudera Manager 5.15.2
Cloudera Manager 5.15.2 is a maintenance release with several fixed issues. See Issues Fixed in Cloudera Manager 5.15.2.
What's New in Cloudera Manager 5.15.1
Cloudera Manager 5.15.1 is a maintenance release with several fixed issues. See Issues Fixed in Cloudera Manager 5.15.1.
Dynamic Resource Pools UI
- root.primaryGroup.username
- root.secondaryExistingGroup.username
- root.[pool name].username.
Previously, only the create="true|false" flag could be added to the inner element of the nestedUserQueue element. This meant that a root.primaryGroup or root.secondaryExistingGroup pool could be created, which was not correct. Now, you can add the create="true|false" flag to the actual nestedUserQueue element as well as the inner element of the nestedUserQueue element. An additional restriction is that if root.<parent>.username should use an existing pool (create = false), then root.<parent> must also use an existing pool.
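The restriction on the two create flags can be expressed as a small check. The following is a minimal sketch, not Cloudera Manager's own validation code. It assumes the Fair Scheduler allocation-file syntax in which a nestedUserQueue rule wraps one inner rule, and it assumes that an omitted create attribute means create="true". The function name nested_user_queue_is_valid is hypothetical.

```python
import xml.etree.ElementTree as ET


def nested_user_queue_is_valid(rule_xml: str) -> bool:
    """Check the create-flag restriction on a nestedUserQueue rule."""
    outer = ET.fromstring(rule_xml)
    inner = outer.find("rule")
    if outer.get("name") != "nestedUserQueue" or inner is None:
        raise ValueError("expected a nestedUserQueue rule with one inner rule")

    def create_flag(element) -> bool:
        # Assumption: an omitted create attribute means create="true".
        return element.get("create", "true").strip().lower() == "true"

    # If root.<parent>.username must already exist (outer create="false"),
    # then root.<parent> must also already exist (inner create="false").
    return create_flag(outer) or not create_flag(inner)


existing_only = (
    '<rule name="nestedUserQueue" create="false">'
    '<rule name="primaryGroup" create="false"/></rule>'
)
conflicting = (
    '<rule name="nestedUserQueue" create="false">'
    '<rule name="primaryGroup" create="true"/></rule>'
)
print(nested_user_queue_is_valid(existing_only))
print(nested_user_queue_is_valid(conflicting))
```

The first rule passes because both levels use existing pools only; the second fails because the parent pool could be created while the user pool could not.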
New fields in Cloudera Manager Python API for BDR
- raiseSnapshotDiffFailures to ApiHdfsReplicationArguments
- remainingTime, throughput, and estimatedCompletionTime to ApiHdfsCloudReplicationArguments
- numThreads, runInvalidateMetadata to ApiHiveReplicationArguments
- displayName, description to ApiReplicationSchedule
These fields were already exposed in the Java Cloudera Manager API.
What's New in Cloudera Manager 5.15.0
- Backup and Disaster Recovery (BDR)
- ADLS - You can now replicate HDFS files and Hive data to and from Microsoft ADLS. To use ADLS as the source or destination, you must add Azure credentials to Cloudera Manager.
Note that the lowest supported version for Cloudera Manager and CDH for BDR replication to/from ADLS is version 5.15.0.
- Metrics - The amount of data read and written through Amazon S3 and Microsoft ADLS storage by MapReduce jobs can now be viewed as cluster metrics, such as s3a_bytes_read and adl_bytes_written.
- Multi-threaded import and export for Hive Replication - You can now configure the number of threads used for import and export during Hive Replication.
When you create or modify a Hive Replication schedule, configure the Number of concurrent HMS connections on the Advanced tab.
Increasing the number of threads can improve BDR performance. By default, any new replication schedules will use 5 connections.
If you set the value to 1 or more, BDR uses multi-threading with the number of connections specified.
If you set the value to 0 or fewer, BDR uses single threading and a single connection.
The source and destination clusters must both run Cloudera Manager 5.15.0 or higher in order to use this feature.
- Security - To improve security, BDR now uses an encrypted Hadoop credential store to authenticate with cloud providers, such as Amazon S3 or Microsoft ADLS, when backing up or restoring HDFS or Hive data.
- Statistics - Hive Replication phases now show the number of Hive objects found and processed. Each type of Hive object is represented separately: databases, tables, indexes, functions, partitions, and column statistics. This information can be used to determine how many objects are replicated in each run and to estimate how long it will take for Hive Replication to complete.
- Snapshot diff-based replication - This feature compares two HDFS snapshots to reduce the number of files scanned during the copy-listing phase of replication. It can speed up replication when a large number of files are unchanged between replications.
You must enable immutable snapshots for HDFS to use snapshot diff-based replication for BDR.
This feature is on by default. You can configure replications to abort on snapshot diff failure when you create or edit a replication schedule.
See the following pages for guidelines on using snapshot diff-based replication: guidelines for Hive replication and guidelines for HDFS replication.
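The rule for the Number of concurrent HMS connections setting in Hive Replication can be summarized in a few lines. This is only an illustrative sketch of the behavior described in these notes; the function name hive_replication_mode and the returned labels are hypothetical and are not part of BDR.

```python
DEFAULT_HMS_CONNECTIONS = 5  # default for new replication schedules, per these notes


def hive_replication_mode(num_hms_connections: int = DEFAULT_HMS_CONNECTIONS):
    """Return (mode, connections) for a configured connection count."""
    if num_hms_connections >= 1:
        # 1 or more: multi-threading with the specified number of connections.
        return ("multi-threaded", num_hms_connections)
    # 0 or fewer: single threading with a single connection.
    return ("single-threaded", 1)


print(hive_replication_mode())
print(hive_replication_mode(0))
```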
- HDFS - You can now enable immutable snapshots for HDFS with Cloudera Manager. Enabling this feature enables snapshot diff-based copy listing for BDR.
In the Cloudera Manager Admin Console, search the HDFS service configuration for Enable Immutable Snapshots.
- Maintenance and Support
- Cluster Restart - Improved cluster restart performance.
- Kudu - Cloudera Manager now supports gathering the output of the ksck diagnostic tool from Kudu. This output is now collected in the diagnostic bundles.
- Impala
- File Handle Cache - You can now configure and monitor the following Impala parameters using Cloudera Manager: max_cached_file_handles and unused_file_handle_timeout_sec.
- KRPC Port - You can now configure the krpc_port startup parameter using Cloudera Manager. The default value is 27000.
- Metrics - Cloudera Manager now collects the following metrics for Impala: impala_jvm_heap_committed_usage_bytes, impala_jvm_heap_current_usage_bytes, impala_jvm_heap_init_usage_bytes, impala_jvm_heap_max_usage_bytes. Impala administrators can use these metrics to monitor Impala Daemon health and the amount of memory used by the JVM embedded in the Impala Daemon process, particularly the memory consumed by the Catalog cache stored in coordinator Impala Daemons.
- Parcels - When parcel configuration changes are made, Cloudera Manager now automatically checks for any new parcels. Additionally, you can disable recurring checks for parcels by setting the Parcel Update Frequency to '0'.
- Upgrade
- Agents - Cloudera Manager Agent upgrade now works in mixed environments where a single Cloudera Manager deployment contains different distributions or versions of operating systems. These agents can be upgraded as a group for each set of operating systems.
The agents are grouped and displayed on a new page in the Cloudera Manager Upgrade wizard.
- Documentation - You can now find a link to the latest upgrade documentation in the Cloudera Manager Admin Console. The upgrade documentation now includes interactive features that let you select your operating system, upgrade version, database type, CDH installation type (Parcels or Packages), and other options. A customized page then displays only the steps required for your upgrade.
- Summary Page - All potential issues, conflicts, action items, and pre-upgrade checks are summarized on the first page of the CDH Upgrade wizard.
What's New in Cloudera Manager 5.14.4
Cloudera Manager 5.14.4 is a maintenance release with several fixed issues. See Issues Fixed in Cloudera Manager 5.14.4.
What's New in Cloudera Manager 5.14.3
Cloudera Manager 5.14.3 is a maintenance release with several fixed issues. See Issues Fixed in Cloudera Manager 5.14.3.
What's New in Cloudera Manager 5.14.2
Cloudera Manager 5.14.2 is a maintenance release with several fixed issues. See Issues Fixed in Cloudera Manager 5.14.2.
What's New in Cloudera Manager 5.14.1
Cloudera Manager 5.14.1 is a maintenance release with several fixed issues. There is no corresponding CDH 5.14.1 release. See Issues Fixed in Cloudera Manager 5.14.1.
What's New in Cloudera Manager 5.14.0
- ADLS
- You can now use Cloudera Manager to configure Microsoft Azure credentials for cluster access to ADLS. This access enables running Hive and Impala queries on tables backed by data stored in ADLS and browsing ADLS data using Hue. See Configuring ADLS Access Using Cloudera Manager.
- ADLS integration with Hue has been added. Hue can be set up to enable browsing files in Azure Data Lake Store, and Hue users can directly query and store data in ADLS without any intermediate moving or copying of data to and from HDFS.
- Backup and Disaster Recovery (BDR)
- Added the option to Skip Checksum on Listing. This option skips the checksum comparison between two files to determine whether they are the same or not. To detect modifications, BDR will use the file size and last modified time instead. Skipping the checksum on listing can improve performance.
- You can now perform a rolling restart of clusters or services such as HDFS or Hive while replication jobs are running.
- On the Replication Schedules page, you can see the Throughput and Progress for replications that are running. Additionally, this information can be seen in the log files.
- Added Advanced Configuration Snippets for hdfs-site.xml and core-site.xml that can be used for HDFS replication. You can use these to tune behavior such as the HDFS replication factor of files written by BDR. Previously, you had to use the HDFS Client Advanced Configuration Snippet, which would affect all HDFS client configuration on the target cluster.
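The Skip Checksum on Listing option described in this list changes how a file is judged to be modified. The sketch below illustrates that idea only. It is a simplified assumption about the comparison, not BDR's implementation, and the FileInfo type and needs_copy function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileInfo:
    size: int       # file length in bytes
    mtime: int      # last modified time, epoch milliseconds
    checksum: str   # file checksum, costly to compute on large files


def needs_copy(source: FileInfo, target: Optional[FileInfo], skip_checksum: bool) -> bool:
    """Decide whether the source file must be copied to the target."""
    if target is None:
        return True  # the file does not exist on the target yet
    if source.size != target.size:
        return True  # a size change always means a modification
    if skip_checksum:
        # Skip Checksum on Listing: rely on size and last modified time.
        return source.mtime != target.mtime
    # Otherwise compare checksums to decide whether the files are the same.
    return source.checksum != target.checksum


src = FileInfo(size=1024, mtime=1000, checksum="abc")
same = FileInfo(size=1024, mtime=1000, checksum="abc")
touched = FileInfo(size=1024, mtime=2000, checksum="abc")
print(needs_copy(src, same, skip_checksum=True))
print(needs_copy(src, touched, skip_checksum=True))
```

With the option enabled, a file whose content is unchanged but whose modification time differs is treated as modified, which is the trade-off for avoiding the checksum computation.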
- Custom Service Descriptor (CSD)
- Added a new capability for custom services. CSD authors can now specify, in the topology descriptor for a role type, that only an odd number of instances should be running.
- The CSD versioning logic has been improved. When a CSD loads during Cloudera Manager startup, only the latest version of a CSD service type is loaded. The version number has precedence over the CDH compatibility range.
- Diagnostic Bundles
Diagnostic Bundles now include the following additional data: cluster utilization report data and Cloudera Navigator dashboard data (if the cluster has a Cloudera Navigator instance).
- Health Tests and Monitoring
- Added a new health test for Linux host entropy monitoring. By default, the health test issues a warning for entropy < 100 and an alert for entropy < 50. Additionally, a chart is available in the Charts Library.
- Improved Kafka health monitoring:
- Cloudera Manager now aggregates Kafka Broker health into the health of the Kafka service. If more than 5% of the brokers' health statuses are concerning, the Kafka service's health is set to concerning. If 50% or more of the brokers' health statuses are bad, the service's health is set to bad.
- If at least one partition hosted by a broker is offline, the service’s health will now be bad.
- If replication of any partitions hosted by a broker is lagging, then the health of the broker will be set to concerning.
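The broker-to-service aggregation thresholds above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Cloudera Manager health engine: it assumes that bad takes precedence over concerning, that a service with no brokers or with statuses below both thresholds is reported as good, and it ignores the separate offline-partition rule. The function name kafka_service_health is hypothetical.

```python
def kafka_service_health(broker_healths):
    """Aggregate per-broker health ("good", "concerning", "bad") to service health."""
    total = len(broker_healths)
    if total == 0:
        return "good"  # assumption: nothing to aggregate
    bad = broker_healths.count("bad")
    concerning = broker_healths.count("concerning")
    # Integer arithmetic avoids floating-point surprises at the exact thresholds.
    if bad * 2 >= total:              # 50% or more of the brokers are bad
        return "bad"
    if concerning * 100 > 5 * total:  # strictly more than 5% are concerning
        return "concerning"
    return "good"


print(kafka_service_health(["good"] * 19 + ["concerning"]))
print(kafka_service_health(["good"] * 18 + ["concerning"] * 2))
print(kafka_service_health(["bad", "bad", "good", "good"]))
```

With 20 brokers, one concerning broker is exactly 5% and does not change the service health, while two concerning brokers do.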
- Host Maintenance Mode
Cloudera Manager now provides more options for performing host maintenance and decommissioning. In addition to decommissioning, you can now specify in the same dialog whether to suppress alerts from the decommissioned host and, for hosts running a DataNode role, whether to replicate under-replicated data blocks to other DataNodes during maintenance. This feature takes the DataNode into an "offline" state and is useful when performing minor maintenance on hosts, such as adding memory or changing network cards or cables, where the maintenance window is expected to be short, the disk drives on the host remain in place, and consuming extra cluster resources to replicate blocks is undesirable. See Tuning and Troubleshooting Host Decommissioning.
- Installation
- After a host installation is completed with the Cloudera Manager Admin Console or API, a command on the Commands page provides a link to a zipped file that contains the installation logs for each host.
- You can now add a service to a cluster that does not have any services. Previously, you could not add services to a cluster without deleting and re-adding the cluster if the cluster was empty.
- Logs
The cloudera-scm-agent.out directory now uses the same location as Agent log files. Specify the location in the Agent config.ini file.
- Set default Oozie Configuration
Added a new Oozie Advanced Configuration Snippet for action-conf/default.xml. Use this Advanced Configuration Snippet to propagate default configuration properties to all Oozie actions.
Cloudera Bug: OPSAPS-42546
- Resource Pools
- You can now specify a percentage for the maximum CPU and memory resources for a YARN resource pool. Previously, you could only specify a static value.
- Dynamic resource pools can be sorted on the configuration page, status page, and utilization reports page in the Cloudera Manager Admin Console.
- Search and Sentry
The Key-Value Store Indexer (HBase indexer) can be configured to use the Sentry service for authorization.
What's New in Cloudera Manager 5.13.3
Cloudera Manager 5.13.3 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.13.3.
What's New in Cloudera Manager 5.13.2
Cloudera Manager 5.13.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.13.2.
What's New in Cloudera Manager 5.13.1
Cloudera Manager 5.13.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.13.1.
What's New in Cloudera Manager 5.13.0
- Cloudera Data Science Workbench
Cloudera Data Science Workbench is now available as an add-on service for Cloudera Manager. To this end, Cloudera Data Science Workbench is now distributed in a parcel that integrates with Cloudera Manager using a Custom Service Descriptor (CSD). You can use Cloudera Manager to install, upgrade, and monitor Cloudera Data Science Workbench. Additionally, diagnostic data bundles for Cloudera Data Science Workbench can be generated and submitted to Cloudera through Cloudera Manager.
- Dashboard User Role
Users assigned the Dashboard User role can perform the following actions:
- Create, edit, or remove their own dashboards
- Create new charts or add existing charts to their own dashboards
- View data in Cloudera Manager
- View service and monitoring information
- Impala Query Profiles Time Display
Impala query profiles downloaded from Cloudera Manager in text format now include a human-readable version for the value of each profile counter instead of the raw value. For example, for time counters before CM 5.13 only the raw nanoseconds value was shown: TotalTime: 492626971556. Now a human-readable value is shown alongside the raw value: TotalTime: 8.2m (492626971556).
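The conversion behind the example above is simple arithmetic: 492626971556 ns is about 492.6 s, or about 8.2 minutes. The following sketch reproduces that one example. The unit boundaries and one-decimal rounding are assumptions chosen to match it, not the formatter that Impala or Cloudera Manager actually uses, and both function names are hypothetical.

```python
def humanize_nanoseconds(ns: int) -> str:
    """Render a nanosecond counter with one decimal in the largest fitting unit."""
    seconds = ns / 1_000_000_000
    if seconds >= 3600:
        return f"{seconds / 3600:.1f}h"
    if seconds >= 60:
        return f"{seconds / 60:.1f}m"
    if seconds >= 1:
        return f"{seconds:.1f}s"
    return f"{ns / 1_000_000:.1f}ms"


def format_counter(name: str, ns: int) -> str:
    """Show the human-readable value alongside the raw value."""
    return f"{name}: {humanize_nanoseconds(ns)} ({ns})"


print(format_counter("TotalTime", 492626971556))
```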
- Sentry High Availability
The Sentry Service now supports High Availability. See Sentry High Availability.
- Licensing Management Improvements
- License Information
A banner now indicates when there are 60, 30, 14, or 0 days left on a license.
- Downgrading from Cloudera Enterprise to Cloudera Express
Previously, downgrading from Enterprise to Express required editing the Cloudera Manager database. Now this task can be performed by the user by using the Downgrade button on the license page.
- Enable Kerberos Wizard shows warning for hostnames with uppercase letters
Because Kerberos principal names cannot include upper-case letters, the Enable Kerberos wizard welcome screen will show a warning if any hostnames containing uppercase characters are detected. However, the user will be able to continue with the wizard regardless of this warning. When the warning is shown, up to 10 such detected hostnames will be listed. If the list is longer than 10, the message also says only 10 are shown.
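The detection and the 10-host display limit described above can be sketched as follows. This is an illustration of the described behavior only; the function name uppercase_hostname_warning and the message wording are hypothetical and do not reproduce the wizard's actual text.

```python
def uppercase_hostname_warning(hostnames, limit=10):
    """Return a warning string, or None if no hostname contains uppercase letters."""
    offenders = [h for h in hostnames if any(c.isupper() for c in h)]
    if not offenders:
        return None
    shown = offenders[:limit]
    message = "Hostnames containing uppercase letters: " + ", ".join(shown)
    if len(offenders) > limit:
        # Mirror the wizard's note that only the first entries are listed.
        message += f" (only {limit} of {len(offenders)} shown)"
    return message


hosts = ["node1.example.com", "Node2.example.com", "NODE3.example.com"]
print(uppercase_hostname_warning(hosts))
```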
- New Validations for Hadoop Configuration Properties User impersonation
A new validation has been added for the various username configuration properties used by the Service Monitor to ensure that the properties are valid Linux usernames. The properties affected by this check are:
- HDFS User to Impersonate - HDFS service
- HBase User to Impersonate - HBase service
- MapReduce User to Impersonate - MapReduce service
- YARN Container Usage MapReduce Job User - YARN service
If these properties are not valid, the respective services will not start.
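A username check of this kind might look like the sketch below. The exact rule that Cloudera Manager applies is not stated in these notes. The pattern used here (a lowercase letter or underscore first, then lowercase letters, digits, underscores, or hyphens, an optional trailing dollar sign, at most 32 characters) is an assumption based on common Linux useradd conventions, and the function name is hypothetical.

```python
import re

# Assumption: a conventional Linux username pattern, not Cloudera Manager's own rule.
_LINUX_USERNAME = re.compile(r"[a-z_][a-z0-9_-]{0,30}\$?")


def is_valid_linux_username(name: str) -> bool:
    """Return True if the name matches the assumed Linux username pattern."""
    return _LINUX_USERNAME.fullmatch(name) is not None


for candidate in ["hdfs", "hbase_user", "1user", "bad name", ""]:
    print(repr(candidate), is_valid_linux_username(candidate))
```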
- Additions to the Cloudera Manager API
- New Cloudera Manager API to control MaintenanceMode for Management Services
Added REST API endpoints for Cloudera Management Service and Cloudera Management Service roles to enter and exit maintenance mode. See:
- Enter Maintenance Mode for Cloudera Management Service: http://cloudera.github.io/cm_api/apidocs/v18/path__cm_service_commands_enterMaintenanceMode.html
- Exit Maintenance Mode for Cloudera Management Service: http://cloudera.github.io/cm_api/apidocs/v18/path__cm_service_commands_exitMaintenanceMode.html
- Enter Maintenance Mode for Cloudera Management Service roles: http://cloudera.github.io/cm_api/apidocs/v18/path__cm_service_roles_-roleName-_commands_enterMaintenanceMode.html
- Exit Maintenance Mode for Cloudera Management Service roles: http://cloudera.github.io/cm_api/apidocs/v18/path__cm_service_roles_-roleName-_commands_exitMaintenanceMode.html
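As a rough sketch, requests to these endpoints could be prepared as shown below. The REST paths are inferred from the documentation page names listed above and should be checked against the API reference. The host name, port, role name, and the helper maintenance_mode_request are hypothetical, and authentication is omitted. The code only builds the request object and does not send it.

```python
import urllib.request
from urllib.parse import quote

API_VERSION = "v18"


def maintenance_mode_request(base_url, enter, role_name=None):
    """Build (but do not send) a POST request for a maintenance mode command."""
    command = "enterMaintenanceMode" if enter else "exitMaintenanceMode"
    if role_name is None:
        # Command on the Cloudera Management Service itself.
        path = f"/api/{API_VERSION}/cm/service/commands/{command}"
    else:
        # Command on one Cloudera Management Service role.
        path = (
            f"/api/{API_VERSION}/cm/service/roles/"
            f"{quote(role_name, safe='')}/commands/{command}"
        )
    return urllib.request.Request(base_url.rstrip("/") + path, data=b"", method="POST")


req = maintenance_mode_request("http://cm-host.example.com:7180", enter=True)
print(req.get_method(), req.full_url)
```

Sending the request would additionally require credentials, for example through an authentication handler passed to urllib.request.build_opener.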
- New Cloudera Manager API for Cluster Utilization Report
Three new API endpoints have been added that return the cluster utilization reports in CSV format. See:
- Cluster Utilization: http://cloudera.github.io/cm_api/apidocs/v18/path__clusters_-clusterName-_utilization.html
- Impala Utilization: http://cloudera.github.io/cm_api/apidocs/v18/path__clusters_-clusterName-_impalaUtilization.html
- YARN Utilization: http://cloudera.github.io/cm_api/apidocs/v18/path__clusters_-clusterName-_yarnUtilization.html
- Delete credentials API at cluster level
There is a new API endpoint that allows you to delete all of the Kerberos credentials of services belonging to a single cluster, rather than all credentials generated by Cloudera Manager for all of its managed clusters.
- New Guardrails for Operating Cloudera Manager at Scale
- New Validator for Management Roles
A new validator warns if multiple Management roles are running on the same host when Cloudera Manager manages more than 80 hosts.
- New Validators for Service Monitor and Host Monitor Memory Allocations
New validators have been added for the Service Monitor and Host Monitor that give warnings if the heap and non-Java memory for these roles are below the recommended values for the cluster. The recommended values depend on the size of the cluster, as well as on the types of services running in the cluster.
- Solr Chart Library
Chart Library for Solr now contains example charts for every metric available for the Solr Service.
- Improved BDR Performance
BDR replication performance has been improved by running the first phase of replication on the source cluster (the copy-listing phase, which creates a list of files and folders to be copied). This can dramatically improve performance in scenarios where there is high latency between the source and destination clusters. This feature requires Cloudera Manager 5.13 or higher on both the source and target cluster and can be disabled by setting a feature flag with the API.
- New Configuration Property for Descriptors
A new property, scm.server.proxy.timeout, has been added for configuring the Descriptor fetch timeout in the Cloudera Manager Admin Console. This is useful when tuning Cloudera Manager for very large deployments. Previously, this value was configured at the service level in various Advanced configuration snippets.
Cloudera Bug: OPSAPS-41578
- CSD Health Reporting
Added support for determining the health of a CSD service based on the health of its roles. For more information, see the Health Aggregation section in the CSD Documentation.
- New Validator for Banned YARN Users
A new validator has been added to ensure that the Banned System Users list is the same across all YARN NodeManagers when using Kerberos authentication. The YARN service does not start if the validation fails. To see the list of banned users, select the YARN cluster and navigate to Configuration. Search for the banned.users property.
- Resume Rolling Upgrade
When running a Rolling Restart as part of an upgrade, you can resume the rolling restart after fixing problems that caused the upgrade to fail because one or more hosts did not successfully restart. After you fix the problems you can now resume the rolling restart and Cloudera Manager will skip restarting roles that have already successfully restarted. This change speeds up retrying rolling restarts for large clusters.
- Diagnostic Bundles Collection for Upgrade Failures
During a cluster upgrade, if there is a failure, Cloudera Manager now allows you to send a diagnostic bundle to Cloudera support. The Upgrade Wizard opens the Send Diagnostic Data dialog box with the current cluster name and time duration pre-populated.
- New Impala metrics for hedged reads, JVM heap usage and connection setup queue size
New metrics have been added for JVM heap usage of the Catalog Server and for hedged reads.
- Support for Sentry with a highly available Hive Metastore
It is now possible to use HDFS Sentry Sync when running a Hive Metastore using high availability.
- New placement rules for CSD Services
A new placement rule for CSD-based services, called alwaysWithAny, has been added to the Service Descriptor Language. When this rule is present, the specified role must always be placed on the same host where the roles specified in the rule are placed. The specified role no longer appears in the wizard when adding this service. Instead, one instance of this role is automatically placed on any host that has at least one of the primary roles. If more than one of the primary roles are placed on the same host, only one instance of this role is automatically placed on that host. At least two unique primary roles should be defined in the alwaysWithAny rule. The alwaysWithAny rule is mutually exclusive with the alwaysWith rule, and the two should not be defined together for the same role. If a user assigns roles in a way that violates this placement rule, the service shows a configuration error and fails to start. See the Service Descriptor Language Reference.
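The automatic placement implied by alwaysWithAny can be sketched in a few lines. This is an illustration of the rule as described above, not Cloudera Manager's placement code. The function name, the role names, and the host names are hypothetical.

```python
def hosts_for_always_with_any(host_roles, primary_roles):
    """Return the hosts that should each receive exactly one dependent role instance."""
    primary = set(primary_roles)
    if len(primary) < 2:
        # The rule should name at least two unique primary roles.
        raise ValueError("alwaysWithAny needs at least two unique primary roles")
    # A host qualifies if it runs at least one primary role. A host running
    # several primary roles still gets only one instance, so each host
    # appears once in the result.
    return sorted(host for host, roles in host_roles.items() if primary & set(roles))


cluster = {
    "host1": ["MASTER"],
    "host2": ["MASTER", "WORKER"],
    "host3": ["GATEWAY"],
}
print(hosts_for_always_with_any(cluster, ["MASTER", "WORKER"]))
```

Here host2 runs both primary roles but appears once, and host3 is skipped because it runs neither.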
- Navigator search and tagging are now on by default
The Navigator search and tagging features of Hue are now enabled by default when adding a new Hue service to a cluster running CDH 5.12 or higher.
- Add protocol, accept_count, and acceptor_thread_count parameters to LUNA_KMS and THALES_KMS CSDs
New performance tuning parameters related to Tomcat have been exposed within the KMS services designed to work with the Luna and Thales Hardware Security Modules (HSM). These parameters only take effect when running CDH 5.12.1 and higher. The following parameters were added:
- protocol
- accept_count
- acceptor_thread_count
What's New in Cloudera Manager 5.12.2
Cloudera Manager 5.12.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.12.2.
What's New in Cloudera Manager 5.12.1
Cloudera Manager 5.12.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.12.1.
What's New in Cloudera Manager 5.12.0
- Backup and Disaster Recovery
- Refreshing Impala metadata during replication
You can now use an option in the Cloudera Manager Admin Console to configure BDR to automatically refresh Impala’s metadata cache in the destination cluster during replication. Previously, this feature required an Advanced Configuration Snippet (Safety Valve). See Invalidating Impala Metadata.
- Automatic renewal of Kerberos tickets and Delegation Tokens
Previously, BDR replication jobs would fail on a Kerberized cluster if the job duration was longer than the renewal interval for the HDFS delegation token. With this fix, both the delegation token and Kerberos ticket are renewed until the max lifetime of token/ticket (default value is 7 days). This enables longer replications without needing to bring down the source cluster to change the ticket timeout.
- Streamlined Kerberos Configuration
- As part of Test Connectivity for peers, Cloudera Manager now tests for properly configured Kerberos authentication on the source and destination clusters. Test Connectivity runs automatically when you add a peer for replication, or you can manually initiate Test Connectivity from the Actions menu. This feature is available when the source and destination clusters run Cloudera Manager 5.12 or higher. See Enabling Replication Between Clusters with Kerberos Authentication.
- If Cloudera Manager is managing the Kerberos configuration (krb5.conf) for your clusters, BDR can automatically make some required changes to your Kerberos configuration based on issues found during the Test Connectivity action.
- The configuration process for adding peers when using Kerberized clusters is simplified if both the source and target clusters use Cloudera Manager 5.12 or later. Now, you only need to set up trust on the target cluster and not the source, reducing the complexity of enabling Hive Replication. See Enabling Replication Between Clusters with Kerberos Authentication.
- Add a name and description to replication schedules
When you create or edit a replication schedule, you can add a name on the General tab and add a description on the Advanced tab.
- Hive Metastore Schema Integrity Checker
Cloudera Manager now uses the Hive Metastore schemaTool for validating the integrity of Hive metadata. When you upgrade a cluster that contains a Hive Service to CDH 5.12 or higher using the Cloudera Manager Upgrade Wizard or command line, before upgrading the Hive metastore schema, Cloudera Manager first runs a validation check to detect any corruption. If the validation check fails, Cloudera Manager displays the error and stops the upgrade. Corruption issues should be resolved before proceeding with the upgrade.
- Support for HSM Key Provider
The HDFS Encryption Wizard in Cloudera Manager now supports configuration of the Hardware Security Module (HSM) Key Providers supported by CDH 5.12 for encryption key management.
- Sending Diagnostic Bundles
The user interface in the Cloudera Manager Admin Console for collecting and sending diagnostic bundles has been improved. Regardless of how diagnostic data collection is configured before you start, each time you create a bundle, you can now select one of the following options: Collect and Upload Diagnostic Data to Cloudera Support or Collect Diagnostics Data only. Additionally, the Cloudera Manager Admin Console better indicates the status of the bundle, for example, whether or not the bundle was successfully sent to Cloudera.
- Delete Kerberos Service Principals
You can now delete MIT Kerberos or Active Directory Service Principals that were previously created by Cloudera Manager while Kerberizing a cluster using the delete_credentials API.
- HBase Region in Transition Health Check
Cloudera Manager now performs a health check to detect whether HBase regions have become stuck in transition during splitting and merging operations.
- Replication factor for MapReduce job submission files
New auto-configuration logic for the Submit Replication Factor property of MR1 and MR2 attempts to choose a value that is at least the value of the HDFS Replication Factor for clusters with three or more DataNodes. Additionally, a new configuration validator raises a configuration warning if the existing Submit Replication Factor is lower than the HDFS Replication Factor and the cluster has at least three DataNodes.
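The validator rule described above can be sketched in a few lines of Python. This is only an illustration of the stated condition, not Cloudera Manager's implementation; the function name and the warning text are hypothetical.

```python
def submit_replication_warning(submit_rf, hdfs_rf, datanode_count):
    """Return a warning string when the rule described above is violated.

    Hypothetical helper: warns only when the cluster has at least three
    DataNodes and the Submit Replication Factor is below the HDFS
    Replication Factor. Returns None otherwise.
    """
    if datanode_count >= 3 and submit_rf < hdfs_rf:
        return (
            f"Submit Replication Factor ({submit_rf}) is lower than "
            f"HDFS Replication Factor ({hdfs_rf})"
        )
    return None


if __name__ == "__main__":
    # Small clusters are exempt from the check in this sketch.
    print(submit_replication_warning(1, 3, datanode_count=2))
    print(submit_replication_warning(1, 3, datanode_count=5))
```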
- Custom Header Color
You can customize the header color that Cloudera Manager displays in the web UI. Select Other for the Category and use the drop-down menu for Custom Header Color.
- Dynamic Resource Pools UI
The Dynamic Resource Pools user interface now displays Access Control information about resource pools, showing whether they are freely usable, restricted to a custom set of users/groups, or inherit ACLs from their parent pool.
- Example Impala Shell Command
The Impala Service Status Page now includes an example Impala Shell Command.
- Configurable S3 Endpoint
The S3 Connector service now allows you to configure the default S3 endpoint used by HDFS clients (including Hive and Impala), ensuring all S3 data created/accessed by your cluster is (by default) stored in the AWS region of your choice. Additionally, Hue is configured to automatically use the default endpoint as the S3 Connector.
- Solr
- Request Rate and Index Size Charts
The graphs on the Solr status page now include the request rates against the service and the aggregate size of the indices.
- New Tags in Solr Logs
The logging for Solr has been improved. Logs now include the following IDs: thread, shard, replica, and collection.
What's New in Cloudera Manager 5.11.2
Cloudera Manager 5.11.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.11.2.
What's New in Cloudera Manager 5.11.1
Cloudera Manager 5.11.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.11.1.
What's New in Cloudera Manager 5.11.0
- Amazon S3
- Amazon S3 Consistency with Metadata Caching (S3Guard)
Data written to Amazon S3 buckets is subject to the "eventual consistency" guarantee provided by S3, which means that data written to S3 may not be immediately available for queries and listing operations. This can cause failures in multi-step ETL workflows, where data from a previous step is not available to the next step. To mitigate these consistency issues you can now configure metadata caching for data stored in Amazon S3 using S3Guard. Some workloads that access S3 may also see modest performance improvements with metadata caching. S3Guard requires that you provision a DynamoDB database from Amazon Web Services and configure S3Guard using the Cloudera Manager Admin Console or command-line tools. See Configuring and Managing S3Guard.
- Operating System Support
- SLES 12 SP2 Support
SLES 12 SP2 is now supported in Cloudera Manager and CDH 5.11 and higher.
- Mixed Operating system support for gateway hosts running Cloudera Data Science Workbench
A Gateway host that is dedicated to running Cloudera Data Science Workbench can use RHEL/CentOS 7.2 even if the remaining hosts in your cluster are running any of the other supported operating systems. All hosts must run the same version of the Oracle JDK.
- Backup and Disaster Recovery
- Refreshing Impala metadata during replication
You can now configure Hive/Impala replication jobs to run the INVALIDATE METADATA Impala statement in the destination cluster automatically at the end of the replication process, allowing newly replicated data to be immediately queried by Impala. See Invalidating Impala Metadata.
- Hive Replication to Amazon S3 now supported for regions that support only Signature Version 4 signing protocol
Replications from Hive or HBase to Amazon S3 are now supported for S3 regions that only support Amazon's Signature Version 4 signing protocol. You must add the fs.s3a.endpoint property to the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml property and set its value to the endpoint of the Amazon S3 region. For example:
<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.us-east-2.amazonaws.com</value>
</property>
You can access this property in Cloudera Manager at
.
- Peak Memory Usage Filter now tracked per container for YARN applications
Peak container memory usage is now tracked for YARN applications, and a new filter attribute, Used Memory Max, has been added for monitoring YARN applications.
- Improved Kerberos-Encryption-Type Handling by Cloudera Manager
Cloudera Manager validates the Kerberos encryption type as it is being entered into the Cloudera Manager Admin Console, and displays an error message if the type is not a valid MIT or Microsoft Active Directory (Kerberos) encryption type. Administrators can disable the feature when necessary—for example, if new encryption types added to Kerberos are ahead of the encryption types supported by Cloudera Manager (invalid encryption types fail, regardless of warning message display).
- Enabling SPNEGO authentication for Hue
Enabling the Hue Authentication Backend property (for SPNEGO) now automatically adds all necessary environments and Kerberos credentials. Previously, you needed to follow this procedure: Enabling SPNEGO as an Authentication Backend for Hue.
- New and Changed Configuration
- Auto-configuration of HBase Thrift Authentication when Kerberos is enabled
When Kerberos is enabled for the cluster, the value of the HBase configuration parameter HBase Thrift Authentication is automatically set to auth-conf. Clusters that already have Kerberos enabled will not have this setting changed when upgrading Cloudera Manager; the change only affects clusters on which Kerberos is enabled in the future.
- New HDFS NameNode configuration property for deleting the trash
A new HDFS NameNode property, Filesystem Trash Checkpoint Interval (fs.trash.checkpoint.interval), has been introduced with a default value of 1 hour. This property causes the NameNode to better respect and more accurately enforce the HDFS trash deletion interval set with the Filesystem Trash Interval property (fs.trash.interval).
The old behaviour without this property accidentally caused many files in the HDFS trash to be deleted only when twice the desired trash deletion interval had transpired because the checkpoint interval matched the deletion interval. If the older implicit behaviour of retaining trash files for a longer time is desired, consider raising the value of the Filesystem Trash Interval property to a more suitable value. When upgrading to this version of Cloudera Manager or when changing this property, all HDFS NameNode role instances will be marked stale, and you must restart the HDFS NameNode role instances and their dependent services for this change to take effect.
- New Auto Logout Timeout property for Hue
A new configuration property has been added for the Hue service. The Auto Logout Timeout property controls how long the Hue browser can remain idle before automatically logging out the user. Set the property to -1 to disable automatic logout. To configure the property, go to the Hue service, select the Configuration tab and search for the property.
- New performance tuning properties for Key Management Server (KMS)
The following new properties have been added for tuning the performance of the KMS service:
- KMS Accept Count
- KMS Handler Protocol
- KMS Acceptor Thread Count
- New API endpoint for refreshing parcel information
A new REST API endpoint has been added to refresh parcel information from both local and remote repositories. The endpoint URL is:
/api/v16/cm/commands/refreshParcelRepos
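As a rough illustration, the endpoint can be called from any HTTP client. The sketch below builds such a request with Python's standard library. The host name, port 7180, and credentials are placeholders, and the use of POST with HTTP basic authentication is an assumption based on how Cloudera Manager command endpoints are commonly invoked; check the API documentation for your release before relying on it.

```python
import base64
import urllib.request


def build_refresh_request(host, user, password, port=7180, scheme="http"):
    """Build (but do not send) a request for the refreshParcelRepos endpoint.

    All arguments are placeholders for illustration. POST and basic
    authentication are assumptions, not confirmed by the release notes.
    """
    url = f"{scheme}://{host}:{port}/api/v16/cm/commands/refreshParcelRepos"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=b"",  # empty request body
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


if __name__ == "__main__":
    req = build_refresh_request("cm-host.example.com", "admin", "admin")
    print(req.get_method(), req.full_url)
    # To actually send the request: urllib.request.urlopen(req)
```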
- New Metrics and Health Tests for Service Monitor and Host Monitor metric collection
A new metric, mgmt_aggregation_run_duration, has been added to the Service Monitor and Host Monitor metrics to indicate how much time it takes to store the metrics collected in the last minute. This metric can be used to determine if more heap or non-heap memory is needed for these roles.
New health tests, Host Monitor Metrics Aggregation Run Duration Test and Service Monitor Metrics Aggregation Run Duration Test, have also been added to detect potential resource configuration issues with the Service Monitor and Host Monitor.
- New validation for YARN NodeManager log directory
Cloudera Manager now validates whether all YARN NodeManagers are storing logs in the same distributed file system directory so that no logs are missing from Job History Server. If NodeManagers have different configuration values, there will be a configuration error after upgrading Cloudera Manager to 5.11.
- New Dynamic Resource Pool option
Configuration of Nested User Pools (except existingSecondaryGroup) now includes a Create pool if it does not exist checkbox to indicate whether to create a sub-pool.
What's New in Cloudera Manager 5.10.2
Cloudera Manager 5.10.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.10.2.
What's New in Cloudera Manager 5.10.1
Cloudera Manager 5.10.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.10.1.
What's New in Cloudera Manager 5.10.0
- Backup and Disaster Recovery
- Change of default behavior for Impala metadata setting in Hive replications
There are now three options for configuring replication of Impala metadata for Hive replication jobs:
- No – Impala metadata is not replicated.
- Yes – Impala metadata is replicated.
- Auto – Cloudera Manager decides the value for this option based on the version of CDH in your cluster.
In Cloudera Manager 5.9, when creating a Hive replication schedule, the Replicate Impala Metadata option was not selected by default (false). In Cloudera Manager 5.10, the value defaults to Auto, so that Cloudera Manager decides what the value should be.
- Replication of Impala Cached Column statistics
Table and partition-level column statistics stored in the Hive metastore and used by Impala are now replicated during Hive Replication. This is supported between a replication source with Cloudera Manager running version 5.10 or higher and a replication target running Cloudera Manager 5.10 or higher. Because this change replicates more information, the same schedule may take more time to complete if column statistics are present.
- Performance summaries of HDFS and Hive replication jobs
You can download full performance reports for HDFS and Hive replications from the Replication Schedules page and from the Replication History page. You can also filter the output by error, deleted, or skipped status. See Monitoring the Performance of Hive/Impala Replications and Monitoring the Performance of HDFS Replications.
- HDFS replication performance monitoring now reports an initial sample
HDFS performance reports now display early samples during the start of a replication job to give users earlier information about the progress of the job.
- Scheduler Pool field for replications moved to Resources tab
When creating replication schedules, the Scheduler Pool input field has been moved to the Resources tab.
- Amazon S3
- Amazon S3 Object Store
You can configure auxiliary storage in your cluster using Amazon Simple Storage Service (S3). Client applications such as YARN, MapReduce, or Spark can access data stored in Amazon S3 using Amazon S3 URLs and credentials. See Configuring the Amazon S3 Connector.
- Amazon S3 Server-side Encryption (SSE-S3)
Clusters that use Amazon S3 storage can encrypt data using Amazon server-side encryption (SSE-S3). Use Cloudera Manager Admin Console to configure the cluster to use this new feature as detailed in How to Configure Encryption for Amazon S3.
- Auto-configuration of Hue with AWS credentials (S3 access)
Cloudera Manager now automatically configures S3 credentials for Hue in the hue.ini file.
- Resource Management
- maxResources per user now configurable in YARN Dynamic Resource Pools
YARN Dynamic Resource Pools now support default capacity limits which automatically apply to all child pools of a resource pool. Using these limits, you can now effectively control the YARN resources available to any user or group in the cluster by configuring these settings on a parent pool and then using placement rules to auto-create child pools per user or group.
- Key Management Service (KMS)
- Java Option configuration added for KMS and Key Trustee KMS
Additional Java Configuration Options can now be set for the KMS and Key Trustee KMS using Advanced Configuration Snippets. This allows customized Java configuration options for the KMS process.
- Revised ACLs for KMS
The ACL set generated by the KMS installation wizard has been updated to implement the recommended secure ACL policies for common key names. For more information, see Configuring KMS Access Control Lists (ACLs).
- Security
- Redaction of stdout and stderr logs
Redaction logic now applies to the stdout and stderr logs of a service, so that sensitive information is redacted from these files.
- Encryption of Cloudera Manager database password
By default the Cloudera Manager database password is stored as clear text in the /etc/cloudera-scm-server/db.properties file. You can now specify a program to execute whose standard output is the password. If com.cloudera.cmf.db.password is not found in /etc/cloudera-scm-server/db.properties, then the property com.cloudera.cmf.db.password_script, if it exists, is used. The value of this property is executed as a program, and the value returned to stdout is used as the password.
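The lookup order described above can be sketched as follows. This is a minimal illustration, not Cloudera Manager's implementation: the function name is hypothetical, the properties are passed in as a plain dictionary, and stripping trailing whitespace from the script output is an assumption.

```python
import shlex
import subprocess
import sys


def resolve_db_password(props):
    """Resolve the database password from db.properties-style settings.

    Hypothetical helper: prefers the clear-text property, otherwise runs the
    configured script and uses its standard output as the password.
    """
    if "com.cloudera.cmf.db.password" in props:
        return props["com.cloudera.cmf.db.password"]
    script = props.get("com.cloudera.cmf.db.password_script")
    if script is None:
        return None
    result = subprocess.run(
        shlex.split(script), capture_output=True, text=True, check=True
    )
    # Assumption: trailing newline/whitespace is not part of the password.
    return result.stdout.strip()


if __name__ == "__main__":
    # A stand-in "password program" that just prints a fixed value.
    demo_script = shlex.quote(sys.executable) + " -c \"print('example-secret')\""
    print(resolve_db_password({"com.cloudera.cmf.db.password_script": demo_script}))
```

In practice the script would typically fetch the secret from a vault or an encrypted file rather than print a constant.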
- Monitoring
- Collection of Metrics can be disabled for specified roles
A new configuration has been introduced at the role level to disable monitoring by the Cloudera Manager Agent for each individual role. By default, monitoring is enabled for all roles. Once disabled, you must restart the role to make the change take effect. This can help in scenarios where the Cloudera Manager Service Monitor is experiencing performance issues and does not have enough resources. See Disabling Metrics for Specific Roles.
- New chart displays number of monitored entities
A new chart on the Cloudera Management Service status page shows the number of entities monitored by the Service Monitor and Host Monitor.
- Default timeouts configurable for Service Monitor, Host Monitor, Reports Manager, Events Server, and Activity Monitor
The timeout for Service Monitor, Host Monitor, Reports Manager, Events Server, and Activity Monitor to register themselves with a cluster is now configurable. You can change the timeout by increasing the values of the Descriptor Fetch Max Tries property (the default value is 5) and the Descriptor Fetch Tries Interval property (the default value is 2).
- New Health Test for Service Monitor heap usage.
A health test has been added to monitor heap usage of the Service Monitor. This test can be useful when diagnosing out of memory errors in the Service Monitor.
- Users can send feedback on Cloudera Manager
You can now send feedback to Cloudera about Cloudera Manager.
- Suppression of warning about embedded database for Cloudera Manager
When Cloudera Manager is configured to use the embedded PostgreSQL database, it displays a banner warning that the embedded PostgreSQL database is not recommended for production environments. You can now suppress this banner.
- New HDFS balancer related configuration options
The HDFS balancer can now be configured to specify which hosts are included and excluded, or which hosts are used as sources for transferring replicas. Additional properties for tuning the performance of the balancer can now also be configured for CDH 5.10.0 and higher.
The following configuration will be marked stale: dfs.datanode.balance.max.concurrent.moves. You can safely ignore the warning and defer restarting.
- Removing service dependencies
Previously, when a user tried to delete a service that another service depends on, Cloudera Manager displayed a dialog box explaining that the dependent service should be deleted. There is now a link that takes the user to the configuration page where they can remove this dependency.
- HBase topology files now deployed in client configurations
When the HBase property Enable Replication To Secondary Region Replicas is enabled, the topology.py and topology.map files are deployed with the client configuration and the Topology Script File Name property is set to the deployed topology.py path.
- The length of Impala queries retained by the Service Monitor is now configurable.
The maximum length of Impala queries retained by the Service Monitor is now configurable. The default limit is 10k characters.
What's New in Cloudera Manager 5.9.3
Cloudera Manager 5.9.3 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.9.3.
What's New in Cloudera Manager 5.9.2
Cloudera Manager 5.9.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.9.2.
What's New in Cloudera Manager 5.9.1
Cloudera Manager 5.9.1 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.9.1.
What's New in Cloudera Manager 5.9.0
- Creating Virtual Machine Images
Documentation has been added with procedures to create virtual images of Cloudera Manager and cluster hosts. See Creating Virtual Images of Cluster Hosts.
- Security
- External/Cloud account configuration in Cloudera Manager
Account configuration for access to Amazon Web Services is now available through the centralized UI menu External Accounts.
- Key Trustee Server rolling restart
Key Trustee Server now supports rolling restart.
- Backup and Disaster Recovery
- You can now replicate HDFS files and Hive data to and from an Amazon S3 instance. See HDFS Replication To and From Cloud Storage and Hive/Impala Replication To and From Cloud Storage.
- There are some new tuning options to improve performance of HDFS replication. See HDFS Replication Tuning.
- You can now download performance data about HDFS replication jobs from the Replication Schedules and Replication History pages. See Monitoring the Performance of HDFS Replications.
- Hive replication now stores Hive UDFs in the Hive metastore. See Replication of Impala and Hive User Defined Functions (UDFs).
- The user interface for creating replication schedules has been reorganized to present the configuration options on three tabs: General, Resources, and Advanced.
- Uncheck Replicate Impala Metadata by default
Previously, when creating a Hive replication schedule, the Replicate Impala Metadata option was checked (true) by default. In Cloudera Manager 5.9 and higher, the option is unchecked (false) by default.
- YARN BDR enhancement
YARN jobs now include the BDR schedule ID that launched the job so you can connect logs with existing schedules, if multiple schedules exist.
- Resource Management
- Custom Cluster Utilization Reports
Documentation has been added to create custom Cluster Utilization reports that you can export data from. See Creating a Custom Cluster Utilization Report.
- New settings for continuous scheduling
For new installs, the default values of two configuration properties have changed: yarn_scheduler_fair_continuous_scheduling_enabled is set to false, and resourcemanager_fair_scheduler_assign_multiple is set to true. Existing settings are preserved when you upgrade from a lower version.
- YARN historical reports by user show pool-user entity
When Cloudera Manager manages multiple clusters, there is no per user tracking for historical applications and queries across clusters. Instead, Historical Applications by User and Historical Queries by User show applications and queries per user and pool. (A pool is associated with a specific cluster.)
- Directory Usage Report needs export capability
Directory usage reports can be exported as a CSV file.
- Cloudera Manager Admin Console User Interface
- Service colors
A new set of colors is used to represent each kind of service.
- Move the table sorting icon to the right
The table sorting icon now appears consistently on the right-hand side of each column.
- Improved Configuration Diff Display
Changes displayed in the configuration history page are much more user friendly. For a large section of changed text, Cloudera Manager generates a diff between the old and the new and displays the diff.
When a user changes only the password, Cloudera Manager does not show the delta: both the old and the new passwords are masked out before the comparison is performed.
- Move actions menu to the top header
The actions menu now appears next to the entity title.
- Move Federation and High Availability to a separate page
The Federation and High Availability sections used to appear on the Instances page of an HDFS service. They have been moved to a new page called Federation and High Availability. There is a link from the existing Instances page to this new page.
- Remove repeated heading below the second level navigation
Subtitles below the second level navigation tabs are removed because they repeated the content in the tabs.
- Move maintenance mode and badges to the title area
Maintenance mode and staleness badges now appear next to the title of the entity.
- Express wizard allows you to add Kafka
Kafka is now listed in the custom services when you click the Add Cluster button.
- Cloudera Manager API
- Add update_user to Python API client
Added the update_user() method to the Python API client api_client.py.
- Expose API endpoint to add a specific path
New API endpoints have been added that allow users to add, list and remove Watched Directories in HDFS service.
- Logging
- Include host in log file name
Kafka log4j log files now include the host name in the format kafka-broker-${host}.log. Similarly, MirrorMaker logs now include the host name in the format kafka-mirrormaker-${host}.log. Due to the log file name change, when you upgrade Cloudera Manager it no longer recognizes your old log files in log search, though they are still present on disk.
- Configuration changes to Cloudera Manager audit log
Cloudera Manager now displays History and Rollback support for the Cloudera Manager Settings. This helps you to track the changes made by an administrator so that Cloudera Support can provide better service when certain Cloudera Manager administrative settings are modified.
- Diagnostic Bundles
- Show the Diagnostic Bundle Redaction Policy using the redaction config
You can specify in the UI what information should be redacted in the diagnostic bundle.
- Upgrade
- Report that a simple restart was performed if rolling restart could not be performed
Cloudera Manager now informs you when a simple restart is performed on a service instead of a rolling restart because rolling restart is not available.
- Oozie
- Provide dump / load functionality for Oozie DB
The Actions menu in the Oozie service has two new commands, Dump Database and Load Database. These commands make it easier to migrate an Oozie database to another database supported by Oozie. The Dump Database command exports Oozie's database to a file (configurable by Database Dump File setting). Load Database loads the file into a database.
- Install Oozie ShareLib permissions change
Install Oozie ShareLib Command assigns correct permissions to the uploaded libraries. This prevents breaking Oozie workflows with a custom umask setting.
- Configuration Changes
- Solr zkClientTimeout option
Added the zkClientTimeout parameter for ZooKeeper.
- Add JHIST compression as a configuration option
Added a new option for setting the file format used by an ApplicationMaster when generating the .jhist file.
- Enable heap dump by default for all daemons
Starting in version 5.9, when you configure roles that are JVM based, the Dump Heap When Out of Memory configuration parameter defaults to true. An upgrade from a pre-5.9 version maintains your pre-5.9 settings.
- Cloudera Manager support for client-side YARN graceful decommissioning
Adds the ability to perform a graceful decommission on YARN NodeManager roles, whereby the NodeManager is not assigned new containers and waits for any currently running applications to finish before being decommissioned, unless a timeout occurs. You can configure the timeout using the Node Manager Graceful Decommission Timeout configuration property in the YARN Service. The default behavior has not changed, and continues to be a non-graceful decommission. Affects Cloudera Manager 5.9.0 and higher, and CDH 5.9.0 and higher.
- Deploy Client Configuration command details page now shows stdout/stderr
stdout and stderr log links are now shown in the UI when there is a failure while deploying client configurations.
- Make EXTRA_RATIO configurable for Headlamp indexing
Added the configuration parameter, Extra Space Ratio for Indexing, to Reports Manager. You can use the parameter to make the speed of indexing faster by allocating additional memory.
- Configure HBase Indexer to wait longer for ZooKeeper to come up
The default amount of time that HBase Indexer roles attempt to connect to ZooKeeper has been increased from 30 to 60 seconds. This default can be adjusted by setting a new Cloudera Manager configuration parameter, HBase Indexer ZooKeeper Session Timeout.
- Embedded database mode improvements
In version 5.9 and higher, Cloudera Manager can clearly identify whether or not a customer is using the embedded PostgreSQL database. Cloudera does not recommend the embedded database for production use, and requests that customers deploy production systems using an external database. The diagnostic bundles now contain information about whether or not a customer is using the embedded PostgreSQL database. Support can then reach out to customers accordingly.
If Cloudera Manager is configured to use the embedded PostgreSQL database, a yellow banner appears in the UI recommending that you upgrade to a supported external database.
- Fix CatalogServiceClient to handle TLS connections to catalogd for UDF replication
When Impala uses SSL, Cloudera Manager now supports TLS connections to the Catalog Server. You can enable replication for any Impala UDFs and metadata (in Hive Replication) in Cloudera Manager 5.9 and higher.
- Do not show steps that are unreachable (skipped)
When running wizards from the Cloudera Manager Admin Console that add a cluster, add a service, perform an upgrade, and other tasks, steps do not display when they are not reachable or do not apply to the current configuration.
- Improve Cloudera Manager provisioning performance on AWS
Cloudera Manager now supports resetting its GUID/UUID. This is accomplished by checking the UUID file.
If Cloudera Manager finds the UUID file (/etc/cloudera-scm-server/uuid) and the UUID is different than the GUID in the cm_version table, it updates the GUID in the cm_version table with the contents of the UUID file and removes the UUID file.
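The reset rule described above can be sketched as a small function. This is only an illustration of the stated logic under simplifying assumptions: the GUID from the cm_version table is passed in as a string rather than read from a database, the function name is hypothetical, and whitespace around the file contents is ignored.

```python
import os
import tempfile


def reconcile_guid(uuid_path, db_guid):
    """Return the GUID that should be stored after checking the UUID file.

    Hypothetical helper: if the file exists and its contents differ from
    db_guid, the file's value wins and the file is removed. Otherwise the
    existing GUID is kept and nothing is changed.
    """
    if not os.path.exists(uuid_path):
        return db_guid
    with open(uuid_path) as f:
        file_uuid = f.read().strip()
    if file_uuid != db_guid:
        os.remove(uuid_path)
        return file_uuid
    return db_guid


if __name__ == "__main__":
    # Demonstrate with a temporary file instead of /etc/cloudera-scm-server/uuid.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "uuid")
        with open(path, "w") as f:
            f.write("new-guid\n")
        print(reconcile_guid(path, "old-guid"))
        print(os.path.exists(path))
```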
What's New in Cloudera Manager 5.8.5
Cloudera Manager 5.8.5 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.8.5.
What's New in Cloudera Manager 5.8.4
Cloudera Manager 5.8.4 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.8.4.
What's New in Cloudera Manager 5.8.3
Cloudera Manager 5.8.3 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.8.3.
What's New in Cloudera Manager 5.8.2
Cloudera Manager 5.8.2 is a maintenance release with many fixed issues. See Issues Fixed in Cloudera Manager 5.8.2.
What's New in Cloudera Manager 5.8.1
An issue has been fixed. See Issues Fixed in Cloudera Manager 5.8.1.
What's New in Cloudera Manager 5.8.0
- Operating Systems - Support for Debian 8.2.
- Resource management and utilization - Added support for nesting dynamic resource pools within a named pool at runtime.
- Backup and Disaster Recovery
- The Replication Schedules page now has a search function for finding scheduled replications.
- You can now specify a start and end time for the events that are included in manually-triggered diagnostic bundles. See Manually Triggering Collection and Transfer of Diagnostic Data to Cloudera.
- Impala
- Impala adds a new configuration option, Use HDFS Rules to Map Kerberos Principals to Short Names. Enabling this option makes Impala pick up the hadoop.security.auth_to_local configuration from the HDFS configuration and use it for Kerberos principal-to-short-name translation. This applies only to Cloudera Manager 5.8.0 and higher and CDH 5.8.0 and higher, and it affects only deployments where Impala is set up to use Kerberos as the authentication mechanism. The option defaults to false to preserve the behavior of earlier CDH versions, and it has no impact on upgrade.
- Enable Impala Admission Control and Enable Dynamic Resource Pools are now enabled by default. Customized configuration values are preserved during the upgrade.
- Impala Admission Control now supports a global method for editing the Access Control List.
- EMC Isilon
- Kerberos is now fully supported for replications between clusters using Isilon storage. You must configure a custom principal.
- Security
- Active Directory KDC
- Active Directory account properties are now configurable from Cloudera Manager. See Managing Active Directory Account Properties.
- It is now possible to use Cloudera Manager to regenerate principals for clusters using an Active Directory KDC. Cloudera Manager 5.8 includes a new configuration called Active Directory Delete Accounts on Credential Regeneration. Enabling this will allow Cloudera Manager to automatically delete existing AD accounts and complete the regeneration process. SeeEnabling Credential Regeneration for Active Directory Accounts Using Cloudera Manager .
- Cloudera Manager now allows you to configure the encryption types (or enctype) used by an Active Directory KDC to protect its data. See Configuring Encryption Types for Active Directory KDC Using Cloudera Manager.
- Redaction: In the Cloudera Manager Admin Console, Advanced Configuration Snippet parameters will now be redacted to block sensitive information such as passwords or secret keys.
- Sentry
- Cloudera Search adds support for storing permissions in the Sentry service. To enable this, see Enabling the Sentry Service for Solr. If you have already configured Sentry's policy file-based approach, you can migrate existing authorization settings as described in Migrating from Sentry Policy Files to the Sentry Service. solrctl has been extended to support:
- Migrating existing policy files to the Sentry service
- Managing permissions in the Sentry service
- Sentry supports data stored on Amazon S3 and can secure URIs with an S3 schema.
- YARN
- YARN Allowed System Users now includes hbase by default. This is helpful when running certain tools for HBase that need to execute MapReduce jobs.
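The Kerberos principal-to-short-name translation mentioned for the new Impala option in 5.8.0 can be illustrated with a minimal sketch. It mimics only the simplest case, the DEFAULT rule of hadoop.security.auth_to_local, and does not parse RULE entries. The function name, the realm EXAMPLE.COM, and the sample principals are hypothetical, and this is not Impala's or Hadoop's actual code.

```python
def default_short_name(principal: str, default_realm: str) -> str:
    """Simplified stand-in for the auth_to_local DEFAULT rule (illustration only)."""
    name, sep, realm = principal.partition("@")
    if not sep:
        # Assumption: a principal without a realm belongs to the default realm.
        realm = default_realm
    if realm != default_realm:
        # DEFAULT only maps principals from the default realm.
        raise ValueError(f"no rule applies to principal {principal!r}")
    # Drop the optional host/instance component: "impala/host1" -> "impala".
    return name.split("/", 1)[0]


if __name__ == "__main__":
    print(default_short_name("alice@EXAMPLE.COM", "EXAMPLE.COM"))
    print(default_short_name("impala/host1.example.com@EXAMPLE.COM", "EXAMPLE.COM"))
```

Principals from other realms need explicit RULE entries in the HDFS configuration, which is the part this sketch leaves out.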
What's New in Cloudera Manager 5.7.6
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.7.6.
What's New in Cloudera Manager 5.7.2
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.7.2.
What's New in Cloudera Manager 5.7.1
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.7.1.
What's New in Cloudera Manager 5.7.0
- Operating Systems - Support for:
- RHEL/CentOS 6.6, 6.7, 7.1, and 7.2
- Oracle Enterprise Linux (OEL) 7.1 and 7.2
- SUSE Linux Enterprise Server (SLES) 11 with Service Packs 2, 3, and 4
- Debian: Wheezy 7.0, 7.1, and 7.8
- Resource management and utilization
- Simplified and expanded resource management. The screens for YARN and Impala dynamic resource pools are now managed separately. See Dynamic Resource Pools.
- Resource pools now support the allowPreemptionFrom, minSharePreemptionTimeout, and fairSharePreemptionTimeout attributes. See Enabling and Disabling Fair Scheduler Preemption and Configuring the Fair Scheduler.
- Cluster utilization reports track usage of resources allocated using dynamic resource pools. See Cluster Utilization Reports.
- The new Directory Usage Report now shows aggregated usage information, including quotas and file sizes, which are sortable. You can also perform multiple actions on filesystem objects. See Directory Usage Report.
- Two new predicates have been added to the tsquery language: day in and hour in. These allow you to limit streams to specified days of the week and specified hours of each day, respectively. See Filtering by Day of Week or Hour of Day.
- Extensibility - For more information, see Cloudera Manager Extensions.
- Parcels are typed according to the OS version. The parcel extension indicates the version. The library for developing external parcels now supports an extension for RHEL 7.
- A new environment variable, ZK_PRINCIPAL_NAME, is now defined for CSD processes when ZooKeeper is Kerberized and has a custom principal.
- A new flag, jvmBased, is now available to CSD authors to indicate that a CSD role is JVM-based. This flag enables a set of JVM-related features in Cloudera Manager—for example, the ability to automatically generate a heap dump when an Out Of Memory error occurs.
- API
- Cloudera Manager now attempts to gracefully handle overlapping API calls.
- All distributed filesystem services (such as HDFS or Isilon) installed in a cluster can now be enumerated using the API.
- You can export the complete configuration of a CDH cluster managed by Cloudera Manager as a template, modify the template, and import the template to create a new cluster. See Creating a CDH Cluster Using a Cloudera Manager Template.
- The advanced configuration snippet editor now allows you to edit properties as name/value pairs. This is the default; however, you can also choose to edit the snippet as XML.
- HBase now includes metrics and charts for replication. These charts are available in the Chart Library for each RegionServer.
- When you click the Role Log link when viewing a role, the log is opened at the current timestamp, rather than the top of the log file. This enables you to see the relevant log messages when investigating an event that occurred recently.
- A new tsquery function called counter_delta has been added to accurately compute the difference between consecutive data points for counter metrics.
- The distcp utility now supports setting records per chunk, using the distcp.dynamic.recordsPerChunk in an advanced configuration snippet to set the number of records (paths) in each chunk. When a value is set for distcp.dynamic.recordsPerChunk, other related settings, such as the maximum number of chunks tolerable, the ideal number of chunks, and the split ratio, are ignored.
- A warning is shown when upgrading with incompatible versions of Kafka and CDH. The Kafka client libraries bundled in CDH cannot communicate with an older Kafka server.
- You can override the sudo commands that Cloudera Manager agent uses by redirecting the sudo commands to a script that you write to allow or disallow certain actions. See Overriding the sudo Command.
- Hive
- Hive on Spark is now supported.
- The default execution engine for Hive can now be configured, which makes it easy to run all Hive jobs on Spark.
- HiveServer2 now has a Web UI. See Using HiveServer2 Web UI in CDH.
- Hive and HDFS replication source/target listings now work with Isilon.
- A dialog box now displays when scheduling Hive replications, reminding you to take snapshots of the Hive warehouse directory. See Hive/Impala Replication with Snapshots.
- The Direct SQL option is now enabled in Hive Metastore.
- When upgrading to CDH 5.7, if Hive is configured to use YARN, all Hive on Spark parameters are automatically tuned to recommended values. If Hive on Spark was previously tuned, this is skipped.
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.7.0.
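The idea behind the counter_delta tsquery function listed for 5.7.0 (the difference between consecutive data points of a counter metric) can be sketched in a few lines. This is an illustration of the concept only, not Cloudera Manager's implementation. The point format and the treatment of a decreasing value as a counter reset are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp_seconds, counter_value), assumed format


def counter_delta(points: List[Point]) -> List[Point]:
    """Return the increase between each pair of consecutive counter samples."""
    deltas: List[Point] = []
    for (_, prev), (ts, cur) in zip(points, points[1:]):
        # Assumption: a drop means the counter restarted from zero,
        # so the new value is taken as the whole increase.
        delta = cur - prev if cur >= prev else cur
        deltas.append((ts, delta))
    return deltas


if __name__ == "__main__":
    samples = [(0, 10), (60, 25), (120, 5)]
    print(counter_delta(samples))
```

Each output point carries the timestamp of the later sample in its pair, so the result has one point fewer than the input.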
What's New in Cloudera Manager 5.6.1
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.6.1.
What's New in Cloudera Manager 5.6.0
What's New in Cloudera Manager 5.5.4
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.5.4.
What's New in Cloudera Manager 5.5.3
An issue has been fixed. See Issues Fixed in Cloudera Manager 5.5.3.
What's New in Cloudera Manager 5.5.2
- New Impala flags have been added for web server certificate files and passwords. The --webserver_private_key_file and --webserver_private_key_password_cmd flags are now supported for the Impala Daemon, Impala Catalog Server, and Impala StateStore roles.
A number of issues have also been fixed. See Issues Fixed in Cloudera Manager 5.5.2.
What's New in Cloudera Manager 5.5.1
An issue has been fixed. See Issues Fixed in Cloudera Manager 5.5.1.
What's New in Cloudera Manager 5.5.0
- Operating Systems - Support for RHEL/CentOS 6.6 (in SELinux mode), 6.7, and 7.1, and Oracle Enterprise Linux 7.1.
- Backup and Disaster Recovery (BDR) - You can use the BDR UI or API to replicate data between encrypted clusters.
- Databases - Supports MariaDB 5.5, Oracle 12c, and PostgreSQL 9.4.
- Selective service restart after activating parcels is supported.
- Retrying upgrade actions is supported. If a cluster upgrade command fails while in progress, you can retry a command after fixing the cause of failure. On retry, the command restarts from the command step where it failed.
- The command details page for running and recent commands has been redesigned for usability and scalability.
- Instead of serially starting all services for the first time, services that are not dependent are started in parallel. This decreases the time required to start services for the first time after creating a cluster.
- Performance has improved for service startup, client configuration deployment, and calculation of stale configurations.
- Suppression of notifications
- You can suppress the warnings that Cloudera Manager issues when a configuration value is outside the recommended range or is invalid. See Suppressing Configuration and Parameter Validation Warnings.
- You can suppress health test warnings. See Suppressing Health Test Results.
Suppression can be useful if a warning does not apply to your deployment and you no longer want to see the notification. Suppressed warnings are still retained by Cloudera Manager, and you can unsuppress the warnings at any time.
- Multi Cloudera Manager Dashboard - A special mode of Cloudera Manager that enables you to view monitoring data aggregated from multiple Cloudera Manager instances that manage one or more CDH clusters. See Monitoring Multiple CDH Deployments Using the Multi Cloudera Manager Dashboard.
- You can decommission roles when services are completely stopped. This allows you to decommission hosts during cluster downtime.
- You can disable collection of certain domain metrics—for example, for HBase RegionServers, Kafka Brokers, and others—through new settings in the host advanced configuration snippet. This is useful in certain support situations and should only be done under the direction of Cloudera Support.
- You can configure which aggregate metrics are automatically generated. This advanced feature can be useful in certain situations to impact the monitoring workload, allowing unused or less-important aggregate metrics to be skipped. This may result in improved performance and the ability to handle larger monitoring workloads, or to retain data for a larger workload for longer. Cloudera recommends using this only under the direction of Cloudera Support.
- Alert Publisher can be configured to pass alert events to a user-defined script. Use this to integrate with other alerting systems or to apply custom logic (for example, to route different alerts to different recipients).
- Agent minor version mismatches (5.4 to 5.5) now cause bad host health. Maintenance version mismatches (for example, 5.4.x to 5.4.y) still cause concerning host health.
- Cloudera Manager indicates if the Java version in use is too old.
- Cloudera Manager indicates if the supervisor component of the Agent needs to be restarted after an upgrade.
- Full and User Administrators can view active user sessions. See Viewing User Sessions.
- Full Administrators and Auditors can audit failed and successful logins.
- Multiple user session logins can be disallowed.
- You can configure external authentication so that local administrator emergency access is disabled. This means that no local accounts can log in under any circumstances, including when the external system is not functioning.
- You can turn on authentication for the URLs for downloading client configuration zip files. Previously, authentication was never required.
- Passwords are no longer accessible in cleartext through the Cloudera Manager UI or in the configuration files stored on disk. See Cloudera Manager and Passwords. There are some exceptions; see Known Issues and Workarounds in Cloudera Manager 5.
- HBase
- Use a configuration option in HBase to skip region reload during rolling restart and rolling upgrade, to increase the speed of the operations.
- HBase rolling restart performance can be improved by increasing the number of Region Mover Threads. If the value of this property is 1, it can lower rolling restart speed. The Admin Console now displays this information and, if the value is 1, advises increasing it.
- HBase Thrift Server and Rest Server support TLS/SSL.
- HDFS
- HDFS encryption can be enabled using a wizard. See Enabling HDFS Encryption Using the Wizard.
- Exposes AES as an encryption option for HDFS RPC encryption.
- Hive
- Hive can use TLS/SSL and Kerberos at the same time.
- When Hive is configured to use TLS/SSL, Hue is automatically configured to use that protocol when communicating with Hive. Similarly, when Impala is configured to use TLS/SSL, Hue is automatically configured to use that protocol when communicating with Impala.
- HiveServer2 supports a timeout value for idle sessions and operations. By default, it times out client sessions after a week and idle operations after three days. This helps alleviate problems with long-running sessions when using Hue.
- Cloudera Manager collects and displays various operational metrics for Hive.
- Hue
- Hue supports a Load Balancer role using HTTPD as a load balancer.
- You can configure certificates trusted by Hue using the TLS/SSL Truststore configuration. This replaces the REQUESTS_CA_BUNDLE advanced configuration snippet entry.
- You can specify a password that protects the Hue private key file.
- Cloudera Manager collects and displays various operational metrics for Hue. New health tests have been added for Hue as well.
- Impala supports TLS/SSL internally between the StateStore and the Catalog Server roles as well as Impala Daemon.
- Kafka
- Kafka supports rolling restart.
- Kafka displays additional broker metrics.
- Kafka exposes additional commonly configured parameters.
- Existing Kafka parameter definitions have updated descriptions, default values, and validation settings.
- The Kafka broker instance list now shows which broker is the active controller.
- Key Trustee
- The Key Trustee Server CSD is included in Cloudera Manager. Manual installation of the Key Trustee Server CSD is not required.
- A Key Administrator role in Cloudera Manager is used for configuring HDFS Data at Rest Encryption. Only a Key Administrator and a Full Administrator can make configuration changes to Java Keystore KMS, Key Trustee KMS, and Key Trustee Server. Configuring HDFS to use Data at Rest Encryption is also limited to the Key Administrator and Full Administrator roles. This allows organizations to keep Key Administrators and Cluster Administrators separate, which is a security best practice.
- When running Key Trustee KMS in a highly available configuration, Cloudera Manager can automatically generate the load balancer URL.
- Sentry
- Sentry introduces column-level access control for tables in Hive and Impala. Previously, Sentry supported privilege granularity only at the table level. You can now assign the SELECT privilege on a subset of columns in a table. See Hive SQL Syntax for Use with Sentry.
- Sentry supports Kerberos authentication for the Sentry web server.
- Solr
- Solr can be configured with a load balancer in a secure environment.
- There is a new Solr Max Connector Threads property for Solr Server in CDH 5.1.0 and higher.
- Solr supports LDAP/AD authentication.
- Backup and Disaster Recovery
- The user interface for scheduling and reviewing replications and snapshots has been improved. You can now view the history of replication jobs and subtasks more easily. See Viewing Replication History.
- When specifying an HDFS replication job, you can apply exclusion filters to exclude specific files or directories. See Configuring Replication of HDFS Data.
- You can download or send to Cloudera Support a diagnostic bundle to troubleshoot replication jobs. Bundles include logs of the replication run. See Viewing Replication Schedules.
- The performance of the file-listing phase of a replication job has been improved.
- The performance of the initialization and running phase has been improved.
- The following advanced configuration snippets for configuring replications have been added:
- HDFS Replication Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh
- Hive Replication Advanced Configuration Snippet (Safety Valve) for hive-site.xml
- HDFS Replication Advanced Configuration Snippet (Safety Valve) for yarn-site.xml
- HDFS Replication Advanced Configuration Snippet (Safety Valve) for mapred-site.xml
- Snapshot properties for HBase such as thread pool size can be configured in the HBase Client Advanced Configuration Snippet (Safety Valve) for hbase-site.xml property.
- Hive partitions are chunked during export and import to avoid message size limitations.
- Hive replications validate metadata on the destination Hive Metastore before copying HDFS data from the source to avoid copying errors during replication.
- The use of snapshots to improve replications is documented. See Using Snapshots with Replication.
- The effect of network latency on replications is documented. See Network Latency and Replication.
- Scheduled snapshots can be disabled and re-enabled.
- API improvements:
- Explicit support for pausing snapshot policies
- Failed file listing
- Collection of diagnostic bundles for replication schedules and history
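The 5.5.0 change that starts non-dependent services in parallel on first start can be pictured as grouping services into waves, where every service in a wave depends only on services from earlier waves. The following is a minimal sketch of that idea, not Cloudera Manager's scheduler. The function name and the dependency graph are hypothetical.

```python
from typing import Dict, List, Set


def start_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group services into waves; services within a wave can start in parallel."""
    remaining = {svc: set(deps) for svc, deps in depends_on.items()}
    started: Set[str] = set()
    waves: List[List[str]] = []
    while remaining:
        # A service is ready once all of its dependencies have started.
        ready = sorted(s for s, deps in remaining.items() if deps <= started)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        waves.append(ready)
        started.update(ready)
        for svc in ready:
            del remaining[svc]
    return waves


if __name__ == "__main__":
    # Hypothetical dependency graph, for illustration only.
    example = {
        "ZooKeeper": set(),
        "HDFS": {"ZooKeeper"},
        "YARN": {"HDFS"},
        "HBase": {"HDFS", "ZooKeeper"},
    }
    print(start_waves(example))
```

With a serial start, the example takes four steps. Grouped into waves it takes three, because HBase and YARN do not depend on each other.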
What's New in Cloudera Manager 5.4.10
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.4.10.
What's New in Cloudera Manager 5.4.9
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.4.9.
What's New in Cloudera Manager 5.4.8
New ability to decommission hosts with stopped services
Adds ability to decommission roles when services are completely stopped. This allows users to decommission hosts during cluster downtime.
A number of issues have also been fixed. See Issues Fixed in Cloudera Manager 5.4.8.
What's New in Cloudera Manager 5.4.7
New service-level advanced configuration snippets for Solr
- Solr Service Advanced Configuration Snippet (Safety Valve) for core-site.xml
- Solr Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml
A number of issues have also been fixed. See Issues Fixed in Cloudera Manager 5.4.7.
What's New in Cloudera Manager 5.4.6
An issue has been fixed. See Issues Fixed in Cloudera Manager 5.4.6.
What's New in Cloudera Manager 5.4.5
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.4.5.
What's New in Cloudera Manager 5.4.3
- Rolling back a CDH 4 to CDH 5 upgrade is now supported using Cloudera Manager. See Rolling Back a CDH 4-to-CDH 5 Upgrade.
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.4.3.
What's New in Cloudera Manager 5.4.1
- The Cloudera Manager Express and Add Service wizards allow you to add a Hue service with multiple Hue Server roles. For Kerberized clusters, the Add Service wizard automatically adds a collocated Kerberos Ticket Renewer role for each Hue Server role instance.
- When Kerberos is enabled, Cloudera Manager now checks to ensure each Hue Server role is collocated with a Kerberos Ticket Renewer role. If you forget to add a Kerberos Ticket Renewer role when adding a new Hue Server role, a configuration error is generated.
- High availability for Cloudera Manager is now supported for 5.4. See Configuring Cloudera Manager for High Availability with a Load Balancer.
A number of issues have also been fixed. See Issues Fixed in Cloudera Manager 5.4.1.
What's New in Cloudera Manager 5.4.0
- OS - Added support for RHEL 6.6 and CentOS 6.6.
- Cloudera Manager prevents installing or upgrading to a CDH version that is too new for the Cloudera Manager version. When using parcels, it prevents parcel installation. When using packages, it prevents creating services.
- Installation and add service wizards now support the Oozie database.
- New wizard for NameNode, Failover Controller, and JournalNode role migration.
- The Parcel page has been redesigned to improve layout, performance, and ease of use. A new per-host parcel detail view has been added.
- Configuration
- Configuration pages use the new layout by default. The new layout is dramatically improved in terms of layout, performance, and ease of use. The existing layout is accessible via the Switch to the classic layout link.
- New configuration actions:
- Configuration can now be applied to all clusters as well as for a specific cluster.
- Several new configuration views have been added to show all non-default values across all clusters and the Cloudera Management Service, as well as differences across all clusters and multiple services of the same type.
- One-click differences in configuration settings for a specific service across multiple clusters.
- Support
- Include a Cloudera support ticket with YARN application support bundles.
- Reduce the size of support bundles by specifying log data of interest to include in the bundle.
- HDFS
- Support for HDFS DataNode hot swap.
- Option to include replication of extended attributes during HDFS replication. HDFS ACLs will now be replicated along with permissions.
- Added support for Hive on Spark. For more information, see Running Apache Hive on Spark in CDH.
- Security
- Secure impersonation support for the Hue HBase app.
- Redaction of sensitive data in log files and in SQL query history.
- Support for custom Kerberos principals.
- Added commands for regenerating Kerberos keytabs at service and host levels. These commands will clear existing keytabs from affected role instances and then trigger the Generate Credentials command to create new keytabs.
- Kerberos support for Sqoop 2.
- Kerberos and TLS/SSL support for Flume Thrift Source and Sink.
- Solr TLS/SSL support.
- Navigator Key Trustee Server can be installed and monitored by Cloudera Manager.
- HBase Indexer integration with Sentry (File-based) for authorization.
What's New in Cloudera Manager 5.3.10
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.10.
What's New in Cloudera Manager 5.3.9
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.9.
What's New in Cloudera Manager 5.3.8
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.8.
What's New in Cloudera Manager 5.3.7
An issue has been fixed. See Issues Fixed in Cloudera Manager 5.3.7.
What's New in Cloudera Manager 5.3.6
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.6.
What's New in Cloudera Manager 5.3.4
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.4.
What's New in Cloudera Manager 5.3.3
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.3.
What's New in Cloudera Manager 5.3.2
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.2.
What's New in Cloudera Manager 5.3.1
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.3.1.
What's New in Cloudera Manager 5.3.0
- JDK 1.8 - Cloudera Manager adds support for Oracle JDK 1.8.
- Single user mode - The Cloudera Manager Agent and all service processes can now be run as a single configured user in environments where running as root is not permitted. See Configuring Single User Mode.
- CDH upgrade wizard enhanced - The CDH upgrade wizard now supports minor and maintenance version upgrade as well as major version upgrade.
- Oozie Sharelib - The Oozie Sharelib can be updated without restarting the Oozie service.
- Read-only users prevented from viewing process logs or environment - Read-only users can no longer view the environment or logs of a process. This is to prevent read-only users from seeing potentially sensitive information.
- New icons for the KMS and Key Trustee services.
- Data-at-rest encryption
HDFS encryption implements transparent, end-to-end encryption of data read from and written to HDFS by creating encryption zones. An encryption zone is a directory in HDFS with every file and subdirectory in it encrypted. Use one of the following services to store, manage, and access encryption zone keys:
- KMS (File) - The Hadoop Key Management Server with a file-based Java keystore; maintains a single copy of keys, using simple password-based protection.
- KMS (Navigator Key Trustee) - An enterprise-grade key management service that replaces the file-based Java keystore and leverages the advanced key-management capabilities of Cloudera Navigator Key Trustee. Navigator Key Trustee is designed for secure, authenticated administration and cryptographically strong storage of keys on multiple redundant servers that can be located outside the cluster.
- The Cloudera Manager Server now reports the correct number of physical cores and hyper-threading cores if hyper-threading is enabled.
- Client configurations - Client configurations are now managed so that they are redeployed when a machine is re-imaged.
- Configuration
- NameNode configuration - The decommissioning parameters dfs.namenode.replication.max-streams and dfs.namenode.replication.max-streams-hard-limit are now available.
- Hue debug options - Two service-level configuration parameters have been added to the Hue service to enable Django debug mode and debugging of internal server error responses.
What's New in Cloudera Manager 5.2.7
An issue has been fixed. See Issues Fixed in Cloudera Manager 5.2.7.
What's New in Cloudera Manager 5.2.6
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.2.6.
What's New in Cloudera Manager 5.2.5
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.2.5.
What's New in Cloudera Manager 5.2.4
There are no changes for Cloudera Manager 5.2.4. It was released to provide the Cloudera Navigator fix in .
What's New in Cloudera Manager 5.2.2
- HDFS Decommissioning - The following decommissioning properties have been exposed in Cloudera Manager 5.2.2:
- Maximum number of replication threads on a DataNode (dfs.namenode.replication.max-streams)
- Hard limit on the number of replication threads on a DataNode (dfs.namenode.replication.max-streams-hard-limit)
- New icons for the KMS and Key Trustee services.
What's New in Cloudera Manager 5.2.1
- The YARN yarn.nodemanager.recovery.dir property can be configured.
- A health check indicates whether the HDFS metadata upgrade has not been finalized.
What's New in Cloudera Manager 5.2.0
- OS and database support - Adds support for Ubuntu Trusty (version 14.04) and PostgreSQL 9.3.
- Services - the following new services have been added:
- Isilon - supports the EMC Isilon distributed filesystem.
- KMS - the Java keystore-based key management server.
- Key Trustee - the enterprise-grade key management server using Cloudera Navigator Key Trustee.
- Spark - running Spark applications on YARN. The existing Spark service has been renamed Spark (Standalone).
- Accumulo - Kerberos authentication is now supported. If you have been using advanced configuration snippets (safety valves) to configure Kerberos with Accumulo, you may now remove those settings and have Cloudera Manager generate the principal and keytab file for you.
- HDFS Data at Rest Encryption -
- HBase - Support for configuring hedged reads has been added for HBase. Hedged reads are turned off by default. Cloudera Manager emits two properties, dfs.client.hedged.read.threadpool.size (default: 0) and dfs.client.hedged.read.threshold.millis (default: 500ms), to hbase-site.xml. For more information, see Hedged Reads.
- ZooKeeper - The RMI port can be configured. The port is configured using the JDK 7 flag -Dcom.sun.management.jmxremote.rmi.port. The default value is the same as the JMX Agent port. A special value of 0 or -1 disables the setting, and a random port is used. The configuration has no effect on versions lower than Oracle JDK 7u4.
- Cloudera Manager Agent configuration
- The supervisord port can now be configured in the Agent configuration supervisord_port. The change takes effect the next time supervisord is restarted (not simply when the Agent is restarted).
- Added an Agent configuration local_filesystem_whitelist that allows configuring the list of local filesystems that should always be monitored.
- Proxy user configuration
- All services' proxy user configuration properties have been moved to the HDFS service. Other services running on the cluster inherit the configuration values provided in HDFS. If you have previously configured a service to have values different from those configured in HDFS, then the proxy user configuration properties will be moved to that service's Advanced Configuration Snippet (Safety Valve) for core-site.xml to retain existing behavior.
Oozie and Solr are exceptions to this. Oozie proxy user configuration properties have been moved to Oozie Server Advanced Configuration Snippet (Safety Valve) for oozie-site.xml if they differ from HDFS. Solr proxy user configuration properties have been moved to Solr Service Environment Advanced Configuration Snippet (Safety Valve) if they differ from HDFS.
- Resource management - YARN and Llama integrated resource management and Llama high availability wizard.
- New and changed user roles - BDR Administrator, Cluster Administrator, Navigator Administrator, and User Administrator. The Administrator role has been renamed Full Administrator. See Cloudera Manager User Accounts.
- Configuration UI
- Cluster-wide configuration - you can view all modified settings and configure log directories, disk space thresholds, and port settings.
- New configuration layout - the new layout provides an alternate way to view configuration pages. In the classic layout, pages are organized by role group and categories within the role groups. The new layout allows you to filter on configuration status, category, and scope. On each configuration page you can easily switch between the classic and new layout.
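The ZooKeeper RMI port rules listed for 5.2.0 (default equal to the JMX Agent port, with 0 or -1 disabling the setting) can be summarized in a small sketch that builds the resulting JVM argument. The function name and the port numbers are hypothetical, and the sketch does not reflect how Cloudera Manager itself generates the flag. Only the -Dcom.sun.management.jmxremote.rmi.port property name comes from the notes above.

```python
from typing import List, Optional


def rmi_port_jvm_args(jmx_agent_port: int, rmi_port: Optional[int] = None) -> List[str]:
    """Return the JVM arguments that pin the JMX RMI port, if any."""
    # Default: use the same port as the JMX Agent.
    effective = jmx_agent_port if rmi_port is None else rmi_port
    if effective in (0, -1):
        # Setting disabled: no flag is passed, so the JVM picks a random port.
        return []
    return [f"-Dcom.sun.management.jmxremote.rmi.port={effective}"]


if __name__ == "__main__":
    print(rmi_port_jvm_args(9010))        # hypothetical JMX Agent port
    print(rmi_port_jvm_args(9010, 9011))  # explicit RMI port
    print(rmi_port_jvm_args(9010, 0))     # setting disabled
```

Pinning the RMI port is mainly useful when a firewall sits between the monitoring client and the ZooKeeper host, because a random port cannot be opened in advance.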
What's New in Cloudera Manager 5.1.6
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.1.6.
What's New in Cloudera Manager 5.1.5
A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.1.5.
What's New in Cloudera Manager 5.1.4
A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.1.4.
What's New in Cloudera Manager 5.1.3
A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.1.3.
- JDK Installation
- Users who are adding or upgrading hosts can now choose not to install the JDK that ships with Cloudera Manager.
What's New in Cloudera Manager 5.1.2
A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.1.2.
- New SAML configuration option
- You can now specify the binding protocol to be used for AuthNResponses sent from the IDP to Cloudera Manager. Previously, Cloudera Manager would only use HTTP-Artifact, but it is now possible to choose HTTP-Post. HTTP-Artifact remains the default binding.
What's New in Cloudera Manager 5.1.1
An issue has been fixed. See Issues Fixed in Cloudera Manager 5.1.1.
What's New in Cloudera Manager 5.1.0
- SSL Encryption
- Supports several new SSL-related configuration parameters for HDFS, MapReduce, YARN and HBase, which allow you to configure and enable encrypted shuffle and encrypted web UIs for these services. See Configuring TLS/SSL Encryption for CDH Services.
- Cloudera Manager now also supports the monitoring of HDFS, MapReduce, YARN, and HBase when SSL is enabled for these services. New configuration parameters allow you to specify the location and password of the truststore used to verify certificates in HTTPS communication with CDH services and the Cloudera Manager Server.
- Sentry Service
- A new Sentry service that stores the authorization metadata in an underlying relational database and allows you to use Grant/Revoke statements to modify privileges. See The Sentry Service.
- You can also configure the Sentry service to allow Pig, MapReduce, and WebHCat queries access to Sentry-secured data stored in Hive. See Configuring Pig and HCatalog for the Sentry Service.
- Kerberos Authentication
- Now supports a Kerberos cluster using an Active Directory KDC.
- New wizard to enable Kerberos on an existing cluster. The wizard works with both MIT KDC and Active Directory KDC.
- Ability to configure and deploy Kerberos client configuration (krb5.conf) on a cluster.
- Spark Service - added the History Server role
- Impala - added support for Llama ApplicationMaster High Availability
- User Roles - there are two new roles: Operator and Configurator that support fine-grained access to Cloudera Manager features. See Cloudera Manager User Accounts.
- Monitoring
- Updates to Oozie monitoring
- New Hive metastore canary
- UI - The UI has been updated to improve scalability. The tab can be configured to display clusters in a full or summary format. There is a new Cluster page for each cluster. The Hosts and Instances pages have added faceted filters.
What's New in Cloudera Manager 5.0.7
A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.0.7.
What's New in Cloudera Manager 5.0.6
A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.0.6.
What's New in Cloudera Manager 5.0.5
A number of issues have been fixed. See Fixed Issues in Cloudera Manager 5.0.5.
What's New in Cloudera Manager 5.0.2
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.0.2.
What's New in Cloudera Manager 5.0.1
A number of issues have been fixed. See Issues Fixed in Cloudera Manager 5.0.1.
- Monitoring
- The Java Garbage Collection Duration health test for the Service Monitor, Host Monitor, and Activity Monitor has been replaced with the new Java Pause Duration health test.
What's New in Cloudera Manager 5.0.0
- Service and Configuration Management
- HDFS - cache management
- Resource Management - Impala admission control
- Monitoring
- Host disks overview
- Impala best practices
- HBase table statistics
- HDFS cache statistics
What's New in Cloudera Manager 5.0.0 Beta 2
- Service and Configuration Management
- HDFS
- HDFS NFS Gateway role
- Supports restoration of HDFS data from a snapshot
- YARN
- YARN Resource Manager High Availability
- Resource pool scheduler
- Support for Spark service
- Support for Accumulo service
- Support for service extensibility
- Support to set up Oozie server High Availability
- Granular configuration staleness UI
- Support for setting maximum file descriptors
- Monitoring
- Support for monitoring the Cloudera Search/Solr service
- New "failed" and "killed" badges displayed for unsuccessful YARN applications
- More attributes available for filtering displays of YARN applications and Impala queries
- New operational reports added for HBase tables and namespaces, Impala queries, and YARN applications
- Support for creating user-defined triggers for metrics accessible via charts/tsquery
- Charting improvements
- New table chart type
- New options for displaying data and metadata from charts
- Support for exporting data from charts to CSV or JSON files
- Administrative Settings
- Added a new role type with limited administrator capabilities.
- Cloudera Manager Server and all JVMs will create a heap dump if they run out of memory.
- Configure the location of the parcel directory and specify whether and when to remove old parcels from cluster hosts.
What's New in Cloudera Manager 5.0.0 Beta 1
- CDH Version
- Supports both CDH 4 and CDH 5
- CDH 4 to CDH 5 upgrade wizard
- Support for YARN as a production execution environment
- MapReduce (MRv1) to YARN (MRv2) configuration import
- YARN-based resource management for Impala 1.2
- JDK Version - Cloudera Manager 5 supports and installs both JDK 6 and JDK 7.
- Resource Management
- Static and dynamic partitioning of resources: provides a wizard for configuring static partitioning of resources (cgroups) across core services (HBase, HDFS, MapReduce, Solr, YARN) and dynamic allocation of resources for YARN and Impala.
- Pool, resource group, and queue administration for YARN and Impala.
- Usage monitoring and trending.
- Monitoring
- YARN service monitoring
- YARN (MRv2) job monitoring
- Configurable histograms of Impala query and YARN job attributes that can be used to quickly filter query and application lists
- Scalable back-end database for monitoring metrics
- Charting improvements
- New chart types: histogram and heatmap
- New scale types: logarithmic and power
- Updates to tsquery language: new attribute values to support YARN and new functions to support new chart types
- Extensibility
- Ability to manage both ISV applications and non-CDH services (for example, Accumulo, Spark, and so on)
- Working with select ISVs as part of Beta 1
- Single Sign-On - Support for SAML to enable single sign-on
- Parcels
- Dependency enforcement to ensure incompatible parcels are not used together
- Option to not cache downloaded parcels, to save disk space
- Improved error reporting for management operations
- Backup and Disaster Recovery (BDR)
- HBase and HDFS snapshots: Supports scheduling snapshots on a recurring basis.
- Support for YARN (MRv2): Replication jobs can now run using YARN (MRv2) instead of MRv1.
- Global replication page: All scheduled snapshots (HDFS and HBase) and replication jobs for either HDFS or Hive are shown on a single Replications page.
- Other
- Global Search box
- Several usability improvements
- Comprehensive detection of configuration changes that require service restarts, refresh and redeployment of client configurations
Incompatible Changes in Cloudera Manager 5
The following sections describe incompatible changes in each Cloudera Manager 5 release.
- Incompatible Changes Introduced in Cloudera Manager 5.15.0
- Incompatible Changes Introduced in Cloudera Manager 5.5.0
- Incompatible Changes Introduced in Cloudera Manager 5.4.0
- Incompatible Changes Introduced in Cloudera Manager 5.3.0
- Incompatible Changes Introduced in Cloudera Manager 5.2.0
- Incompatible Changes Introduced in Cloudera Manager 5.1.0
- Incompatible Changes Introduced in Cloudera Manager 5.0.0
- Incompatible Changes Introduced in Cloudera Manager 5.0.0 Beta 2
- Incompatible Changes Introduced in Cloudera Manager 5.0.0 Beta 1
Incompatible Changes Introduced in Cloudera Manager 5.15.0
- Oozie Load Balancer Configuration Parameter -
The format of the oozie_load_balancer Cloudera Manager configuration parameter has changed. Previously, the value was specified in <hostname>:<port> format. In Cloudera Manager 5.15 and later, the format is <hostname>. This format change is incompatible.
Any client that reads this value through the API should also read, as necessary, the load balancer port configuration parameter: oozie_load_balancer_http_port or oozie_load_balancer_https_port. The correct port parameter depends on whether TLS/SSL is enabled. Check the oozie_use_ssl property if you are unsure.
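A client that has to work against both the old and the new value format could resolve the load balancer address along the following lines. This is a minimal sketch, not Cloudera code: the function name and the `config` dict (a plain mapping of already-fetched Oozie configuration values) are hypothetical, the example assumes oozie_use_ssl arrives as a boolean or as the string "true"/"false", and it does not handle IPv6 literals.

```python
def oozie_load_balancer_address(config):
    """Return (host, port) for the Oozie load balancer.

    `config` is a hypothetical dict of configuration values that the
    client has already read; fetching them is out of scope here.
    """
    raw = config["oozie_load_balancer"]

    # Pre-5.15 format: <hostname>:<port>, so the port is embedded in the value.
    host, sep, embedded_port = raw.partition(":")
    if sep:
        return host, int(embedded_port)

    # 5.15 and later: <hostname> only. Pick the port parameter that matches
    # the TLS/SSL setting (assumed to be a bool or "true"/"false").
    use_ssl = str(config.get("oozie_use_ssl", "false")).lower() == "true"
    port_key = (
        "oozie_load_balancer_https_port"
        if use_ssl
        else "oozie_load_balancer_http_port"
    )
    return host, int(config[port_key])
```

Checking for an embedded port first lets the same client code run before and after an upgrade to Cloudera Manager 5.15.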
Incompatible Changes Introduced in Cloudera Manager 5.5.0
- Cloudera Manager no longer supports JDK 1.6.
Incompatible Changes Introduced in Cloudera Manager 5.4.0
- The Blacklisted Products property has been removed from the Hosts > Parcels configuration.
Incompatible Changes Introduced in Cloudera Manager 5.3.0
- Oozie metrics - The Oozie metrics framework is now controlled by the Enable The Metrics Instrumentation Service flag, which is enabled by default. When enabled, the old 'instrumentation' REST end-point is disabled and metrics are available on the new 'metrics' REST end-point (hostname:port/v2/admin/metrics).
Incompatible Changes Introduced in Cloudera Manager 5.2.0
- Due to various internal changes to configuration generation, all service and client configurations will be stale after upgrade. To propagate the updates, restart the cluster and redeploy client configurations.
Incompatible Changes Introduced in Cloudera Manager 5.1.0
- The Limited Administrator role has been renamed Limited Operator. The Limited Operator role is no longer available in Cloudera Manager Express. If you upgrade a Cloudera Manager Express installation, users in the Limited Operator role will not be able to log in. A user in the Administrator role must assign the Read-Only or Administrator role to those users.
Incompatible Changes Introduced in Cloudera Manager 5.0.0
- Cloudera Manager API
- New upgradeCdh command, which upgrades CDH cluster versions. Use this command to upgrade clusters from CDH 4 to CDH 5. The upgradeServices command previously used to upgrade CDH cluster versions is no longer supported.
- The hostId field now contains a unique UUID and no longer matches the hostName field. When referring to a host, both hostId and hostName are accepted. However, any API clients that were previously cross-referencing host records with external information by hostName, but were using the hostId field in the API, must be updated to use the hostName field. Clients updated in this manner will function correctly with older versions of Cloudera Manager because the hostName field has always been present.
- The clusterName field displayed when viewing service and role references is now an internal name and may not match the external displayName field of the cluster.
- All CDH 5 versions of Hue work only with the default system Python version of the operating system they are installed on. For example, on RHEL/CentOS 6, you need Python 2.6 to start Hue.
- Cloudera Manager 5.0 includes a change to the value of the snmpTrapOID. Earlier releases set the value of snmpTrapOID (OID: .1.3.6.1.6.3.1.1.4.1.0) wrongly to clouderaManagerMIBNotifications (OID .1.3.6.1.4.1.38374.1.1.1). This is fixed in Cloudera Manager 5.0 with the correct value, which is clouderaManagerAlert (OID .1.3.6.1.4.1.38374.1.1.1.1). This change will break SNMP server setups that are configured to expect clouderaManagerMIBNotifications. Cloudera Manager administrators should configure their SNMP receivers to accept the corrected OID.
- The default values for the following configurations have changed to include the JVM option -Djava.net.preferIPv4Stack=true, which sets the preferred protocol stack to IPv4 on dual-stack machines. Any values set to the old defaults will automatically be changed to the new default when upgrading to Cloudera Manager 5.
- MapReduce client configuration:
- hadoop-env.sh: added to HADOOP_CLIENT_OPTS
- mapred-site.xml: added to mapred.child.java.opts
- YARN client configuration:
- hadoop-env.sh: added to YARN_OPTS
- mapred-site.xml: added to yarn.app.mapreduce.am.command-opts, mapreduce.map.java.opts, and mapreduce.reduce.java.opts
- HDFS client configuration: hadoop-env.sh: added to HADOOP_CLIENT_OPTS
- Hive client configuration: hive-env.sh: added to HADOOP_CLIENT_OPTS
- MapReduce health tests have been removed:
- Job failure
- Map backlog
- Reduce backlog
- Map locality
- Looks at all the jobs that completed in the last hour and, if more than 10% of them failed, changes the health of the service to concerning:
IF (select (jobs_failed_rate * 3600) as jobs_failed, ((jobs_failed_rate + jobs_completed_rate + jobs_killed_rate) * 3600) as all_jobs where roleType=JOBTRACKER AND serviceName=$SERVICENAME and last(jobs_failed_rate / (jobs_failed_rate + jobs_completed_rate + jobs_killed_rate)) >= 10 ending at $END_TIME duration "PT3600S") DO health:concerning
- If there are more than 50% more maps waiting than total slots available, the health changes to concerning.
IF (select waiting_maps / map_slots where roleType=JOBTRACKER and serviceName=$SERVICENAME and last(waiting_maps / map_slots) > 50) DO health:concerning
- If there are more than 50% more reduces waiting than total slots available, the health changes to concerning.
IF (select waiting_reduces / reduce_slots where roleType=JOBTRACKER and serviceName=$SERVICENAME and last(waiting_reduces / reduce_slots) > 50) DO health:concerning
- HDFS checkpointing metrics have been removed:
- end_checkpoint_num_ops
- end_checkpoint_avg_time
- start_checkpoint_num_ops
- start_checkpoint_avg_time
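For the hostId change described under Cloudera Manager API in this section, the recommended client update is to cross-reference host records by hostName. The following is a minimal sketch: the helper name, the sample records, and the external inventory are made up for illustration, and only the hostId and hostName field names come from the text above.

```python
def index_hosts_by_name(api_hosts):
    """Map hostName -> host record so external data can be joined by name."""
    return {host["hostName"]: host for host in api_hosts}


# Hypothetical records shaped like API host objects.
api_hosts = [
    # Cloudera Manager 5 style: hostId is a UUID.
    {"hostId": "5d2a1c3e-0b7f-4c55-9a51-2f6d8e1b7c90", "hostName": "node1.example.com"},
    # Older style: hostId happened to match hostName.
    {"hostId": "node2.example.com", "hostName": "node2.example.com"},
]

# Hypothetical external information keyed by host name.
external_rack = {"node1.example.com": "rack-a", "node2.example.com": "rack-b"}

by_name = index_hosts_by_name(api_hosts)
racks = {name: external_rack.get(name) for name in by_name}
```

Because hostName is present in both record styles, the same join works whether or not hostId matches the host name.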
Incompatible Changes Introduced in Cloudera Manager 5.0.0 Beta 2
- Impala releases earlier than 1.2.1 are no longer supported.
- Some of the constants identifying health tests have changed. The following existed in Cloudera Manager 4:
- FAILOVERCONTROLLER_FILE_DESCRIPTOR
- FAILOVERCONTROLLER_HOST_HEALTH
- FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
- FAILOVERCONTROLLER_SCM_HEALTH
- FAILOVERCONTROLLER_UNEXPECTED_EXITS
They are now:
- MAPREDUCE_FAILOVERCONTROLLER_FILE_DESCRIPTOR
- MAPREDUCE_FAILOVERCONTROLLER_HOST_HEALTH
- MAPREDUCE_FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
- MAPREDUCE_FAILOVERCONTROLLER_SCM_HEALTH
- MAPREDUCE_FAILOVERCONTROLLER_UNEXPECTED_EXITS
and
- HDFS_FAILOVERCONTROLLER_FILE_DESCRIPTOR
- HDFS_FAILOVERCONTROLLER_HOST_HEALTH
- HDFS_FAILOVERCONTROLLER_LOG_DIRECTORY_FREE_SPACE
- HDFS_FAILOVERCONTROLLER_SCM_HEALTH
- HDFS_FAILOVERCONTROLLER_UNEXPECTED_EXITS
The reason for the change is to better distinguish between MapReduce and HDFS failover controller monitoring in the health system.
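Client code that matches on the old constants can be updated mechanically, because each old name maps to one MapReduce name and one HDFS name. The sketch below illustrates that mapping. The function name is hypothetical, and which of the two new names applies depends on the service being monitored.

```python
OLD_PREFIX = "FAILOVERCONTROLLER_"
RENAMED_SUFFIXES = {
    "FILE_DESCRIPTOR",
    "HOST_HEALTH",
    "LOG_DIRECTORY_FREE_SPACE",
    "SCM_HEALTH",
    "UNEXPECTED_EXITS",
}


def renamed_health_tests(old_name):
    """Return the Cloudera Manager 5 names for a Cloudera Manager 4 constant.

    Names that are not in the renamed set are returned unchanged.
    """
    if old_name.startswith(OLD_PREFIX) and old_name[len(OLD_PREFIX):] in RENAMED_SUFFIXES:
        return ["MAPREDUCE_" + old_name, "HDFS_" + old_name]
    return [old_name]
```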
Incompatible Changes Introduced in Cloudera Manager 5.0.0 Beta 1
- Services
- Impala - With Cloudera Manager 4.8 (released in late November 2013), only Impala 1.2.1 is supported, due to the introduction of the Impala Catalog Server. However, CDH 5.0.0 Beta 1 was released with Impala 1.2.0 (Beta). Therefore, if you upgrade from Cloudera Manager 4.8 (with Impala 1.2.1) to Cloudera Manager 5.0.0 Beta 1, and then upgrade your CDH to CDH 5.0.0 Beta 1, your version of Impala will be downgraded to Impala 1.2.0 from 1.2.1. This will result in some loss of functionality. See New Features in Impala for a list of the new features in Impala 1.2.1 that are not in Impala 1.2.0 (Beta).
- Hive - HiveServer 2 is a mandatory role for Hive in CDH 5.
- Hue - In CDH 5, Hue no longer has a Beeswax Server role. Hue now submits queries to HiveServer2.
- HDFS - Cloudera Manager 5 does not support NFS-mounted shared edits directories for HDFS High Availability. It only supports the Quorum Journal method for shared edits. If you upgrade from Cloudera Manager 4 with a working CDH 4 High Availability configuration that uses NFS-mounted directories, your installation will continue to work until you disable High Availability. You will not be able to re-enable High Availability with NFS-mounted directories. Furthermore, you will not be able to upgrade to CDH 5 unless you disable High Availability, and you will need to use Quorum-based storage in order to re-enable High Availability after the upgrade.
- YARN
- The YARN (MRv2) configuration mapreduce.job.userlog.retain.hours has been replaced by yarn.log-aggregation.retain-seconds. Any existing value in mapreduce.job.userlog.retain.hours will be lost. However, this configuration never had any effect, so no functionality is affected.
- The following configuration parameters were removed from YARN. These never had any effect, so no functionality is affected.
- mapreduce.jobtracker.maxtasks.perjob
- mapreduce.jobtracker.handler.count (non-functional duplicate of yarn.resourcemanager.resource-tracker.client.thread-count)
- mapreduce.jobtracker.persist.jobstatus.active
- mapreduce.jobtracker.persist.jobstatus.hours
- mapreduce.job.jvm.numtasks
- The following YARN configuration parameters were replaced. Only the YARN parameters were replaced. Old configurations will be lost, but they never had any effect so this does not affect functionality.
- mapreduce.jobtracker.restart.recover replaced by yarn.resourcemanager.recovery.enabled (changed from Gateway to ResourceManager)
- mapreduce.tasktracker.http.threads replaced by mapreduce.shuffle.max.connections
- mapreduce.jobtracker.staging.root.dir replaced by yarn.app.mapreduce.am.staging-dir
- Cloudera Manager 5 sets the default YARN Resource Scheduler to FairScheduler. If a cluster was previously running YARN with the FIFO scheduler, it will be changed to FairScheduler the next time YARN restarts. The FairScheduler is only supported with CDH 4.2.1 and later, and older clusters may hit failures and need to manually change the scheduler to FIFO or CapacityScheduler. See the Known Issues section of this Release Note for information on how to change the scheduler back to FIFO or CapacityScheduler.
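Before upgrading, the removed and replaced YARN parameters listed in this section can be checked against an existing configuration. The following is a minimal sketch, assuming the old YARN configuration is available as a plain dict of property names to values. The function and variable names are hypothetical. Old values are not carried over, so the sketch only reports names.

```python
REMOVED = {
    "mapreduce.jobtracker.maxtasks.perjob",
    "mapreduce.jobtracker.handler.count",
    "mapreduce.jobtracker.persist.jobstatus.active",
    "mapreduce.jobtracker.persist.jobstatus.hours",
    "mapreduce.job.jvm.numtasks",
}

REPLACED = {
    "mapreduce.job.userlog.retain.hours": "yarn.log-aggregation.retain-seconds",
    "mapreduce.jobtracker.restart.recover": "yarn.resourcemanager.recovery.enabled",
    "mapreduce.tasktracker.http.threads": "mapreduce.shuffle.max.connections",
    "mapreduce.jobtracker.staging.root.dir": "yarn.app.mapreduce.am.staging-dir",
}


def audit_yarn_config(old_config):
    """Return (removed_keys, replaced_keys) found in a pre-upgrade config dict."""
    removed = sorted(key for key in old_config if key in REMOVED)
    replaced = {key: REPLACED[key] for key in sorted(old_config) if key in REPLACED}
    return removed, replaced
```

For example, `audit_yarn_config({"mapreduce.tasktracker.http.threads": "40"})` reports that the property is replaced by mapreduce.shuffle.max.connections, which would then have to be set separately if a non-default value is wanted.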
Changed Features and Behaviors in Cloudera Manager 5
The following sections describe what's changed in each Cloudera Manager 5 release.
- What's Changed in Cloudera Manager 5.16
- What's Changed in Cloudera Manager 5.15
- What's Changed in Cloudera Manager 5.14
- What's Changed in Cloudera Manager 5.13
- What's Changed in Cloudera Manager 5.12
- What's Changed in Cloudera Manager 5.11
- What's Changed in Cloudera Manager 5.10
- What's Changed in Cloudera Manager 5.9.0
- What's Changed in Cloudera Manager 5.8.0
- What's Changed in Cloudera Manager 5.7.0
- What's Changed in Cloudera Manager 5.5.0
- What's Changed in Cloudera Manager 5.4.1
- What's Changed in Cloudera Manager 5.4.0
- What's Changed in Cloudera Manager 5.3.2
- What's Changed in Cloudera Manager 5.3.0
- What's Changed in Cloudera Manager 5.2.1
- What's Changed in Cloudera Manager 5.2.0
- What's Changed in Cloudera Manager 5.1.0
- What's Changed in Cloudera Manager 5.0.0
- What's Changed in Cloudera Manager 5.0.0 Beta 2
- What's Changed in Cloudera Manager 5.0.0 Beta 1
What's Changed in Cloudera Manager 5.16
- Backup and Disaster Recovery - BDR now ignores Kudu tables during replication. The change does not affect functionality since BDR does not support Kudu tables. This change was made to guard against data loss due to how the Hive Metastore, Impala, and Kudu interact.
What's Changed in Cloudera Manager 5.15
- Host Inspector - Previously, the Host Inspector displayed a list of components and version numbers. Now, all mismatched versions of a component are grouped together.
- Key Trustee Server - You cannot run the CDH Upgrade wizard from Key Trustee Server clusters. Previously, you could run the upgrade wizard on a Key Trustee Server cluster even though it is not part of the CDH upgrade.
To upgrade Key Trustee Server, distribute and activate a new Key Trustee Server parcel. Then restart the Key Trustee Server service. OPSAPS-42792
- Key Trustee Server KMS Service - Previously, you always had to enter a key admin user when you added a Key Trustee Service. Now, as long as the key admin user or key admin group contains a non-empty value, the Generate ACLs button is enabled.
- Menu and UI Labels
Previously, Hive-on-S3 was a Replication Option when you created or edited a Hive replication schedule. The option has been renamed Hive-on-Cloud since Microsoft ADLS is now supported by Backup and Disaster Recovery.
- Stale Services - Previously, stale services were not preselected in the Restart Stale Services wizard. They are now preselected.
What's Changed in Cloudera Manager 5.14
- API
- The enum field apiCluster.apiClusterVersion is deprecated. Use the string field apiCluster.fullVersion instead.
- You can now configure the timeout for TimeSeriesQueryService with the API. Previously, there was a hard-coded 20 second timeout.
- Authorization
- The Reports Manager role is no longer required to run Cloudera Management Service.
- When Cloudera Manager uses external authentication, such as LDAP, role assignment from the Users page is now disabled.
- BDR
- You no longer have to select the Skip checksum check option for BDR replication to function properly when one or both clusters use encryption. The checksum check is transparently and automatically skipped in this situation.
- Removed an extra RPC call to the source NameNode in the mapper phase of backup replications. The fix reduces the number of RPC calls to the NameNode to less than half. After this fix:
- Customers will see an improvement in replication (Backup and Disaster Recovery) performance.
- Customers will be better able to scale their replication jobs and the production cluster NameNode is no longer a bottleneck during the mapper phase.
- Installation
- CDH4 no longer appears as an install option in the Cloudera Manager Installation Wizard. CDH4 builds will also not be recognized if users try to use the REST API to install CDH4 on a new host.
- The list of available versions of CDH5 for package-based installation is now obtained by querying archive.cloudera.com. New maintenance versions of CDH will now be available for install without having to upgrade Cloudera Manager. If archive.cloudera.com is not reachable, the list of CDH versions available will be empty.
- Cloudera Manager now only installs Java 7. Previously when you installed Java on managed hosts using Cloudera Manager, both Java 6 and 7 were installed.
What's Changed in Cloudera Manager 5.13
- External Accounts
The AWS Credentials and Altus Credentials under Administration have been consolidated. You can find the pages for the credentials by selecting .
- RPC Wait Behavior
When starting or restarting HDFS, Cloudera Manager now waits for any NameNode daemons that are started to begin responding to RPC requests before considering the start/restart operation to be complete. Previously, this wait was only performed in certain workflows; now, it is always done when starting HDFS.
- Hue Load Balancer Enabled by Default
For better Hue web site performance, the Hue Load Balancer is now enabled by default. In a secure cluster, Hue Load Balancer (Apache httpd 2.4) requires "use_x_forwarded_host" to be set to "true". This change will cause staleness.
What's Changed in Cloudera Manager 5.12
- Backup and Disaster Recovery
- Replicate Impala metadata feature for CDH 5.12 or later
Replication of “native” Impala UDFs (transient UDFs created with the old syntax prior to CDH 5.7) is no longer available for CDH clusters that run version 5.12 or later because Impala native UDFs are deprecated. If you still use Impala native UDFs and want them to be replicated, you should recreate them within the Hive Metastore using the new Create Function syntax supported since CDH 5.7.
- Dynamic Resource Pool Scheduling Rules
You can no longer specify a one-time scheduling rule for a dynamic resource pool. Recurring scheduling rules are still supported. To make one-time changes to resource pool configuration, update the pools through the Cloudera Manager UI or API.
- Add Cluster Wizard
MapReduce1 has been removed as an option from the Add Cluster Wizard. It can still be added to a cluster after initial cluster creation or through the API.
- Web UI
- Icons
The icons in the Cloudera Manager web UI are more usable and distinct.
- Stack Traces
By default, stack traces are no longer displayed in the web UI. Please view the Cloudera Manager Server log under
to view stack traces of exceptions generated by Cloudera Manager.
- Key Trustee Parcel
The Key Trustee Parcel will no longer be released via archive.cloudera.com. The parcel will now be released via http://www.cloudera.com/downloads. Parcels already released on the archive site will continue to be available there.
- Database names and DB usernames for the Cloudera Manager database.
You can now only use alphanumeric characters and underscores for database names and usernames with the scm_prepare_database.sh script to ensure properly supported database names and usernames.
- Cloudera Manager Agent Dependencies
The Cloudera Manager Agent has new package dependencies. The netstat and ifconfig commands have been replaced with the ss and ip commands respectively.
For more information, see iproute package requirement.
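The naming restriction for scm_prepare_database.sh described in this section can be checked before running the script. The following is a minimal sketch, assuming that "alphanumeric" means ASCII letters and digits. The function name is hypothetical and is not part of the script.

```python
import re

# ASCII letters, digits, and underscores only (assumed reading of the rule).
_ALLOWED = re.compile(r"[A-Za-z0-9_]+")


def is_allowed_db_identifier(name):
    """Return True if `name` uses only alphanumeric characters and underscores."""
    return _ALLOWED.fullmatch(name) is not None
```

Under this reading, a name such as `scm_db1` passes the check, while `scm-db` or a name containing a space does not.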
What's Changed in Cloudera Manager 5.11
- Change to default fencing method for HDFS High Availability
The default value of the HDFS High Availability Fencing Methods property (dfs.ha.fencing.methods) has been changed from shell(./cloudera_manager_agent_fencer.py) to shell(true). While the old default value was reasonable for early versions of HDFS, it is no longer recommended for HAQJ-based versions of HDFS HA, as it can lead to HDFS service outages. The new default, shell(true), is the setting that Cloudera recommends. It uses the built-in HDFS fencing method, which causes the fenced NameNode to exit if it attempts a write operation when it is not supposed to be active. If auto-restart is enabled for the NameNode, Cloudera Manager will then restart it.
Most clusters already have the HDFS High Availability Fencing Methods explicitly set to shell(true) since that is the value set by the HDFS Enable High Availability Wizard. When such clusters are upgraded to Cloudera Manager 5.11, this explicit setting of shell(true) will be removed. This does not change the effective value of the property since it takes on the new default value, which is also shell(true). The cluster will experience no change in fencing behavior.
When clusters that have HDFS High Availability Fencing Methods set to the pre-5.11 default value are upgraded to Cloudera Manager 5.11, the effective value of the property changes to shell(true), and the fencing behavior changes accordingly. This also causes the HDFS service to become stale, requiring a restart of HDFS and dependent services. If HDFS High Availability Fencing Methods is set to a non-default value other than shell(true), no change occurs in the property's value when the cluster is upgraded to Cloudera Manager 5.11. Because the fencing method shell(./cloudera_manager_agent_fencer.py) can lead to service outages, a new configuration warning message will be displayed if it is in use after the upgrade to Cloudera Manager 5.11 is complete.
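The three upgrade cases above can be summarized as a small decision function. This is an illustrative sketch of the rules as stated in this section, not Cloudera Manager code. The function name is hypothetical, and the input is assumed to be the effective value of dfs.ha.fencing.methods before the upgrade.

```python
OLD_DEFAULT = "shell(./cloudera_manager_agent_fencer.py)"
NEW_DEFAULT = "shell(true)"


def fencing_after_upgrade(value_before):
    """Return (effective_value_after, fencing_behavior_changes)."""
    if value_before == OLD_DEFAULT:
        # Pre-5.11 default: the effective value moves to the new default.
        return NEW_DEFAULT, True
    if value_before == NEW_DEFAULT:
        # The explicit setting is dropped, but the effective value is the same.
        return NEW_DEFAULT, False
    # Any other non-default value is left untouched by the upgrade.
    return value_before, False
```

Only the first case changes fencing behavior, and that is the case in which the HDFS service becomes stale and needs a restart.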
What's Changed in Cloudera Manager 5.10
- Configuration Changes
- Llama configuration options removed
Llama roles must be removed before upgrading to CDH 5.10.0 or higher. If your cluster has an Impala Llama role and you are using Cloudera Manager to upgrade, Cloudera Manager displays an error message and prevents the upgrade from going forward. You must first remove any existing Llama roles, using the Disable YARN and Impala Integrated Resource Management command (Before upgrading, go to the Impala Service and select .) Cloudera Manager also generates a configuration error if a Llama role is added to a CDH 5.10 cluster, and prevents Impala from starting until the role is removed.
- JAVA_HOME in agent configuration deprecated in 5.x
Specifying the host agent environment variable CMF_AGENT_JAVA_HOME is deprecated and will not be supported in a future release. Instead, specify the Java Home Directory property in Cloudera Manager.
- The hbase.client.scanner.timeout.period property is now configurable for clients
The Scan API-related property hbase.client.scanner.timeout.period in the HBase service is now configurable at the Client (Gateway) level. Its value may be set to less than or equal to the RegionServer-side equivalent, HBase RegionServer Lease Period.
- Warning if KeyTrustee Server is located on cluster machines
The KeyTrustee Server should be installed on dedicated hosts in a separate cluster. While installing it on existing CDH cluster hosts is adequate for proof-of-concept deployments, Cloudera does not recommend such installations for use in a production environment. Cloudera Manager now shows a warning to the user when adding a Key Trustee Server service onto an existing CDH cluster host.
What's Changed in Cloudera Manager 5.9.0
- You must have the BDR, Cluster, or Full Administrator role to view the following pages:
- HDFS File Browser
- Directory Usage Report
- HBase Table Browser
- Solr Collections
- HBase Table Statistics
What's Changed in Cloudera Manager 5.8.0
- The YARN service's list of Allowed System Users now includes the hbase user by default. The reason for this change is that several essential HBase tools, such as the MOB Sweeper, Import/Export tools, and CopyTable, need to interact with HBase as the hbase user to be able to execute MapReduce jobs.
Note that this change is only applicable to new Cloudera Manager deployments. Upgrading to Cloudera Manager 5.8 will not add the hbase user to the list of defaults.
What's Changed in Cloudera Manager 5.7.0
- The Navigator Metadata Server requires 192 MiB of Java PermGen space instead of 128 MiB. The value of this internal setting used by the JDK is increased automatically when upgrading to Cloudera Manager 5.7.
- The default value for hive.compute.query.using.stats is changed to false. The reason for the change is that certain queries such as count, max, and min return incorrect results with this optimization on.
- By default, Hive sessions now only consider sessions with no recent activity to be idle (hive.server2.idle.session.timeout_check_operation) and idle session timeouts have been reduced (hive.server2.idle.session.timeout and hive.server2.idle.operation.timeout). This helps reduce the strain on HiveServer2 from too many open sessions.
- Cloudera Manager no longer automatically refreshes scheduler configurations when dynamic resource pool settings are changed. You must explicitly refresh the configurations. This allows you to schedule the changes to minimize the impact on your cluster.
- For YARN, the default number of log directories (yarn.nodemanager.log-dirs) has changed from 1 to be equal to the number of mount points, to prevent applications with a large number of logs from filling up a single disk.
- The default for Java Heap Size of JournalNode in Bytes is now 512 MB.
- The Sources page for HDFS and Hive replications has been removed. A list of sources is available from a drop-down menu when you schedule a replication.
- The number of watched directories you can specify for the Disk Usage Report is now unlimited.
- Cloudera Manager now uses a new memory allocation algorithm to allocate memory when multiple roles are installed on the same host. See Memory.
- User sessions in the Cloudera Manager Admin Console now time out after a configurable period of inactivity. A dialog box warns the user before automatically logging the user out.
- The All Recent Commands page now loads more quickly.
- The Disk Usage reports now have links that take users to the Directory Usage Report with the correct filter applied.
- When searching for hosts on the Hosts page, you can now filter the hosts list by entering search terms (hostname, IP address, or role) in the search box separated by commas or spaces. You can use quotes for exact matches (for example, strings that contain spaces, such as a role name) and brackets to search for ranges. Hosts that match any of the search terms are displayed.
- Isilon is now supported as a source or destination service for HDFS replications.
- For CDH 5.7 and higher, if CDH_PYTHON is set by a Spark plug-in, PYSPARK_PYTHON is set to CDH_PYTHON in spark-env.sh. If you install a Python runtime parcel, such as the Anaconda parcel, Python Spark jobs run in both YARN client and YARN cluster modes are automatically configured by redeploying the Spark client configuration.
What's Changed in Cloudera Manager 5.5.0
- Removed -XX:-CMSConcurrentMTEnabled from the default JVM options. This setting makes the JVM run in single threaded mode. This was needed for Java 1.6_31 and lower but not for Java 1.6_32 or higher. Anybody using Java 1.6_31 or lower should upgrade to the latest recommended version of Java 1.7.
This change causes all roles to be stale after you upgrade to Cloudera Manager 5.5 and they are indicated as requiring restart in the Cloudera Manager Admin Console. However, as with any upgrade, this is a valid, functional Cloudera Manager state, and the cluster only needs to be restarted when you want the new configurations to take effect.
- HADOOP_USER_CLASSPATH_FIRST is now adhered to in Hadoop client configurations. After you upgrade Cloudera Manager, services display a client configuration redeployment required icon.
- For RHEL 7, the force_start, fast_*, clean_*, and hard_* commands on the server-scm-* services no longer work, because custom start, restart, and stop commands are not supported on systemd-based distributions. These have been replaced with *_next_* operations, which do not trigger an immediate operation but signal that the next invoked operation will be forced, fast, clean, or hard.
- The Cloudera EULA is now shown when using the Cloudera Manager Admin Console for the first time.
- The Home tab has been removed from the Cloudera Manager Admin Console navigation bar. You can return to the Home page Status tab by clicking the Cloudera Manager logo.
- All the icons have been refreshed to make them cleaner and easier to read.
- In 5.4.0, an externally assigned role was combined with a Cloudera Manager assigned role, and the user had the union of the role privileges. As a consequence, an external user could be assigned an administrator role in Cloudera Manager and would be an administrator regardless of the externally assigned role. Now only the externally assigned roles are respected. No roles can be assigned to an external user in Cloudera Manager, and any roles for an external user in Cloudera Manager are ignored.
As a result of this change, external users with previously-assigned Cloudera Manager roles will have their permissions modified depending on the LDAP group they belong to. To restore permissions for external users, configure the LDAP groups for these users by navigating to , and click to display the relevant properties.
- Cloudera Manager and CDH components support TLS 1.0, TLS 1.1, and TLS 1.2, but not SSL 3.0. SSL remains part of the TLS/SSL name for historical reasons. For the complete list of supported versions, see CDH and Cloudera Manager Supported Transport Layer Security Versions.
- The label on the Generate Credentials button has been changed to Generate Missing Credentials to better reflect the fact that it only creates Kerberos principals that are not present yet in Cloudera Manager.
- Cloudera Manager now downloads binaries from https://archive.cloudera.com instead of http://archive.cloudera.com.
- The embedded hbck feature has been removed from HBase monitoring for stability reasons.
- Increased the default heap sizes for Hive roles. On clusters with sufficient memory, newly created Hive roles have these values:
- HiveServer2 - 4 G heap, 512 M perm gen
- Hive Metastore - 8 G heap, 512 M perm gen
- Gateway - 2 G heap, 512 M perm gen
- By default, Oozie now purges eligible completed workflows and coordinator actions for long-running coordinator jobs.
- Oozie actions that omit the <job-tracker> and <name-node> elements (and the workflow does not define them in the <global> section) use the default values for the JobTracker, Resource Manager, and NameNode from Cloudera Manager in CDH 5.5 and higher.
- Increased the defaults for Oozie parameters:
- oozie.service.CallableQueueService.callable.concurrency - 10
- oozie.service.CallableQueueService.threads - 50
- Sqoop 2 is no longer among the default services created by any of the options in the installation wizard. You can add it through the Custom Services option in the installation wizard, or with the Add Service wizard after installation.
- For CDH 5.5.0 and higher the default values of the YARN properties mapreduce.[map|reduce].java.opts.max.heap and mapreduce.[map|reduce].memory.mb have been changed to 0, which tells YARN to automatically select a default. This helps avoid issues where either heap or memory.mb is updated, but not the other one (memory.mb should be ~30% higher than heap to allow for JVM overhead).
- The Host DNS Resolution Duration health test was removed. Its functionality is now covered in the Host DNS Resolution health test.
- The default Replication Strategy is now Dynamic.
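The relationship between the YARN heap and memory.mb properties noted above (memory.mb roughly 30% higher than the heap) can be illustrated with a little arithmetic. This is only a sketch for anyone who sets both values manually instead of leaving them at 0; the function names and the fixed 30% figure are illustrative assumptions and do not describe how YARN or Cloudera Manager chooses its automatic defaults.

```python
# Assumed overhead, taken from the "~30% higher" guidance in the notes above.
JVM_OVERHEAD_PERCENT = 30


def container_memory_mb(heap_mb: int, overhead_percent: int = JVM_OVERHEAD_PERCENT) -> int:
    """Return a memory.mb value about 30% above the given heap, rounded up."""
    scaled = heap_mb * (100 + overhead_percent)
    return (scaled + 99) // 100  # integer ceiling division by 100


def heap_mb_for_container(memory_mb: int, overhead_percent: int = JVM_OVERHEAD_PERCENT) -> int:
    """Return the largest heap that still leaves about 30% headroom in the container."""
    return (memory_mb * 100) // (100 + overhead_percent)


if __name__ == "__main__":
    heap = 1024
    print(f"heap={heap} MB -> memory.mb={container_memory_mb(heap)} MB")
```

Keeping both values derived from one input avoids the mismatch described above, where one property is updated and the other is forgotten.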
What's Changed in Cloudera Manager 5.4.1
HDFS Read Throughput Impala query monitoring property is misleading
The hbase_bytes_read_per_second and hdfs_bytes_read_per_second Impala query properties have been renamed to hbase_scanner_average_bytes_read_per_second and hdfs_scanner_average_bytes_read_per_second to more accurately reflect that these properties return the average throughput of the query's HBase and HDFS scanner threads respectively. The previous names and descriptions gave the impression that these properties were the query's total HBase and HDFS throughput, which was not accurate.
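Saved filters or scripts that still use the old property names need to be updated. The following is a minimal sketch of such a rewrite; the mapping comes from the paragraph above, while the helper name and the sample filter text are hypothetical and are not part of any Cloudera Manager API.

```python
import re

# Old -> new Impala query property names, as listed in the release note above.
RENAMED_PROPERTIES = {
    "hbase_bytes_read_per_second": "hbase_scanner_average_bytes_read_per_second",
    "hdfs_bytes_read_per_second": "hdfs_scanner_average_bytes_read_per_second",
}

# Match only whole identifiers so that already-renamed properties are left alone.
_OLD_NAME = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMED_PROPERTIES) + r")\b"
)


def migrate_filter(filter_text: str) -> str:
    """Replace old property names in a filter string with their new names."""
    return _OLD_NAME.sub(lambda match: RENAMED_PROPERTIES[match.group(1)], filter_text)


if __name__ == "__main__":
    # Hypothetical filter text, used only for illustration.
    print(migrate_filter("hdfs_bytes_read_per_second > 1000"))
```

Because the pattern matches whole identifiers, running the helper twice on the same text leaves the result unchanged.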
What's Changed in Cloudera Manager 5.4.0
- Cloudera Manager checks the specified version of CDH before an installation or upgrade to ensure that it is compatible with Cloudera Manager before proceeding. Specifically, for Cloudera Manager 5.4 that means no version of CDH newer than 5.4.x is supported (Cloudera Manager must be upgraded before upgrading to such a version of CDH). Cloudera Manager no longer shows these "too-new" versions of CDH. The 'latest' parcel repository URL will be replaced by the 'latest_supported' repository in the parcel configuration.
- The minimum Java heap size for the Activity Monitor, Host Monitor, and Service Monitor has been changed from 50 MB to 256 MB.
- Regenerating Kerberos principals will be denied if any roles that are using those principals are running. Stop those roles and then attempt to regenerate the principals.
- In previous versions of Cloudera Manager, the 'version' attribute in tsquery had integer values, for example, 4 for CDH4, 5 for CDH5, and -1 for Cloudera Manager. Starting in Cloudera Manager 5.4, the values for the 'version' attribute are in release string format, for example "cdh5.0.0".
- Hive
- hive.exec.reducers.max default value changed from 999 to 1099
- hive.exec.reducers.bytes.per.reducer default value changed from 1 GB to 64 MB
- The default heap size for the Hive CLI is increased to 1 GB.
- The property hive.log.explain.output is known to make Cloudera Manager Agents unstable in some specific circumstances, especially when Hive queries generate extremely large EXPLAIN outputs. Therefore, the property has been hidden from the Cloudera Manager configuration UI. The property can still be configured through advanced configuration snippets.
- Impala - The Impala Daemon now supports the Impala Maximum Log Files property which specifies the total number of log files per severity level that should be retained before they are deleted. By default, after upgrading to CDH 5.4 this property is set to 10, which means that Impala Daemons will only retain up to 10 log files for each severity level. Any additional files will be deleted.
- HBase - Moved three settings for HBase coprocessors from Main to Advanced category:
- 'Service Wide > HBase Coprocessor Abort on Error': moved to 'Service Wide > Advanced > HBase Coprocessor Abort on Error'
- 'Master Default Group > HBase Coprocessor Master Classes': moved to 'Master Default Group > Advanced > HBase Coprocessor Master Classes'
- 'RegionServer Default Group > HBase Coprocessor Region Classes': moved to 'RegionServer Default Group > Advanced > HBase Coprocessor Region Classes'
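The CDH compatibility rule in the first item of this list (Cloudera Manager 5.4 supports no CDH version newer than 5.4.x) can be sketched as a comparison of major and minor version numbers. This is only an illustration of the rule as stated; the function names are hypothetical, and the actual check performed by Cloudera Manager is not described in these notes.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a dotted version string such as '5.4.2'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def is_cdh_supported(cm_version: str, cdh_version: str) -> bool:
    """Return True when the CDH major.minor is not newer than Cloudera Manager's."""
    return major_minor(cdh_version) <= major_minor(cm_version)


if __name__ == "__main__":
    for cdh in ("5.3.2", "5.4.3", "5.5.0"):
        print(cdh, "supported by CM 5.4.0:", is_cdh_supported("5.4.0", cdh))
```

Only the major and minor components are compared, which matches the "5.4.x" wording above: any maintenance release within the same minor line is treated as compatible.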
What's Changed in Cloudera Manager 5.3.2
- Turning on the internal HBase canary (not to be confused with Cloudera Manager monitoring canary) is optional. On new clusters, it will not be enabled by default. Existing clusters will continue to run the canary until it is disabled from the HBase configuration page.
What's Changed in Cloudera Manager 5.3.0
- Cloudera Manager upgrade - If any commands are active when you upgrade, the server will fail to start after the upgrade. This includes commands run by a user as well as commands that Cloudera Manager triggers automatically, either in response to a state change or on a schedule.
What's Changed in Cloudera Manager 5.2.1
- The default value of the YARN yarn.nodemanager.recovery.dir property has changed from ${hadoop.tmp.dir}/yarn-nm-recovery to /var/lib/hadoop-yarn/yarn-nm-recovery.
What's Changed in Cloudera Manager 5.2.0
- Rolling upgrade - As a result of a recent change in the way DataNodes handle block deletions during a rolling upgrade (HDFS-5907), the Trash directory may grow unexpectedly while the upgrade is in progress. Deleted blocks are kept during upgrade in case you want to roll back. The blocks are cleaned up after you finalize the upgrade.
- Agent
- The hard_stop, hard_restart, and clean_restart commands now show a warning message about the impact of using these commands instead of performing the actions. To actually perform the actions, you use the hard_stop_confirmed, hard_restart_confirmed, and clean_restart_confirmed commands.
- The default supervisord port has changed from 9001 to 19001.
- YARN application attributes renamed: slot_millis to slots_millis and fallow_slot_millis to fallow_slots_millis
What's Changed in Cloudera Manager 5.1.0
- UI refresh for scalability
- Revised authorization privilege model in Sentry. See Privilege Model.
What's Changed in Cloudera Manager 5.0.0
- MapReduce now inherits topology from HDFS NameNode. Topology configuration for MapReduce JobTracker was removed. The configuration was redundant and the two parameters should always have been set to the same value.
- UI
- The Clusters tab no longer has Activities, Other, and Manage Resources sections.
What's Changed in Cloudera Manager 5.0.0 Beta 2
- Product
- Cloudera Backup and Disaster Recovery (BDR) is now included with Cloudera Enterprise.
- Cloudera Standard has been renamed to Cloudera Express.
- OS and packaging
- The name of the Cloudera Manager embedded database package has changed from cloudera-manager-server-db to cloudera-manager-server-db-2. For details, read the upgrade and install topics for your OS.
- Support for Ubuntu 10.04 and Debian 6.0 is deprecated.
- HDFS - Enabling High Availability automatically enables auto-failover, unlike in Cloudera Manager 4, where enabling auto-failover was a separate command.
- HBase
- In CDH 5 there is no HBase canary because HBase is now monitored by a watchdog process. In CDH 4, the HBase canary is still used.
- The RegionServer default heap size has been increased to 4GB.
- Monitoring
- Chart "Views" and actions related to views have been renamed to "Dashboard".
- The way attribute filters are displayed in the Impala queries and YARN applications screens has changed.
- The outdated configuration indicator on the Home, service, and role pages has a new graphic and now has a tooltip that displays whether a cluster refresh or restart is required. There is a new indicator for changes that require redeploying client configurations. You can click an indicator to go to the new Stale Configurations page to view and resolve the conditions that gave rise to the indicator.
- To match the naming convention of tsquery metrics, multiword Impala query and YARN application attribute names have changed from camel case to using an underscore separator. For example queryType has changed to query_type. For backward compatibility, camel case names are still supported.
- UI
- The main navigation bar in the Cloudera Manager Admin Console has been reorganized. The Services tab has been replaced by a Clusters tab that contains: links to individual services, which were previously under the Services tab; the Activities and Reports sections, which were removed from the main bar; and a new Manage Resources section, which contains links to the new resource pools and service pools features. The All Services page has been removed.
- The "Safety Valve" properties have been renamed "Advanced Configuration Snippet".
- The screen for specifying assignment of roles to hosts has been redesigned for improved scalability and usability.
- Misc
- The io.compression.codecs property has moved from MapReduce to HDFS.
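The renaming of Impala query and YARN application attributes from camel case to underscore-separated names, described in the Monitoring items above, follows a mechanical pattern. A minimal sketch of that conversion is shown here; the helper name is hypothetical, and only the queryType to query_type pair is taken from the notes, so other attribute names should be checked individually.

```python
import re

# Zero-width position between a lowercase letter or digit and an uppercase letter.
_WORD_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_underscore(name: str) -> str:
    """Convert a camel case attribute name to the underscore-separated form."""
    return _WORD_BOUNDARY.sub("_", name).lower()


if __name__ == "__main__":
    print(to_underscore("queryType"))
```

Names that already use underscores pass through unchanged, so the helper can be applied to a mix of old and new attribute names.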
What's Changed in Cloudera Manager 5.0.0 Beta 1
- When CDH 5 is installed, YARN is installed by default, rather than MapReduce, and is the default execution environment. MapReduce is deprecated in CDH 5 but is fully supported for backward compatibility through CDH 5. In CDH 4, MapReduce is still the default.
- The setting for yarn.scheduler.maximum-allocation-mb has been increased to a default of 64GB.
- The minimum heap size for the Solr service has been increased to 200MB (from 50MB previously) to enable it to better handle collection creation.