Known Issues and Workarounds in Cloudera Manager 5
The following sections describe the current known issues in Cloudera Manager 5.
NameNode Incorrectly Reports Missing Blocks During Rolling Upgrade
During a rolling upgrade to any of the CDH releases listed below, the NameNode may report missing blocks after rolling multiple DataNodes. This is caused by a race condition with block reporting between the DataNode and the NameNode. No permanent data loss occurs, but data can be unavailable for up to six hours before the problem corrects itself.
Releases affected: CDH 5.0.6, 5.1.5, 5.2.5, 5.3.3, 5.4.1, 5.4.2.
Releases containing the fix: CDH 5.2.6, 5.3.4, 5.4.3.
- To avoid the problem: Cloudera advises skipping the affected releases and installing a release that contains the fix. For example, do not upgrade to CDH 5.4.2; upgrade to CDH 5.4.3 instead.
- If you have already completed an upgrade to an affected release, or are installing a new cluster: You can continue to run the release, or upgrade to a release that is not affected, as you choose. If you choose to upgrade to an unaffected release, you must first upgrade to a version of Cloudera Manager (5.2.x, 5.3.x, or 5.4.x) that supports the respective fixed CDH release.
Using ext3 for server directories can easily hit the inode limit
Using the ext3 filesystem for the Cloudera Manager command storage directory can exceed ext3's limit of 32,000 subdirectories per directory.
Workaround: Either decrease the value of the Command Eviction Age property so that the directories are more aggressively cleaned up, or migrate to the ext4 filesystem.
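The quantity that matters is the number of immediate subdirectories (one per stored command). A minimal sketch of how to check it; a temporary directory stands in here for the real command storage directory, whose actual path is given by the Command Storage Directory property:

```shell
# Count the immediate subdirectories of a directory -- the quantity that
# ext3 caps at about 32,000 per directory. A temp dir stands in for the
# real Cloudera Manager command storage directory.
dir=$(mktemp -d)
mkdir "$dir/cmd1" "$dir/cmd2" "$dir/cmd3"
count=$(find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
echo "subdirectories: $count"   # prints: subdirectories: 3
rm -rf "$dir"
```

Running the same `find | wc -l` pipeline against the real command storage directory shows how close it is to the ext3 ceiling.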
Changes to the yarn.nodemanager.remote-app-log-dir property are not included in the JobHistory Server yarn-site.xml file
When "Remote App Log Directory" is changed in YARN configuration, the property yarn.nodemanager.remote-app-log-dir is not included in the JobHistory Server's yarn-site.xml file.
Workaround: Add the following to the JobHistory Server's advanced configuration snippet for yarn-site.xml, substituting the actual log directory:
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/path/to/logs</value>
</property>
Host recommissioning and decommissioning should occur independently
In large clusters, when problems appear with a host or role, administrators may choose to decommission the host or role to fix it and then recommission the host or role to put it back in production. Decommissioning, especially host decommissioning, is slow, hence the importance of parallelization, so that host recommissioning can be initiated before decommissioning is done.
Workaround: None.
search_bind_authentication for Hue is not included in .ini file
When search_bind_authentication is set to false, Cloudera Manager does not include it in hue.ini.
[desktop]
  [[ldap]]
    search_bind_authentication=false
Hive CLI does not work in CDH 4 when "Bypass Hive Metastore Server" is enabled
Hive CLI does not work in CDH 4 when "Bypass Hive Metastore Server" is enabled.
Workaround: Configure Hive and disable the "Bypass Hive Metastore Server" option.
Alternatively, you can take an approach that prevents the "Hive Auxiliary JARs Directory" from working but enables basic Hive commands. Add the following to "Gateway Client Environment Advanced Configuration Snippet for hive-env.sh (Safety Valve)," then re-deploy the Hive client configuration:
HIVE_AUX_JARS_PATH=""
AUX_CLASSPATH=/usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:$(find /usr/share/cmf/lib/postgresql-jdbc.jar 2> /dev/null | tail -n 1)
The mapreduce.jobhistory.admin.address property is not set in the gateway configurations
Clients of the JobHistory server's administrative interface, such as the mapred hsadmin tool, may fail to connect to the server when run on hosts other than the one where the JobHistory server is running.
Workaround: Add the following advanced configuration snippet on the affected clients, replacing <JOBHISTORY_SERVER_HOST> with the JobHistory server host name:
<property>
  <name>mapreduce.jobhistory.admin.address</name>
  <value><JOBHISTORY_SERVER_HOST>:10033</value>
</property>
Created pools are not preserved when Dynamic Resource Pools page is used to configure YARN or Impala
If the Dynamic Resource Pools page is used to configure YARN and/or Impala services in a cluster, it is possible to specify pool placement rules that create a pool if one does not already exist. If changes are made to the configuration using this page, pools created on demand as a result of such rules are not preserved across the configuration change.
Workaround: Submit the YARN application or Impala query as before, and the pool will be created on demand once again.
User should be prompted to add the AMON role when adding MapReduce to a CDH 5 cluster
When the MapReduce service is added to a CDH 5 cluster, the user is not asked to add the AMON role. Then, an error displays when the user tries to view MapReduce activities.
Workaround: Manually add the AMON role after adding the MapReduce service.
Enterprise license expiration alert not displayed until Cloudera Manager Server is restarted
When an enterprise license expires, the expiration notification banner is not displayed until the Cloudera Manager Server has been restarted (although the expired enterprise features stop working immediately upon license expiration, as expected).
Workaround: None.
Erroneous warning displayed on the HBase configuration page on CDH 4.1 in Cloudera Manager 5.0.0
An erroneous "Failed parameter validation" warning is displayed on the HBase configuration page for CDH 4.1 in Cloudera Manager 5.0.0.
Severity: Low
Workaround: Use CDH 4.2 or higher, or ignore the warning.
Cluster installation with CDH 4.1 and Impala fails
In Cloudera Manager 5.0, installing a new cluster through the wizard with CDH 4.1 and Impala fails with the following error message, "dfs.client.use.legacy.blockreader.local is not enabled."
Workaround: Do one of the following:
- Use CDH 4.2 or higher, or
- Install all desired services except Impala in your initial cluster setup. From the home page, use the dropdown menu near the cluster name and select Configure CDH Version. Confirm the version, then add Impala.
HDFS NFS gateway works only on RHEL and similar systems
Because of a bug in native versions of portmap/rpcbind, the HDFS NFS gateway does not work on SLES, Ubuntu, or Debian systems. It does work on supported versions of RHEL-compatible systems on which rpcbind-0.2.0-10.el6 or later is installed.
Bug: 731542 (Red Hat), 823364 (SLES), 594880 (Debian)
Severity: High
Workaround:
- On Red Hat and similar systems, make sure rpcbind-0.2.0-10.el6 or later is installed.
- On SLES, Debian, and Ubuntu systems, you can use the gateway by running rpcbind in insecure mode, using the -i option, but keep in mind that this allows anyone from a remote host to bind to the portmap.
The Spark Upload Jar command fails in a secure cluster
The Spark Upload Jar command fails in a secure cluster.
Workaround: To run Spark on YARN, manually upload the Spark assembly jar to HDFS /user/spark/share/lib. The Spark assembly jar is located on the local filesystem, typically in /usr/lib/spark/assembly/lib or /opt/cloudera/parcels/CDH/lib/spark/assembly/lib.
Configurations for decommissioned roles not migrated from MapReduce to YARN
When the Import MapReduce Configuration wizard is used to import MapReduce configurations to YARN, decommissioned roles in the MapReduce service do not cause the corresponding imported roles to be marked as decommissioned in YARN.
Workaround: Delete or decommission the roles in YARN after running the import.
The HDFS command Roll Edits does not work in the UI when HDFS is federated
The HDFS command Roll Edits does not work in the Cloudera Manager UI when HDFS is federated because the command doesn't know which nameservice to use.
Workaround: Use the API, not the Cloudera Manager UI, to execute the Roll Edits command.
Cloudera Manager reports a confusing version number if you have oozie-client, but not oozie installed on a CDH 4.4 node
In CDH versions before 4.4, the metadata identifying Oozie was placed in the client package rather than the server package. Consequently, if the client package is not installed but the server is, Cloudera Manager will report that Oozie is present but as coming from CDH 3 instead of CDH 4.
Workaround: Either install the oozie-client package, or upgrade to at least CDH 4.4. Parcel-based installations are unaffected.
The command history has an option to select the number of commands, but doesn't always return the number you request
Workaround: None.
Hue doesn't support YARN ResourceManager High Availability
- Go to the Hue service.
- Select .
- Click .
- In the Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini field, add the following:
[hadoop]
  [[yarn_clusters]]
    [[[default]]]
      resourcemanager_host=<hostname of active ResourceManager>
      resourcemanager_api_url=http://<hostname of active ResourceManager>:<web port of active ResourceManager>
      proxy_api_url=http://<hostname of active ResourceManager>:<web port of active ResourceManager>
The default web port of the ResourceManager is 8088.
- Click Save Changes to have these configurations take effect.
- Restart the Hue service.
Cloudera Manager doesn't work with CDH 5.0.0 Beta 1
When you upgrade from Cloudera Manager 5.0.0 Beta 1 with CDH 5.0.0 Beta 1 to Cloudera Manager 5.0.0 Beta 2, Cloudera Manager won't work with CDH 5.0.0 Beta 1 and there's no notification of that fact.
Workaround: None. Do a new installation of CDH 5.0.0 Beta 2.
On CDH 4.1 secure clusters managed by Cloudera Manager 4.8.1 and higher, the Impala Catalog server needs advanced configuration snippet update
Impala queries fail on CDH 4.1 when Hive "Bypass Hive Metastore Server" option is selected.
Workaround: Add the following to Impala catalog server advanced configuration snippet for hive-site.xml, replacing <Hive_Metastore_Server_Host> with the host name of your Hive Metastore Server:
<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://<Hive_Metastore_Server_Host>:9083</value>
</property>
Rolling Upgrade to CDH 5 is not supported.
Rolling upgrade between CDH 4 and CDH 5 is not supported. Incompatibilities between the major versions mean rolling restarts are not possible. In addition, rolling upgrade is not supported from CDH 5.0.0 Beta 1 to any later release, and may not be supported between any future beta versions of CDH 5 and the General Availability release of CDH 5.
Workaround: None.
Error reading .zip file created with the Collect Diagnostic Data command.
After collecting diagnostic data and using the Download Diagnostic Data button to download the created zip file to the local system, the zip file cannot be opened using the Firefox browser on a Macintosh. This is because the zip file is created as a Zip64 file, and the unzip utility included with Macs does not support Zip64. The unzip utility must be version 6.0 or later; you can determine the installed version with unzip -v.
Workaround: Update the unzip utility to a version that supports Zip64.
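The first line of `unzip -v` output carries the version number, which can be compared against 6.0. A small illustration; the helper function below is hypothetical, not part of any Cloudera or Info-ZIP tooling:

```shell
# Hypothetical helper: decide whether a given unzip version string
# (e.g. "6.00" taken from the first line of `unzip -v`) is new enough
# to read Zip64 archives (6.0 or later).
supports_zip64() {
  awk -v v="$1" 'BEGIN { if (v + 0 >= 6.0) print "yes"; else print "no" }'
}
supports_zip64 "6.00"   # prints: yes
supports_zip64 "5.52"   # prints: no
```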
Enabling wildcarding in a secure environment causes NameNode to fail to start.
In a secure cluster, you cannot use a wildcard for the NameNode's RPC or HTTP bind address, or the NameNode will fail to start. For example, dfs.namenode.http-address must be a real, routable address and port, not 0.0.0.0:<port>. In Cloudera Manager, the "Bind NameNode to Wildcard Address" property must not be enabled. This affects you only if you are running a secure cluster and your NameNode needs to bind to multiple local addresses.
Bug: HDFS-4448
Severity: Medium
Workaround: Disable the "Bind NameNode to Wildcard Address" property found under the Configuration tab for the NameNode role group.
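For example, with wildcarding disabled, the NameNode HTTP address resolves to a concrete host and port. The hostname below is a placeholder; 50070 is the default NameNode web UI port in these releases:

```xml
<property>
  <name>dfs.namenode.http-address</name>
  <value>namenode1.example.com:50070</value>
</property>
```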
After JobTracker failover, complete jobs from the previous active JobTracker are not visible.
When a JobTracker failover occurs and a new JobTracker becomes active, the new JobTracker UI does not show the completed jobs from the previously active JobTracker (that is now the standby JobTracker). For these jobs the "Job Details" link does not work.
Severity: Medium
Workaround: None.
After JobTracker failover, information about rerun jobs is not updated in Activity Monitor.
When a JobTracker failover occurs while there are running jobs, the jobs are restarted by the new active JobTracker by default. For the restarted jobs, the Activity Monitor will not update the following:
- The start time of the restarted job will remain the start time of the original job.
- Any Map or Reduce task that had finished before the failure happened will not be updated with information about the corresponding task that was rerun by the new active JobTracker.
Severity: Medium
Workaround: None.
Installing on AWS, you must use private EC2 hostnames.
When installing on an AWS instance, and adding hosts using their public names, the installation will fail when the hosts fail to heartbeat.
Severity: Medium
Workaround:
- Use the Back button in the wizard to return to the original screen, where it prompts for a license.
- Rerun the wizard, but choose "Use existing hosts" instead of searching for hosts. Those hosts now show up with their internal EC2 names.
- Continue through the wizard; the installation should succeed.
After removing and then re-adding a service, the alternatives settings are incorrect.
After deleting a service, the alternatives settings are not cleaned up. If you then re-add the service, it is given a new instance name and a new set of configuration settings is added. However, because both the new and old (deleted) instances have the same alternatives priority, the original one will be used rather than the newer one.
Severity: Medium
Workaround: The simplest way to fix this is:
- Go to the Configuration tab for the new service instance in Cloudera Manager
- Search for "alternatives".
- Raise the priority value and save the setting.
- Redeploy the client configuration.
Cloudera Manager does not support encrypted shuffle.
Encrypted shuffle was introduced in CDH 4.1, but it is not currently possible to enable it through Cloudera Manager.
Severity: Medium
Workaround: None.
If HDFS uses Quorum-based Storage without HA enabled, the SecondaryNameNode cannot checkpoint.
If HDFS is set up in non-HA mode, but with Quorum-based storage configured, the dfs.namenode.edits.dir is automatically configured to the Quorum-based Storage URI. However, the SecondaryNameNode cannot currently read the edits from a Quorum-based Storage URI, and will be unable to do a checkpoint.
Severity: Medium
Workaround: Add the dfs.namenode.edits.dir property to the NameNode's advanced configuration snippet, setting its value to both the Quorum-based Storage URI and a local directory, and restart the NameNode. For example:
<property>
  <name>dfs.namenode.edits.dir</name>
  <value>qjournal://jn1HostName:8485;jn2HostName:8485;jn3HostName:8485/journalhdfs1,file:///dfs/edits</value>
</property>
Changing the rack configuration may temporarily cause mis-replicated blocks to be reported.
A rack re-configuration will cause HDFS to report mis-replicated blocks until HDFS rebalances the system, which may take some time. This is a normal side-effect of changing the configuration.
Severity: Low
Workaround: None.
Starting HDFS with HA and Automatic Failover enabled, one of the NameNodes might not start.
When starting an HDFS service with High Availability and Automatic Failover enabled, one of the NameNodes might not start up.
Severity: Low
Workaround: Start the NameNode that failed after the remaining HDFS roles have started up.
Cannot use '/' as a mount point with a Federated HDFS Nameservice.
A Federated HDFS Service doesn't support nested mount points, so it is impossible to mount anything at '/'. Because of this issue, the root directory will always be read-only, and any client application that requires a writeable root directory will fail.
Severity: Low
Workaround:
- In the CDH 4 HDFS Service > Configuration tab of the Cloudera Manager Admin Console, search for "nameservice".
- In the Mountpoints field, change the mount point from "/" to a list of mount points that are in the namespace that the Nameservice will manage. (You can enter this as a comma-separated list - for example, "/hbase, /tmp, /user" - or by clicking the plus icon to add each mount point in its own field.) You can determine the list of mount points by running the command hadoop fs -ls / from the CLI on the NameNode host.
Historical disk usage reports do not work with federated HDFS.
Severity: Low
Workaround: None.
(CDH 4 only) Activity monitoring does not work on YARN activities.
Activity monitoring is not supported for YARN in CDH 4.
Severity: Low
Workaround: None.
HDFS monitoring configuration applies to all Nameservices
The monitoring configurations at the HDFS level apply to all Nameservices. So, if there are two federated Nameservices, it's not possible to disable a check on one but not the other. Likewise, it's not possible to have different thresholds for the two Nameservices.
Severity: Low
Workaround: None.
Supported and Unsupported Replication Scenarios and Limitations
Restoring snapshot of a file to an empty directory does not overwrite the directory
Restoring the snapshot of an HDFS file to an HDFS path that is an empty HDFS directory (using the Restore As action) results in the restored file being placed inside the empty HDFS directory rather than overwriting that directory.
Workaround: None.
HDFS Snapshot appears to fail if policy specifies duplicate directories.
In an HDFS snapshot policy, if a directory is specified more than once, the snapshot appears to fail with an error message on the Snapshot page. However, in the HDFS Browser, the snapshot is shown as having been created successfully.
Severity: Low
Workaround: Remove the duplicate directory specification from the policy.
Hive replication fails if "Force Overwrite" is not set.
The Force Overwrite option, if checked, forces overwriting data in the target metastore if there are incompatible changes detected. For example, if the target metastore was modified and a new partition was added to a table, this option would force deletion of that partition, overwriting the table with the version found on the source. If the Force Overwrite option is not set, recurring replications may fail.
Severity: Medium
Workaround: Set the Force Overwrite option.
During HDFS replication, tasks may fail due to DataNode timeouts.