Issues Fixed in CDH 5 Beta Releases

The following topics describe issues fixed in CDH 5 Beta 2, after being discovered in CDH 5 Beta 1. You can also review What's New In CDH 5 Beta Releases or Known Issues in CDH 5.

Issues Fixed in CDH 5 Beta 2

CDH 5 Beta 2 fixes the following issues, organized by component.

Apache Hadoop

MapReduce

ResourceManager High Availability does not work on secure clusters

If JobTrackers in a High Availability configuration are shut down, migrated to new hosts, and then restarted, no JobTracker becomes active. The logs show a Mismatched address exception.

Bug: None

Workaround: None.

Default port conflicts

By default, the Shuffle Handler (which runs inside the YARN NodeManager), the REST server, and many third-party applications all use port 8080. Deploying more than one of them on the same host without reconfiguring the default port results in a conflict.

Bug: None

Workaround: Make sure at most one service uses port 8080. To reconfigure the REST server, follow these instructions. To change the default port for the Shuffle Handler, set the value of mapreduce.shuffle.port in mapred-site.xml to an unused port.
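
Before starting a second service on a host, it can help to confirm whether port 8080 is already taken. The following is a minimal Python sketch, not part of CDH; the helper name port_in_use and the choice to probe 127.0.0.1 are assumptions made for illustration.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: it only detects listeners reachable at `host`,
    so run it on the machine where the services are deployed.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling port_in_use(8080) on the host in question indicates whether another service already listens there; if it does, choose a different value for mapreduce.shuffle.port or reconfigure the other service.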

JobTracker memory leak

The JobTracker has a memory leak caused by subtleties in the way UserGroupInformation interacts with the file-system cache. The number of cached file system objects can grow without bound.

Bug: MAPREDUCE-5508

Workaround: Set keep.failed.task.files to true. This sidesteps the memory leak, but job staging directories must then be cleaned out manually.
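
Hadoop configuration files store each setting as a property element with name and value children. As a rough sketch of applying the workaround programmatically, the Python helper below sets a property in such a document. The function name set_property is hypothetical, and placing keep.failed.task.files in mapred-site.xml is an assumption; the text above does not say which file to edit.

```python
import xml.etree.ElementTree as ET


def set_property(config_xml, name, value):
    """Return config_xml with the named property set, adding it if absent."""
    root = ET.fromstring(config_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No existing entry was found, so append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative input only; a real mapred-site.xml contains other properties.
updated = set_property("<configuration></configuration>",
                       "keep.failed.task.files", "true")
```

The helper rewrites an existing entry rather than appending a duplicate, so it can be run repeatedly on the same file content.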

Hue

Running a Hive Beeswax metastore on the same host as the Hue server will result in Simple Authentication and Security Layer (SASL) authentication failures on a Kerberos-enabled cluster

Bug: None

Workaround: The simple workaround is to run the metastore server remotely on a different host and make sure all Hive and Hue configurations refer to it. A more complex workaround is to adjust the network configuration so that reverse DNS resolves the host's address to its fully qualified domain name (FQDN) rather than to localhost.
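
To check the second workaround, one option is to look at what reverse DNS returns for the host's address. The Python sketch below uses the standard socket.gethostbyaddr call; the helper names and the rule used to decide what counts as an FQDN (a dotted name that is not a localhost alias) are assumptions for illustration, not a Hue or Kerberos requirement taken from the text.

```python
import socket


def looks_like_fqdn(name):
    """Return True if name is dotted and is not a localhost alias."""
    name = name.rstrip(".").lower()
    if name == "localhost" or name.startswith("localhost."):
        return False
    return "." in name


def reverse_lookup(ip_address):
    """Return the primary host name that reverse DNS reports for ip_address."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    return hostname
```

On the Hue host, looks_like_fqdn(reverse_lookup(address)) should be true for the address that the metastore and Hue use to reach each other; a result of localhost points to the misconfiguration described above.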

The Pig shell does not work when NameNode uses a wildcard address

The Pig shell does not work from Hue if you use a wildcard for the NameNode's RPC or HTTP bind address. For example, dfs.namenode.http-address must be a real, routable address and port, not 0.0.0.0:<port>.

Bug: HUE-1060

Workaround: Use a real, routable address and port, not 0.0.0.0:<port>, for the NameNode; or use the Pig application directly, rather than from Hue.
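
A configured value can be tested for the wildcard case with Python's standard ipaddress module. This is a small sketch that assumes the address is written as host:port; the function name is_wildcard_bind and the sample host names and port numbers in the comments are illustrative, not values taken from a real cluster.

```python
import ipaddress


def is_wildcard_bind(address):
    """Return True if the host part of 'host:port' is 0.0.0.0 or ::."""
    host, _sep, _port = address.rpartition(":")
    host = host.strip("[]")  # allow bracketed IPv6 such as [::]:8020
    try:
        return ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Host names and malformed values are not wildcard IP literals.
        return False


# is_wildcard_bind("0.0.0.0:50070")              -> wildcard, rejected by Hue's Pig shell
# is_wildcard_bind("namenode.example.com:50070") -> not a wildcard
```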

Apache Sqoop

Oozie and Sqoop 2 may need additional configuration to work with YARN

In CDH 5, MapReduce 2.0 (MRv2), which runs on YARN, is recommended over the Hadoop 0.20-based MRv1. Unless you are using Cloudera Manager, however, the default configuration of Oozie and Sqoop 2 in CDH 5 Beta 2 may not reflect this.

Bug: None

Workaround: Check the value of CATALINA_BASE in /etc/oozie/conf/oozie-env.sh (if you are running an Oozie server) and in /etc/default/sqoop2-server (if you are running a Sqoop 2 server). If you invoke /usr/bin/sqoop2-server directly instead of using the service init scripts, also make sure that CATALINA_BASE is set correctly in your environment. For Oozie, CATALINA_BASE should be set to /usr/lib/oozie/oozie-server for YARN, or /usr/lib/oozie/oozie-server-0.20 for MRv1. For Sqoop 2, CATALINA_BASE should be set to /usr/lib/sqoop2/sqoop-server for YARN, or /usr/lib/sqoop2/sqoop-server-0.20 for MRv1.
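
The check can be sketched in Python as follows. The expected paths are the four listed above; the helper names, the "yarn" and "mrv1" keys, and the assumption that the file assigns CATALINA_BASE as a plain literal (optionally with export or quotes, not through shell expansion) are illustrative choices.

```python
import re

# Expected values, copied from the workaround text.
EXPECTED_CATALINA_BASE = {
    ("oozie", "yarn"): "/usr/lib/oozie/oozie-server",
    ("oozie", "mrv1"): "/usr/lib/oozie/oozie-server-0.20",
    ("sqoop2", "yarn"): "/usr/lib/sqoop2/sqoop-server",
    ("sqoop2", "mrv1"): "/usr/lib/sqoop2/sqoop-server-0.20",
}

# Matches uncommented lines such as: export CATALINA_BASE=/some/path
_ASSIGNMENT = re.compile(r'^\s*(?:export\s+)?CATALINA_BASE=["\']?([^"\'\s#]+)')


def read_catalina_base(env_text):
    """Return the last literal CATALINA_BASE assignment in env_text, or None."""
    value = None
    for line in env_text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match:
            value = match.group(1)
    return value


def catalina_base_ok(env_text, component, framework):
    """Compare the configured value with the expected one for the framework."""
    expected = EXPECTED_CATALINA_BASE[(component, framework)]
    return read_catalina_base(env_text) == expected
```

For example, passing the contents of /etc/oozie/conf/oozie-env.sh with component "oozie" and framework "yarn" reports whether the Oozie server is set up for YARN.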

Apache Sentry (incubating)

Sentry allows unauthorized access to a directory whose name includes the scratch directory name as a prefix

As an example, if the scratch directory path is /tmp/hive, and you create a directory /tmp/hive-data, Sentry allows unauthorized read/write access to /tmp/hive-data.

Bug: None

Workaround: For external table or data export locations, do not use a pathname that includes the scratch directory name as a prefix. For example, if the scratch directory is /tmp/hive, do not locate external tables or exported data in /tmp/hive-data or in any directory whose path uses "/tmp/hive-" as a prefix.
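
The behavior described is what a plain string-prefix comparison would produce. The Python sketch below is not Sentry's code; it only contrasts a string-prefix test with a path-component test to show why /tmp/hive-data can be mistaken for a location under /tmp/hive. Both function names are hypothetical, and paths are assumed to be absolute.

```python
import posixpath


def naive_is_under(path, root):
    """String-prefix test: wrongly treats /tmp/hive-data as under /tmp/hive."""
    return path.startswith(root)


def is_under(path, root):
    """Component-wise test: true only if root is path or one of its ancestors."""
    path = posixpath.normpath(path)
    root = posixpath.normpath(root)
    return posixpath.commonpath([path, root]) == root


# naive_is_under("/tmp/hive-data", "/tmp/hive") is true, which is the mistake;
# is_under("/tmp/hive-data", "/tmp/hive") is false.
```

When choosing a location for external tables or exported data, a candidate path for which naive_is_under returns true against the scratch directory is one to avoid.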

Apache Oozie

Oozie Hive action against HiveServer2 fails on a secure cluster

Workaround: None.