What's New in CDH 5.0.0
The following topics describe new features introduced in CDH 5.0.0.
Apache Hadoop
HDFS
- HDFS-5339 - WebHDFS URI does not accept logical nameservices when security is enabled.
- HDFS-5898 - Allow NFS gateway to login/relogin from its Kerberos keytab.
- HDFS-5921 - "Browse filesystem" on the Namenode UI doesn't work if any directory has the sticky bit set.
- HDFS and Hive replication between different Kerberos realms now works.
- HDFS-5922 - DataNode heartbeat thread can get stuck in a tight loop.
MapReduce & YARN
- FairScheduler supports moving running applications between queues.
- Several critical fixes to stabilize ResourceManager HA: Web UI, unmanaged ApplicationMasters, and secure-cluster support.
- Support for large values of mapreduce.task.io.sort.mb.
- JobHistory Server has information on failed MapReduce jobs.
Apache HBase
- HBASE-10436 - Restore RegionServer lists removed from HBase 0.96.0 JMX.
Many of the metrics exposed in CDH 4/0.94 were removed when metrics were refactored in CDH 5/0.96. This patch restores the lists of live and dead RegionServers. In 0.94 this information was part of a large nested structure, shown below, which included the RegionServer lists and metrics from each region.
{
  "name" : "hadoop:service=Master,name=Master",
  "modelerType" : "org.apache.hadoop.hbase.master.MXBeanImpl",
  "ZookeeperQuorum" : "localhost:2181",
  ....
  "RegionsInTransition" : [ ],
  "RegionServers" : [ {
    "key" : "localhost,48346,1390857257246",
    "value" : {
      "load" : 2,
      ....
CDH 5 Beta 1 and Beta 2 did not contain this list; they displayed only the counts of live and dead RegionServers. As of CDH 5.0.0, the list is presented in a semicolon-separated field as follows:
{
  "name" : "Hadoop:service=HBase,name=Master,sub=Server",
  "modelerType" : "Master,sub=Server",
  "tag.Context" : "master",
  "tag.liveRegionServers" : "localhost,56196,1391992019130",
  "tag.deadRegionServers" : "localhost,40010,1391035309673;localhost,41408,1391990380724;localhost,38682,1390950017735",
  ...
}
- Assorted usability and compatibility improvements as well as improvements to exporting snapshots.
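Monitoring scripts that consumed the nested 0.94 structure need to split the new tag.liveRegionServers and tag.deadRegionServers fields themselves. The following is a minimal Python sketch, assuming the bean has already been fetched and decoded into a dictionary; the function name parse_region_servers and the output keys host, port, and start_code are illustrative choices, not part of HBase. Each entry is read as host, port, and start code, as in the sample values above.

```python
import json


def parse_region_servers(bean):
    """Split the semicolon-separated RegionServer tags of a Master JMX bean.

    `bean` is a dict decoded from the JMX JSON output. The output keys
    (host, port, start_code) are illustrative names, not HBase identifiers.
    """
    def split_field(value):
        servers = []
        # A missing or empty field yields an empty list.
        for entry in filter(None, (value or "").split(";")):
            host, port, start_code = entry.strip().split(",")
            servers.append(
                {"host": host, "port": int(port), "start_code": int(start_code)}
            )
        return servers

    return {
        "live": split_field(bean.get("tag.liveRegionServers")),
        "dead": split_field(bean.get("tag.deadRegionServers")),
    }


# Sample bean built from the CDH 5.0.0 example above (elided fields omitted).
sample = json.loads("""
{
  "name": "Hadoop:service=HBase,name=Master,sub=Server",
  "modelerType": "Master,sub=Server",
  "tag.Context": "master",
  "tag.liveRegionServers": "localhost,56196,1391992019130",
  "tag.deadRegionServers": "localhost,40010,1391035309673;localhost,41408,1391990380724;localhost,38682,1390950017735"
}
""")

servers = parse_region_servers(sample)
print(len(servers["live"]), "live,", len(servers["dead"]), "dead")
```

Unlike the 0.94 structure, these fields carry only server identities, so per-region metrics have to be read from other beans.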
Apache Flume
- The HBase Sink now supports coalescing multiple Increment RPCs into one (FLUME-2338).
- File Channel Write timeout has been removed and the configuration parameter is now ignored (FLUME-2307).
- Syslog UDP source can now accept larger messages (FLUME-2130).
- AsyncHBase Sink is now fully functional (FLUME-2334).
- Use standard lookup to find queue/topic in JMS Source (FLUME-2311).
- Deadlock fixed in Dataset sink (FLUME-2320).
- FileChannel Dual Checkpoint Backup Thread is now released on application stop (FLUME-2328).
- Spool Dir source now checks the interrupt flag before writing to the channel (FLUME-2283).
- Morphline sink increments eventDrainAttemptCount when it takes an event from the channel (FLUME-2323).
- BucketWriter is now permanently closed only on idle and roll timeouts (FLUME-2325).
- BucketWriter#close now cancels idleFuture (FLUME-2305).
Apache Oozie
<fs name="archive-files">
  <move source="hdfs://namenode/output/*" target="hdfs://namenode/archive" />
  <ok to="next"/>
  <error to="fail"/>
</fs>
By default, up to 1000 files can be matched; you can change this limit by setting the oozie.action.fs.glob.max parameter.
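To make the effect of the limit concrete, the following Python sketch applies a capped glob match to a list of file names. It is an illustration only and does not reproduce Oozie's implementation: the function match_glob is hypothetical, and treating a match count above the limit as an error is an assumption rather than documented behavior.

```python
import fnmatch

# Default stated above for oozie.action.fs.glob.max.
DEFAULT_GLOB_MAX = 1000


def match_glob(names, pattern, glob_max=DEFAULT_GLOB_MAX):
    """Return the file names matching `pattern`, enforcing a match limit.

    Hypothetical helper: raising an error when the limit is exceeded is an
    assumption made for this sketch.
    """
    matched = [name for name in names if fnmatch.fnmatchcase(name, pattern)]
    if len(matched) > glob_max:
        raise ValueError(
            f"pattern {pattern!r} matched {len(matched)} files, limit is {glob_max}"
        )
    return matched


# 1200 hypothetical output files in one directory.
names = [f"part-{i:05d}" for i in range(1200)]

# With the limit raised, all files are matched.
moved = match_glob(names, "part-*", glob_max=2000)
print(len(moved))
```

With the default limit, the same call would be rejected in this sketch, which is why a job that writes many output files may need a larger oozie.action.fs.glob.max value.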
Cloudera Search
- Cloudera Search includes a version of Kite 0.10.0, which includes backports of all fixes and features in Kite 0.12.0. For additional information on Kite, see: