Hadoop Common/HDFS 2.6.0
HDP 2.2.9 provides Apache Hadoop Core 2.6.0 and the following additional Apache patches:
HADOOP-10728: Metrics system for Windows Azure Storage Filesystem.
HADOOP-10809: hadoop-azure: page blob support.
HADOOP-10839: Add unregisterSource() to MetricsSystem API.
HADOOP-11000: HAServiceProtocol's health state is incorrectly transitioned to SERVICE_NOT_RESPONDING.
HADOOP-11032: Replace use of Guava's Stopwatch with Hadoop's StopWatch.
HADOOP-11188: hadoop-azure: automatically expand page blobs when they become full.
HADOOP-11291: Log the cause of SASL connection failures.
HADOOP-11321: copyToLocal cannot save a file to an SMB share unless the user has Full Control permissions.
HADOOP-11333: Fix deadlock in DomainSocketWatcher when the notification pipe is full.
HADOOP-11390: Metrics 2 ganglia provider to include hostname in unresolved address problems.
HADOOP-11441: hadoop-azure: Change the scope of a few methods to public.
HADOOP-11442: hadoop-azure: Create test jar.
HADOOP-11490: Expose truncate API via FileSystem and shell command.
HADOOP-11509: Change parsing sequence in GenericOptionsParser to parse -D parameters before -files.
HADOOP-11595: Add default implementation for AbstractFileSystem#truncate.
HADOOP-11642: Upgrade azure SDK version from 0.6.0 to 2.0.0.
HADOOP-11648: Set DomainSocketWatcher thread name explicitly.
HADOOP-11685: StorageException complaining "no lease ID" during HBase distributed log splitting.
HADOOP-11730: Regression: s3n read failure recovery broken.
HADOOP-11918: Listing an empty s3a root directory throws FileNotFound.
HADOOP-11960: Enable Azure-Storage Client Side logging.
HADOOP-12089: StorageException complaining "no lease ID" when updating FolderLastModifiedTime in WASB.
HADOOP-12186: ActiveStandbyElector shouldn't call monitorLockNodeAsync multiple times.
HADOOP-12239: StorageException complaining "no lease ID" when updating FolderLastModifiedTime in WASB.
HADOOP-12318: Expose underlying LDAP exceptions in SaslPlainServer.
HADOOP-12324: Better exception reporting in SaslPlainServer.
HADOOP-12325: RPC Metrics: Add the ability to track and log slow RPCs.
HADOOP-12334: Change Mode Of Copy Operation of HBase WAL Archiving to bypass Azure Storage Throttling after retries.
HADOOP-12350: WASB Logging: Improve WASB Logging around deletes, reads and writes.
HADOOP-12358: Add -safely flag to rm to prompt when deleting many files.
HADOOP-12407: Test failing: hadoop.ipc.TestSaslRPC.
HADOOP-12413: AccessControlList should avoid calling getGroupNames in isUserInList with empty groups.
HADOOP-12437: Allow SecurityUtil to lookup alternate hostnames.
HADOOP-12438: TestLocalFileSystem tests can fail on Windows after HDFS-8767 fix for handling pipe.
HADOOP-12542: TestDNS fails on Windows after HADOOP-12437.
HADOOP-7984: Add hadoop --loglevel option to change log level.
HADOOP-9629: Support Windows Azure Storage - Blob as a file system in Hadoop.
HDFS-4015: Safemode should count and report orphaned blocks.
HDFS-4366: Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks.
HDFS-4396: Add START_MSG/SHUTDOWN_MSG for ZKFC.
HDFS-4660: Block corruption can happen during pipeline recovery.
HDFS-4937: ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom.
HDFS-5782: BlockListAsLongs should take lists of Replicas rather than concrete class.
HDFS-6481: DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs.
HDFS-6581: Support for writing to single replica in RAM.
HDFS-6663: Admin command to track file and locations from block id.
HDFS-6860: BlockStateChange logs are too noisy.
HDFS-6917: Add an hdfs debug command to validate blocks, call recoverlease, etc.
HDFS-6945: BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed.
HDFS-6982: nntop: top-like tool for name node users.
HDFS-7009: Active NN and standby NN have different live nodes.
HDFS-7097: Allow block reports to be processed during checkpointing on standby name node.
HDFS-7153: Add storagePolicy to NN edit log during file creation.
HDFS-7182: JMX metrics aren't accessible when NN is busy.
HDFS-7213: processIncrementalBlockReport performance degradation.
HDFS-7222: Expose DataNode network errors as a metric.
HDFS-7263: Snapshot read can reveal future bytes for appended files.
HDFS-7278: Add a command that allows sysadmins to manually trigger full block reports from a DN.
HDFS-7435: PB encoding of block reports is very inefficient.
HDFS-7448: TestBookKeeperHACheckpoints fails in trunk build.
HDFS-7491: Add incremental blockreport latency to DN metrics.
HDFS-7575: Upgrade should generate a unique storage ID for each volume.
HDFS-7579: Improve log reporting during block report rpc failure.
HDFS-7596: NameNode should prune dead storages from storageMap.
HDFS-7604: Track and display failed DataNode storage locations in NameNode.
HDFS-7608: hdfs dfsclient newConnectedPeer has no write timeout.
HDFS-7611: deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
HDFS-7634: Disallow truncation of Lazy persist files.
HDFS-7643: Test case to ensure lazy persist files cannot be truncated.
HDFS-7659: truncate should check negative value of the new length.
HDFS-7704: DN heartbeat to Active NN may be blocked and expire if connection to Standby NN continues to time out.
HDFS-7707: Edit log corruption due to delayed block removal again.
HDFS-7738: Revise the exception message for recover lease; add more truncate tests such as truncate with HA setup, negative tests, truncate with other operations and multiple truncates.
HDFS-7858: Improve HA Namenode Failover detection on the client.
HDFS-7928: Scanning blocks from disk during rolling upgrade startup takes a lot of time if disks are busy.
HDFS-7931: DistributedFileSystem should not look for keyProvider in cache if encryption is disabled.
HDFS-7933: fsck should also report decommissioning replicas.
HDFS-7980: Incremental BlockReport will dramatically slow down the startup of a namenode.
HDFS-8055: NullPointerException when topology script is missing.
HDFS-8070: Pre-HDFS-7915 DFSClient cannot use short circuit on post-HDFS-7915 DataNode.
HDFS-8127: NameNode Failover during HA upgrade can cause DataNode to finalize upgrade.
HDFS-8155: Support OAuth2 in WebHDFS.
HDFS-8163: Using monotonicNow for block report scheduling causes test failures on recently restarted systems.
HDFS-8180: AbstractFileSystem Implementation for WebHdfs.
HDFS-8219: setStoragePolicy with folder behavior is different after cluster restart.
HDFS-8270: create() always retried with hardcoded timeout when file already exists with open lease.
HDFS-8384: Allow NN to startup if there are files having a lease but are not under construction.
HDFS-8435: Support CreateFlag in WebHdfs.
HDFS-8542: WebHDFS getHomeDirectory behavior does not match specification.
HDFS-8554: TestDatanodeLayoutUpgrade fails on Windows.
HDFS-8576: Lease recovery should return true if the lease can be released and the file can be closed.
HDFS-8722: Optimize datanode writes for small writes and flushes.
HDFS-8785: TestDistributedFileSystem is failing in trunk.
HDFS-8797: WebHdfsFileSystem creates too many connections for pread.
HDFS-8826: In Balancer, add an option to specify the source node list so that balancer only selects blocks to move from those nodes.
HDFS-8883: NameNode Metrics: Add FSNameSystem lock Queue Length.
HDFS-8885: ByteRangeInputStream used in webhdfs does not override available().
HDFS-8939: Test(S)WebHdfsFileContextMainOperations failing on branch-2.
HDFS-8950: NameNode refresh doesn't remove DataNodes that are no longer in the allowed list.
HDFS-8965: Harden edit log reading code against out of memory errors.
HDFS-8995: Flaw in registration bookkeeping can make DN die on reconnect.
HDFS-9009: Send metrics logs to NullAppender by default.
HDFS-9082: Change the log level in WebHdfsFileSystem.initialize() from INFO to DEBUG.
HDFS-9083: Replication violates block placement policy.
HDFS-9106: Transfer failure during pipeline recovery causes permanent write failures.
HDFS-9107: Prevent NN's unrecoverable death spiral after full GC.
HDFS-9109: Adding informative message to sticky bit permission denied exception.
HDFS-9112: Improve error message for Haadmin when multiple name service IDs are configured.
HDFS-9128: TestWebHdfsFileContextMainOperations and TestSWebHdfsFileContextMainOperations fail due to invalid HDFS path on Windows.
HDFS-9142: Separating Configuration object for namenode(s) in MiniDFSCluster.
HDFS-9175: Change scope of 'AccessTokenProvider.getAccessToken()' and 'CredentialBasedAccessTokenProvider.getCredential()' abstract methods to public.
HDFS-9178: Slow datanode I/O can cause a wrong node to be marked bad.
HDFS-9184: Logging HDFS operation's caller context into audit log.
HDFS-9205: Do not schedule corrupt blocks for replication.
HDFS-9220: Reading small file (smaller than 512 bytes) that is open for append fails due to incorrect checksum.
HDFS-9273: ACLs on root directory may be lost after NN restart.
HDFS-9305: Delayed heartbeat processing causes rapid storm of subsequent heartbeat messages.
HDFS-9311: Support optional offload of NameNode HA service health checks to a separate RPC server.
HDFS-9343: Empty caller context considered invalid.
HDFS-9354: Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows.
HDFS-9362: TestAuditLogger#testAuditLoggerWithCallContext assumes Unix line endings, fails on Windows.
HDFS-9384: TestWebHdfsContentLength intermittently hangs and fails due to TCP conversation mismatch between client and server.
MAPREDUCE-4815: Speed up FileOutputCommitter#commitJob for many output files.
MAPREDUCE-6238: MR2 can't run local jobs with the -libjars command option, which is a regression from MR1.
MAPREDUCE-6442: Stack trace is missing when error occurs in client protocol provider's constructor.
YARN-1979: TestDirectoryCollection fails when the umask is unusual.
YARN-1984: LeveldbTimelineStore does not handle db exceptions properly.
YARN-2165: Timeline server should validate the numeric configuration values.
YARN-2246: Made the proxy tracking URL always be http(s)://proxy addr:port/proxy/<appId> to avoid duplicate sections.
YARN-2254: TestRMWebServicesAppsModification should run against both CS and FS.
YARN-2513: Host framework UIs in YARN for use with the ATS.
YARN-2605: RM HA Rest api endpoints doing redirect incorrectly.
YARN-2816: NM fails to start with NPE during container recovery.
YARN-2821: Distributed shell app master becomes unresponsive sometimes.
YARN-2906: CapacitySchedulerPage shows HTML tags for a queue's Active Users.
YARN-2922: ConcurrentModificationException in CapacityScheduler's LeafQueue.
YARN-3197: Confusing log generated by CapacityScheduler.
YARN-3207: Secondary filter matches entities which do not have the key being filtered for.
YARN-3227: Timeline renew delegation token fails when RM user's TGT is expired.
YARN-3238: Connection timeouts to nodemanagers are retried at multiple levels.
YARN-3267: Timelineserver applies the ACL rules after applying the limit on the number of records.
YARN-3351: AppMaster tracking URL is broken in HA.
YARN-3448: Add Rolling Time To Lives Level DB Plugin Capabilities.
YARN-3469: ZKRMStateStore: Avoid setting watches that are not required.
YARN-3487: CapacityScheduler scheduler lock obtained unnecessarily when calling getQueue.
YARN-3526: ApplicationMaster tracking URL is incorrectly redirected on a QJM cluster.
YARN-3530: ATS throws exception on trying to filter results without otherinfo.
YARN-3654: ContainerLogsPage web UI should not have meta-refresh.
YARN-3700: Made generic history service load a number of latest applications according to the parameter or the configuration.
YARN-3766: Fixed the apps table column error of generic history web UI.
YARN-3787: Allowed generic history service to load a number of applications whose started time is within the given range.
YARN-3804: Both RMs are in standby state when the Kerberos user is not in yarn.admin.acl.
YARN-3978: Configurably turn off the saving of container info in Generic AHS.
YARN-4087: Followup fixes after YARN-2019 regarding RM behavior when state-store error occurs.
YARN-4105: Capacity Scheduler headroom for DRF is wrong.
YARN-4243: Add retry on establishing ZooKeeper connection in EmbeddedElectorService#serviceInit.
YARN-4313: Race condition in MiniMRYarnCluster when getting history server address.
HDP 2.2.8 provided Apache Hadoop Core 2.6.0 and the following additional Apache patches:
HADOOP-11321: copyToLocal cannot save a file to an SMB share unless the user has Full Control permissions.
HADOOP-11368: Fix SSLFactory truststore reloader thread leak in KMSClientProvider.
HADOOP-11381: Fix findbugs warnings in hadoop-distcp, hadoop-aws, hadoop-azure, and hadoop-openstack.
HADOOP-11412: POMs mention "The Apache Software License" rather than "Apache License".
HADOOP-11490: Expose truncate API via FileSystem and shell command.
HADOOP-11509: Change parsing sequence in GenericOptionsParser to parse -D parameters before -files.
HADOOP-11510: Expose truncate API via FileContext.
HADOOP-11523: StorageException complaining "no lease ID" when updating FolderLastModifiedTime in WASB.
HADOOP-11579: Documentation for truncate.
HADOOP-11595: Add default implementation for AbstractFileSystem#truncate.
HADOOP-926: Do not fail job history iteration when encountering missing directories.
HADOOP-941: Addendum patch.
HDFS-3107: Introduce truncate.
HDFS-7009: Active NN and standby NN have different live nodes.
HDFS-7056: Snapshot support for truncate.
HDFS-7058: Tests for truncate CLI.
HDFS-7263: Snapshot read can reveal future bytes for appended files.
HDFS-7425: NameNode block deletion logging uses incorrect appender.
HDFS-7443: Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are present in the same volume.
HDFS-7470: SecondaryNameNode needs twice the memory when calling reloadFromImageFile.
HDFS-7489: Incorrect locking in FsVolumeList#checkDirs can hang datanodes.
HDFS-7503: Namenode restart after large deletions can cause slow processReport.
HDFS-7606: Fix potential NPE in INodeFile.getBlocks().
HDFS-7634: Disallow truncation of Lazy persist files.
HDFS-7638: Small fix and few refinements for FSN#truncate.
HDFS-7643: Test case to ensure lazy persist files cannot be truncated.
HDFS-7655: Expose truncate API for Web HDFS.
HDFS-7659: Truncate should check negative value of the new length.
HDFS-7676: Fix TestFileTruncate to avoid bug of HDFS-7611.
HDFS-7677: DistributedFileSystem#truncate should resolve symlinks.
HDFS-7707: Edit log corruption due to delayed block removal again.
HDFS-7714: Simultaneous restart of HA NameNodes and DataNode can cause DataNode to register successfully with only one NameNode.
HDFS-7733: NFS: readdir/readdirplus return null directory attribute on failure.
HDFS-7738: Revise the exception message for recover lease; add more truncate tests such as truncate with HA setup, negative tests, truncate with other operations and multiple truncates.
HDFS-7760: Document truncate for WebHDFS.
HDFS-7831: Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks().
HDFS-7843: A truncated file is corrupted after rollback from a rolling upgrade.
HDFS-7885: Datanode should not trust the generation stamp provided by client.
MAPREDUCE-6230: Fixed RMContainerAllocator to update the new AMRMToken service name properly.
YARN-2246: Made the proxy tracking URL always be http(s)://proxy addr:port/proxy/<appId> to avoid duplicate sections.
YARN-2571: RM to support YARN registry.
YARN-2683: Registry config options: document and move to core-default.
YARN-2837: Support TimeLine server to recover delegation token when restarting.
YARN-2917: Fixed potential deadlock when system.exit is called in AsyncDispatcher.
YARN-2964: RM prematurely cancels tokens for jobs that submit jobs (oozie).
YARN-3103: AMRMClientImpl does not update AMRM token properly.
YARN-3207: Secondary filter matches entities which do not have the key being filtered for.
YARN-3227: Timeline renew delegation token fails when RM user's TGT is expired.
YARN-3239: WebAppProxy does not support a final tracking URL which has query fragments and params.
YARN-3251: Fixed a deadlock in CapacityScheduler when computing absoluteMaxAvailableCapacity in LeafQueue.
YARN-3269: yarn.nodemanager.remote-app-log-dir could not be configured to a fully qualified path.
YARN-570: Time strings are formatted in different timezone.