HDP-2.3.2 Release Notes

Hadoop

HDP 2.3.2 provides the following Apache patches:

NEW FEATURES

IMPROVEMENTS

  • HADOOP-10597 RPC Server signals backoff to clients when all request queues are full.

  • HADOOP-11960 Enable Azure-Storage Client Side logging.

  • HADOOP-12325 RPC Metrics: Add the ability to track and log slow RPCs.

  • HADOOP-12358 Add -safely flag to rm to prompt when deleting many files.

  • HDFS-4185 Add a metric for number of active leases.

  • HDFS-4396 Add START_MSG/SHUTDOWN_MSG for ZKFC.

  • HDFS-6860 BlockStateChange logs are too noisy.

  • HDFS-7923 The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages.

  • HDFS-8046 Allow better control of getContentSummary.

  • HDFS-8180 AbstractFileSystem Implementation for WebHdfs.

  • HDFS-8278 When computing max-size-to-move in Balancer, count only the storage with remaining >= default block size.

  • HDFS-8432 Introduce a minimum compatible layout version to allow downgrade in more rolling upgrade use cases.

  • HDFS-8435 Support CreateFlag in WebHDFS.

  • HDFS-8549 Abort the balancer if an upgrade is in progress.

  • HDFS-8797 WebHdfsFileSystem creates too many connections for pread.

  • HDFS-8818 Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable.

  • HDFS-8824 Do not use small blocks for balancing the cluster.

  • HDFS-8826 In Balancer, add an option to specify the source node list so that balancer only selects blocks to move from those nodes.

  • HDFS-8883 NameNode Metrics: Add FSNameSystem lock Queue Length.

  • HDFS-8911 NameNode Metrics: Add Editlog counters as a JMX metric.

  • HDFS-8983 NameNode support for protected directories.

  • YARN-2513 Host framework UIs in YARN for use with the ATS.

  • YARN-3197 Confusing log generated by CapacityScheduler.

  • YARN-3357 Move TestFifoScheduler to FIFO package.

  • YARN-3360 Add JMX metrics to TimelineDataManager.

  • YARN-3579 CommonNodeLabelsManager should support NodeLabel instead of string label name when getting node-to-label/label-to-label mappings.

  • YARN-3978 Configurably turn off the saving of container info in Generic AHS.

  • YARN-4082 Container shouldn't be killed when node's label updated.

  • YARN-4101 RM should print alert messages if there is a connection issue between ZooKeeper and the ResourceManager.

  • YARN-4149 yarn logs -am should provide an option to fetch all the log files.

BUG FIXES

  • HADOOP-11802 DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm.

  • HADOOP-12052 IPC client downgrades all exception types to IOE, breaks callers trying to use them.

  • HADOOP-12073 Azure FileSystem PageBlobInputStream does not return -1 on EOF.

  • HADOOP-12095 org.apache.hadoop.fs.shell.TestCount fails.

  • HADOOP-12304 Applications using FileContext fail with the default filesystem configured to be wasb/s3/etc.

  • HADOOP-8151 Error handling in snappy decompressor throws invalid exceptions.

  • HDFS-6945 BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed.

  • HDFS-7608 hdfs dfsclient newConnectedPeer has no write timeout.

  • HDFS-7609 Avoid retry cache collision when Standby NameNode loading edits.

  • HDFS-8309 Skip unit test using DataNodeTestUtils#injectDataDirFailure() on Windows.

  • HDFS-8310 Fix TestCLI.testAll "help for find" on Windows.

  • HDFS-8311 DataStreamer.transfer() should timeout the socket InputStream.

  • HDFS-8384 Allow NN to startup if there are files having a lease but are not under construction.

  • HDFS-8431 hdfs crypto class not found in Windows.

  • HDFS-8539 Hdfs doesn’t have class 'debug' in Windows.

  • HDFS-8542 WebHDFS getHomeDirectory behavior does not match specification.

  • HDFS-8593 Calculation of effective layout version mishandles comparison to current layout version in storage.

  • HDFS-8767 RawLocalFileSystem.listStatus() returns null for UNIX pipe file.

  • HDFS-8850 VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks.

  • HDFS-8863 The remaining space check in BlockPlacementPolicyDefault is flawed.

  • HDFS-8879 Quota by storage type usage incorrectly initialized upon namenode restart.

  • HDFS-8885 ByteRangeInputStream used in webhdfs does not override available().

  • HDFS-8932 NPE thrown in NameNode when trying to get TotalSyncCount metric before editLogStream initialization.

  • HDFS-8939 Test(S)WebHdfsFileContextMainOperations failing on branch-2.

  • HDFS-8969 Clean up findbugs warnings for HDFS-8823 and HDFS-8932.

  • HDFS-8995 Flaw in registration bookkeeping can make DN die on reconnect.

  • HDFS-9009 Send metrics logs to NullAppender by default.

  • YARN-3413 Changed Nodelabel attributes (like exclusivity) to be settable only via addToClusterNodeLabels but not changeable at runtime.

  • YARN-3885 ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 levels deep.

  • YARN-3894 RM startup should fail for wrong CS xml NodeLabel capacity configuration.

  • YARN-3896 RMNode transitioned from RUNNING to REBOOTED because its response id has not been reset synchronously.

  • YARN-3932 SchedulerApplicationAttempt#getResourceUsageReport and UserInfo should be based on total-used-resources.

  • YARN-3971 Skip RMNodeLabelsManager#checkRemoveFromClusterNodeLabelsOfQueue on nodelabel recovery.

  • YARN-4087 Followup fixes after YARN-2019 regarding RM behavior when state-store error occurs.

  • YARN-4092 Fixed UI redirection to print useful messages when both RMs are in standby mode.

OPTIMIZATION

  • HADOOP-11772 RPC Invoker relies on static ClientCache which has synchronized(this) blocks.

  • HADOOP-12317 Applications fail on NM restart on some Linux distros because NM container recovery declares AM container as LOST.

  • HADOOP-7713 dfs -count -q should label output column.

  • HDFS-8856 Make LeaseManager#countPath O(1).

  • HDFS-8867 Enable optimized block reports.

HDP 2.3.0 provided the following Apache patches:

NEW FEATURES

  • HDFS-8008 Support client-side back off when the datanodes are congested.

  • HDFS-8009 Signal congestion on the DataNode.

  • YARN-1376 NM needs to notify the log aggregation status to RM through heartbeat.

  • YARN-1402 Update related Web UI and CLI with exposing client API to check log aggregation status.

  • YARN-2498 Respect labels in preemption policy of capacity scheduler for inter-queue preemption.

  • YARN-2571 RM to support YARN registry.

  • YARN-2619 Added NodeManager support for disk IO isolation through cgroups.

  • YARN-3225 New parameter of CLI for decommissioning node gracefully in RMAdmin CLI.

  • YARN-3318 Create Initial OrderingPolicy Framework and FifoOrderingPolicy.

  • YARN-3319 Implement a FairOrderingPolicy.

  • YARN-3326 Support RESTful API for getLabelsToNodes.

  • YARN-3345 Add non-exclusive node label API.

  • YARN-3347 Improve YARN log command to get AMContainer logs as well as running containers logs.

  • YARN-3348 Add a 'yarn top' tool to help understand cluster usage.

  • YARN-3354 Add node label expression in ContainerTokenIdentifier to support RM recovery.

  • YARN-3361 CapacityScheduler side changes to support non-exclusive node labels.

  • YARN-3365 Enhanced NodeManager to support using the 'tc' tool via container-executor for outbound network traffic control.

  • YARN-3366 Enhanced NodeManager to support classifying/shaping outgoing network bandwidth traffic originating from YARN containers.

  • YARN-3410 YARN admin should be able to remove individual application records from RMStateStore.

  • YARN-3443 Create a 'ResourceHandler' subsystem to ease addition of support for new resource types on the NM.

  • YARN-3448 Added a rolling time-to-live LevelDB timeline store implementation.

  • YARN-3463 Integrate OrderingPolicy Framework with CapacityScheduler.

  • YARN-3505 Node's Log Aggregation Report with SUCCEED should not be cached in RMApps.

  • YARN-3541 Add version info on timeline service / generic history web UI and REST API.

IMPROVEMENTS

  • HADOOP-10597 RPC Server signals backoff to clients when all request queues are full.

  • YARN-1880 Cleanup TestApplicationClientProtocolOnHA.

  • YARN-2495 Allow admin to specify labels from each NM (Distributed configuration for node label).

  • YARN-2696 Queue sorting in CapacityScheduler should consider node label.

  • YARN-2868 FairScheduler: Metric for latency to allocate first container for an application.

  • YARN-2901 Add errors and warning metrics page to RM, NM web UI.

  • YARN-3243 CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obeys its capacity limits.

  • YARN-3248 Display count of nodes blacklisted by apps in the web UI.

  • YARN-3293 Track and display capacity scheduler health metrics in web UI.

  • YARN-3294 Allow dumping of Capacity Scheduler debug logs via web UI for a fixed time period.

  • YARN-3356 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

  • YARN-3362 Add node label usage in RM CapacityScheduler web UI.

  • YARN-3394 Enrich WebApplication proxy documentation.

  • YARN-3397 yarn rmadmin should skip -failover.

  • YARN-3404 Display queue name on application page.

  • YARN-3406 Display count of running containers in the RM's Web UI.

  • YARN-3451 Display attempt start time and elapsed time on the web UI.

  • YARN-3494 Expose AM resource limit and usage in CS QueueMetrics.

  • YARN-3503 Expose disk utilization percentage and bad local and log dir counts in NM metrics.

  • YARN-3511 Add errors and warnings page to ATS.

  • YARN-3565 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String.

  • YARN-3581 Deprecate -directlyAccessNodeLabelStore in RMAdminCLI.

  • YARN-3583 Support of NodeLabel object instead of plain String in YarnClient side.

  • YARN-3593 Add label-type and improve "DEFAULT_PARTITION" in Node Labels Page.

  • YARN-3700 Made generic history service load a number of latest applications according to the parameter or the configuration.

BUG FIXES

  • HADOOP-11859 PseudoAuthenticationHandler fails with httpcomponents v4.4.

  • HADOOP-7713 dfs -count -q should label output column.

  • HDFS-27 HDFS CLI with --config set to default config complains with a log file not found error.

  • HDFS-6666 Abort NameNode and DataNode startup if security is enabled but block access token is not enabled.

  • HDFS-7645 Fix CHANGES.txt.

  • HDFS-7645 Rolling upgrade is restoring blocks from trash multiple times.

  • HDFS-7701 Support reporting per storage type quota and usage with hadoop/hdfs shell.

  • HDFS-7890 Improve information on Top users for metrics in RollingWindowsManager and lower log level.

  • HDFS-7933 fsck should also report decommissioning replicas.

  • HDFS-7990 IBR delete ack should not be delayed.

  • HDFS-8008 Support client-side back off when the datanodes are congested.

  • HDFS-8009 Signal congestion on the DataNode.

  • HDFS-8055 NullPointerException when topology script is missing.

  • HDFS-8144 Split TestLazyPersistFiles into multiple tests.

  • HDFS-8152 Refactoring of lazy persist storage cases.

  • HDFS-8205 CommandFormat#parse() should not parse option as value of option.

  • HDFS-8211 DataNode UUID is always null in the JMX counter.

  • HDFS-8219 setStoragePolicy with folder behavior is different after cluster restart.

  • HDFS-8229 LAZY_PERSIST file gets deleted after NameNode restart.

  • HDFS-8232 Missing datanode counters when using Metrics2 sink interface.

  • HDFS-8276 LazyPersistFileScrubber should be disabled if scrubber interval configured zero.

  • YARN-2666 TestFairScheduler.testContinuousScheduling fails intermittently.

  • YARN-2740 Fix NodeLabelsManager to properly handle node label modifications when distributed node label configuration enabled.

  • YARN-2821 Fixed a problem that DistributedShell AM may hang if restarted.

  • YARN-3110 Few issues in ApplicationHistory web UI.

  • YARN-3136 Fixed a synchronization problem of AbstractYarnScheduler#getTransferredContainers.

  • YARN-3266 RMContext#inactiveNodes should have NodeId as map key.

  • YARN-3269 yarn.nodemanager.remote-app-log-dir could not be configured to a fully qualified path.

  • YARN-3305 Normalize AM resource request on app submission.

  • YARN-3343 Increased TestCapacitySchedulerNodeLabelUpdate#testNodeUpdate timeout.

  • YARN-3383 AdminService should use "warn" instead of "info" to log exception when operation fails.

  • YARN-3387 Previous AM's container completed status couldn't pass to current AM if AM and RM restarted at the same time.

  • YARN-3425 NPE from RMNodeLabelsManager.serviceStop when NodeLabelsManager.serviceInit failed.

  • YARN-3435 AM container to be allocated Appattempt AM container shown as null.

  • YARN-3459 Fix failure of TestLog4jWarningErrorMetricsAppender.

  • YARN-3517 RM web UI for dumping scheduler logs should be for admins only.

  • YARN-3530 ATS throws exception on trying to filter results without otherinfo.

  • YARN-3552 RM Web UI shows -1 running containers for completed apps.

  • YARN-3580 [JDK8] TestClientRMService.testGetLabelsToNodes fails.

  • YARN-3632 Ordering policy should be allowed to reorder an application when demand changes.

  • YARN-3654 ContainerLogsPage web UI should not have meta-refresh.

  • YARN-3707 RM Web UI queue filter doesn't work.

  • YARN-3740 Fixed the typo in the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS.