What's New in HDFS
Learn about the new features of HDFS in Cloudera Runtime 7.3.2, its service packs and cumulative hotfixes.
Cloudera Runtime 7.3.2
- Support for G1 Garbage Collector (G1GC) with JDK 17
- When running HDFS on JDK 17, all HDFS server processes now automatically use the G1 Garbage Collector (G1GC). No action is required for most deployments.
- Hadoop rebase summary
- In Cloudera Runtime 7.3.2, Apache Hadoop is rebased to version 3.4.1. The Apache Hadoop upgrade improves overall performance and includes all the new features, improvements, and bug fixes from versions 3.2, 3.3, and 3.4.
Table 1. New features added from Apache Hadoop versions 3.2 to 3.4

- **3.2, HDFS-10285: Storage Policy Satisfier in HDFS.** StoragePolicySatisfier (SPS) allows you to track and fulfill the storage policy requirement of a given file or directory in HDFS. You can specify a file or directory path for SPS to evaluate by running the `hdfs storagepolicies -satisfyStoragePolicy -path <path>` command or by invoking the `HdfsAdmin#satisfyStoragePolicy(path)` API. If blocks have storage policy mismatches, SPS moves the replicas to a different storage type to fulfill the storage policy requirements. Because API calls go to the NameNode to track the invoked satisfier paths (iNodes), you must enable SPS on the NameNode by setting the `dfs.storage.policy.satisfier.mode` property to `external` in the hdfs-site.xml file. You can disable this configuration dynamically without restarting the NameNode. SPS must be started outside the NameNode using the `hdfs --daemon start sps` command. To run the Mover tool explicitly, you must disable SPS first. For more information, see the Storage Policy Satisfier (SPS) section in Archival Storage. By default, Storage Policy Satisfier is disabled in Cloudera HDFS.
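  The external SPS mode described above amounts to a single NameNode-side property; a minimal hdfs-site.xml sketch:

  ```xml
  <!-- hdfs-site.xml on the NameNode: run the Storage Policy Satisfier externally -->
  <property>
    <name>dfs.storage.policy.satisfier.mode</name>
    <value>external</value>
  </property>
  ```

  The satisfier daemon is then started outside the NameNode with `hdfs --daemon start sps`, after which paths can be submitted with `hdfs storagepolicies -satisfyStoragePolicy -path <path>`.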
- **3.3, HDFS-14118: DNS resolution for nameservices to IP addresses.** HDFS clients can use a single domain name to discover servers such as NameNodes, routers, or observers, instead of explicitly listing all hosts in the configuration.
- **3.3, HDFS-13783: Long-running service process for the Balancer.** Adds the `-asService` parameter to the Balancer CLI. This parameter enables the Balancer to run as a long-running service process for easier monitoring. By default, this service mode is disabled in Cloudera HDFS.
- **3.3, HADOOP-16398: Hadoop metrics export to Prometheus.** If the `hadoop.prometheus.endpoint.enabled` configuration is set to `true`, Prometheus-friendly formatted metrics can be obtained from the `/prom` endpoint of Hadoop daemons. The default value is `false`. By default, the configuration is disabled in Cloudera HDFS.
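  As a minimal sketch, the property above is set in core-site.xml (the host and port in the usage note below are illustrative defaults):

  ```xml
  <!-- core-site.xml: expose Prometheus-formatted metrics on the /prom endpoint -->
  <property>
    <name>hadoop.prometheus.endpoint.enabled</name>
    <value>true</value>
  </property>
  ```

  After restarting the daemons, metrics can be scraped from, for example, `http://<namenode-host>:9870/prom`, assuming the default NameNode web UI port of 9870.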
- **3.3, HDFS-13571: Dead node detection.** When an unresponsive node blocks a `DFSInputStream`, the system detects the failure and shares this information with the other streams in the same `DFSClient`. This prevents the remaining streams from attempting to read from, and being blocked by, the dead node.
- **3.4, HADOOP-17010: Queue capacity weights in FairCallQueue.** When the `FairCallQueue` feature is enabled, you can specify capacity allocation among all sub-queues using the `ipc.<port>.callqueue.capacity.weights` configuration. The value of this configuration is a comma-separated list of positive integers, each of which specifies the weight associated with the sub-queue at that index. The number of weights in this list must match the IPC scheduler priority levels defined in `scheduler.priority.levels`. By default, each sub-queue is associated with a weight of 1, meaning all sub-queues are allocated the same capacity. This configuration is optional in Cloudera HDFS.
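  As an illustrative sketch (the port 8020, the level count, and the weights are example values, not recommendations), a three-level scheduler with weighted sub-queues could look like this:

  ```xml
  <!-- core-site.xml: example weights for a FairCallQueue serving RPC port 8020 -->
  <property>
    <name>ipc.8020.scheduler.priority.levels</name>
    <value>3</value>
  </property>
  <property>
    <!-- one weight per priority level; a higher weight gives that sub-queue a larger share of capacity -->
    <name>ipc.8020.callqueue.capacity.weights</name>
    <value>5,3,1</value>
  </property>
  ```

  With these values, the highest-priority sub-queue receives 5/9 of the call queue capacity, the next 3/9, and the lowest 1/9.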
- **3.4, HDFS-13183: Standby NameNode processing of getBlocks requests to reduce the active load.** Enables the Balancer to redirect `getBlocks` requests to a standby NameNode, reducing the performance impact of the Balancer on the active NameNode. The feature is disabled by default. To enable it, set the `dfs.ha.allow.stale.reads` configuration of the Balancer to `true` in the hdfs-site.xml file.
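  A minimal sketch of the Balancer-side setting described above:

  ```xml
  <!-- hdfs-site.xml used by the Balancer process: allow getBlocks to be served by a standby NameNode -->
  <property>
    <name>dfs.ha.allow.stale.reads</name>
    <value>true</value>
  </property>
  ```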
- **3.4, HDFS-15025: NVDIMM storage media support for HDFS.** Adds the NVDIMM storage type and the ALL_NVDIMM storage policy for HDFS. The NVDIMM storage type is for non-volatile random-access memory storage media whose data survives DataNode restarts.
- **3.4, HDFS-15098: SM4 encryption method for HDFS.** Adds the `SM4/CTR/NoPadding` encryption codec. The native implementation requires OpenSSL 1.1.1 or a higher version.
- **3.4, HDFS-15747: RBF: Renaming across sub-namespaces.** Supports renaming across multiple sub-namespaces using the Federation balance tool.
- **3.4, HDFS-15547: Dynamic disk-level tiering.** Archival Storage allows the configuration of the DISK and ARCHIVE storage types on the same device, which enables optimizing disk I/O in heterogeneous environments.
- **3.4, HDFS-16595: Slow peer metrics.** The NameNode metrics that represent the slownode JSON now include three important factors, namely median, median absolute deviation, and upper latency limit, which can help you determine how urgently a given slownode requires manual intervention.
Table 2. Improvements added from Apache Hadoop versions 3.2 to 3.4

- **3.3, HDFS-7133: Clearance of the namespace quota on /.** The namespace quota on the root directory (/) can now be cleared.
- **3.3, HADOOP-13363: Upgrade of protobuf from 2.5.0 to the latest version.** Upgrades protobuf from version 2.5.0 to the latest version.
- **3.3, HADOOP-16670: Removal of Submarine code from the Hadoop codebase.** The Submarine subproject has moved to its own Apache Top Level Project. Therefore, the Submarine code has been removed from Hadoop 3.3.0 and higher versions. For more information, see https://submarine.apache.org/.
- **3.3, HADOOP-17021: Addition of the -concat fs command.** The `hadoop fs` utility includes a `-concat` command available on all filesystems that support the concat API, including HDFS and WebHDFS.
- **3.4, HADOOP-17222: Creation of socket addresses leveraging the URI cache.** The DFS client can use the newly added URI cache when creating a socket address for read operations. When enabled, creating a socket address uses a cached URI object based on `host:port`, which reduces the frequency of URI object creation. To enable it, set the `dfs.client.read.uri.cache.enabled` property to `true` in the hdfs-site.xml file. By default, the configuration is disabled in Cloudera HDFS.
- **3.4, HDFS-15814: Configurable parameters for DataNodeDiskMetrics.** Makes certain parameters of `DataNodeDiskMetrics` configurable.
- **3.4, HADOOP-16747: Support for Python 3.** Adds support for Python 3.
- **3.4, HADOOP-17514: Removal of the trace subcommand from the Hadoop CLI.** The `trace` subcommand of the Hadoop CLI has been removed as a follow-up to the removal of the TraceAdmin protocol.
- **3.4, HADOOP-16524: Automatic keystore reloading for HttpServer2.** Adds automatic reloading of the keystore through the new `ssl.{0}.stores.reload.interval` configuration (default is 10 seconds). The refresh interval is used to check whether either the truststore or keystore certificate file has changed.
- **3.4, HDFS-15975: Replacement of LongAdder with AtomicLong.** This update changes public fields in `DFSHedgedReadMetrics`. If you are using the public member variables of `DFSHedgedReadMetrics`, you must access them through the public API.
- **3.4, HADOOP-17956: Replacement of all default Charset usage with UTF-8.** All default charset usages have been replaced with UTF-8. If the default charset of your environment is not UTF-8, the behavior can differ.
- **3.4, HDFS-16278, HDFS-16285, and HDFS-16419: Cross-platform support for HDFS snapshot tools.** The HDFS snapshot tools are now supported cross-platform.
- **3.4, HDFS-15382: Division of the single FsDatasetImpl lock into volume-grained locks.** Throughput is one of the core performance measures of a DataNode instance. However, it does not always reach the best possible performance, especially in Federation deployments, because of the global coarse-grained lock. HDFS-16534, HDFS-16511, HDFS-15382, and HDFS-16429 split the global coarse-grained lock into fine-grained locks, a two-level lock for block pool and volume, to improve throughput and avoid lock contention between block pools and volumes.
- **3.4, HADOOP-18188: Support for the touch command for directories.** Previously, the `hadoop fs -touch` command threw a `PathIsDirectoryException` error for a directory. The command now supports directories and no longer throws that exception.
- **3.4, HADOOP-13332: Removal of Jackson 1.9.13 and migration to the 2.x codebase.** Removes Jackson 1.9.13 and switches all Jackson code to the 2.x codebase.
- **3.4, HADOOP-18332: Removal of the rs-api dependency by downgrading Jackson to 2.12.7.** Downgrades Jackson from 2.13.2 to 2.12.7 to fix class conflicts in downstream projects. This version of Jackson contains the fix for CVE-2020-36518.
- **3.4, HADOOP-18442: Removal of the hadoop-openstack module.** The `swift://` connector for OpenStack support has been removed because of fundamental limitations, such as Swift's handling of files greater than 4 GB. Because most object store services now export a subset of the S3 protocol, use the s3a connector instead. The `hadoop-openstack` JAR remains as an empty artifact to prevent build failures for projects that still declare it as a dependency.
- **3.4, HADOOP-16206: Migration from commons-logging to reload4j.** Migrates commons-logging to reload4j.
- **3.4, HADOOP-17524: Removal of EventCounter and Log counters from JVMMetrics.** Removes the log counter from JVMMetrics because this feature strongly depends on the Log4j 1.x API.
- **3.4, HADOOP-18206: Cleanup of commons-logging references in the codebase.** Cleans up the commons-logging references in the codebase.
- **3.4, HADOOP-18820: AWS SDK v2: Optional v1 bridging support.** The v1 aws-sdk-bundle JAR is removed. It is only required for third-party applications or when using v1 SDK `AWSCredentialsProvider` classes. Standard providers automatically migrate from the v1 to the v2 classes. Therefore, this change only affects third-party providers or environments using very esoteric classes in the v1 SDK. For more information, see Upgrading S3A to AWS SDK V2.
- **3.4, HADOOP-18876: ABFS: Default buffer change from disk to bytebuffer for fs.azure.data.blocks.buffer.** The default value of the `fs.azure.data.blocks.buffer` configuration has changed from `disk` to `bytebuffer`. This speeds up writing to Azure storage, at the risk of running out of memory, especially if many threads write to ABFS at the same time and the upload bandwidth is limited. If jobs run out of memory while writing to ABFS, change the option back to `disk`.
- **3.4, HADOOP-18993: S3A: Addition of the fs.s3a.classloader.isolation option.** To load custom implementations of AWS credential providers through user-provided JARs, set the `fs.s3a.classloader.isolation` configuration to `false`.
- **3.4, HADOOP-19120: ABFS: Adaptation of ApacheHttpClient as a network library.** Apache HttpClient 4.5.x is a new implementation of HTTP connections that supports a large, configurable pool of connections along with the ability to limit their lifespan.
  Configure the networking library using the `fs.azure.networking.library` configuration option. The following values are supported:
  - `JDK_HTTP_URL_CONNECTION`: uses the JDK networking library (default)
  - `APACHE_HTTP_CLIENT`: uses Apache HttpClient
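  For example, switching ABFS to the Apache HttpClient implementation is a single client-side setting; a sketch (where exactly the property is set may vary by deployment):

  ```xml
  <!-- core-site.xml: use Apache HttpClient for ABFS connections -->
  <property>
    <name>fs.azure.networking.library</name>
    <value>APACHE_HTTP_CLIENT</value>
  </property>
  ```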
Table 3. Issues fixed between Apache Hadoop versions 3.2 and 3.4

- **3.3, HDFS-14396: Image loading failure from FSImageFile during downgrades from 3.x to 2.x.** During a rolling upgrade from Hadoop 2.x to 3.x, the NameNode could not persist erasure coding information, and therefore a user could not start using the erasure coding feature until the finalize operation was complete.
- **3.3, HADOOP-15358: SFTPConnectionPool connection leakage.** A connection leak occurred in SFTPConnectionPool.
- **3.3, HDFS-14845: Bypass of AuthenticationFilterInitializer for HttpFSServerWebServer.** HttpFS previously ignored the standard Hadoop authentication settings. HttpFS now honors the `hadoop.http.authentication.*` configurations, and the previous `httpfs.authentication.*` configurations are deprecated. If both configurations are set, the deprecated `httpfs.authentication.*` configurations take precedence for backward compatibility.
- **3.4, HDFS-15380: RBF: Fetch failure for the real remote IP in RouterWebHdfsMethods.** RBF could not fetch the real remote IP in RouterWebHdfsMethods.
- **3.4, HDFS-15281: ZKFC host binding fallback from dfs.namenode.rpc-bind-host to dfs.namenode.rpc-address.** ZKFC previously ignored the `dfs.namenode.rpc-bind-host` configuration. ZKFC now correctly binds the host address using `dfs.namenode.servicerpc-bind-host` if configured, falling back to `dfs.namenode.rpc-bind-host`. If neither configuration exists, ZKFC binds to the NameNode RPC server address (the `dfs.namenode.rpc-address` configuration).
- **3.4, HADOOP-18237: Apache Xerces Java upgrade to 2.12.2.** An older version of Apache Xerces Java contained vulnerabilities. Apache Xerces has been updated to 2.12.2 to resolve CVE-2022-23437.
- **3.4, HADOOP-18621: CryptoOutputStream::close leakage.** A resource leak occurred in `CryptoOutputStream::close` when encrypted zones were combined with quota exceptions.
- **3.4, HADOOP-18329: Missing support for Semeru JRE 11.0.15.0.** Support was missing for Semeru Runtimes: because of vendor name-based logic and changes in the `java.vendor` property, failures could occur on Java 11 runtimes 11.0.15.0 and higher.
- **3.4, HADOOP-19221: S3A: Recovery failure for S3A multipart block upload attempts (Status Code: 400; Error Code: RequestTimeout).** S3A upload operations failed to recover when the store returned a 500 error. These operations now recover successfully. You can control this retry behavior for 50x errors, excluding 503 throttling events (which remain independently processed), using the `fs.s3a.retry.http.5xx.errors` property. The default value is `true`.
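  As a sketch, the retry behavior described above can be disabled explicitly if fail-fast behavior is preferred for 50x responses:

  ```xml
  <!-- core-site.xml: fail fast on S3 5xx errors instead of retrying
       (503 throttling responses are still handled by the separate throttling retry logic) -->
  <property>
    <name>fs.s3a.retry.http.5xx.errors</name>
    <value>false</value>
  </property>
  ```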
