Configuring HDFS rack awareness
In a CDP Data Center cluster, the NameNode maintains the rack IDs of all the DataNodes. The NameNode uses this information about how DataNodes are distributed across the racks in the cluster to select the closest DataNodes for effective block placement during read or write operations. This practice of selecting DataNodes based on their location in the cluster is termed rack awareness. Rack awareness also helps maintain fault tolerance: because the replicas of a block are spread across more than one rack, the data remains available if a single rack fails.
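To make the idea of "closer" DataNodes concrete, the following minimal sketch ranks replica locations by their distance from a reader in a rack/node tree. It is an illustration only, not the NameNode's actual code: the function name `network_distance` and the locations such as `/rack1/node1` are hypothetical, and distance is assumed to be the number of hops from each location up to their closest common ancestor.

```python
def network_distance(a: str, b: str) -> int:
    """Count the hops from each location up to their closest common ancestor."""
    path_a = a.strip("/").split("/")
    path_b = b.strip("/").split("/")

    # Length of the shared prefix, for example the rack both nodes sit in.
    common = 0
    for part_a, part_b in zip(path_a, path_b):
        if part_a != part_b:
            break
        common += 1

    return (len(path_a) - common) + (len(path_b) - common)


# Hypothetical reader and replica locations, written as /<rack>/<node>.
reader = "/rack1/node1"
replicas = ["/rack2/node4", "/rack1/node2", "/rack1/node1"]

# Prefer the local node, then a node on the same rack, then a remote rack.
ordered = sorted(replicas, key=lambda r: network_distance(reader, r))
print(ordered)
```

Under these assumptions a replica on the same node has distance 0, one on the same rack has distance 2, and one on a different rack has distance 4, so the local replica sorts first.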
Configuring rack awareness involves creating a rack topology script, adding the script to core-site.xml, restarting HDFS, and verifying the rack awareness.
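As a rough illustration, a rack topology script of the kind that core-site.xml can point to might look like the sketch below. Hadoop calls such a script with one or more IP addresses or host names as arguments and expects one rack ID per argument on standard output. The host-to-rack mapping, the rack names, and the helper `resolve_rack` are hypothetical placeholders, not values from any real cluster.

```python
#!/usr/bin/env python3
"""Minimal rack topology script sketch: prints one rack ID per argument."""
import sys

# Hypothetical mapping of DataNode addresses to rack IDs; replace with your own.
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}

# Rack ID returned for hosts that are not listed in the mapping.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host: str) -> str:
    """Return the rack ID for a host, falling back to the default rack."""
    return RACK_BY_HOST.get(host, DEFAULT_RACK)


def main(argv: list) -> None:
    # Emit the rack IDs in the same order as the arguments were passed.
    for host in argv:
        print(resolve_rack(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

In a plain Apache Hadoop setup, the path to an executable script like this is typically set through the `net.topology.script.file.name` property in core-site.xml, and after HDFS restarts the resulting mapping can usually be inspected with `hdfs dfsadmin -printTopology`. In a managed cluster the equivalent settings may be exposed through the management console instead, so treat both as assumptions to check against your environment.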