Hadoop uses topology scripts to determine the rack location of each node. It uses this information to place replicas of block data on separate racks, so that the data survives the loss of a single rack.
Create a topology script and data file.
Sample Topology Script
File name: rack-topology.sh
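    #!/bin/bash
    # Print the rack path for each host name or IP address passed as an argument.
    HADOOP_CONF=/etc/hadoop/conf

    while [ $# -gt 0 ] ; do
      nodeArg=$1
      # Re-read the data file from the start for every argument.
      exec< ${HADOOP_CONF}/topology.data
      result=""
      while read line ; do
        ar=( $line )
        if [ "${ar[0]}" = "$nodeArg" ] ; then
          result="${ar[1]}"
        fi
      done
      shift
      # Fall back to the default rack when the node is not listed.
      if [ -z "$result" ] ; then
        echo -n "/default/rack "
      else
        echo -n "$result "
      fi
    done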
Sample Topology Data File
File name: topology.data
    hadoopdata1.ec.com /dc1/rack1
    hadoopdata1 /dc1/rack1
    10.1.1.1 /dc1/rack2
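To make the relationship between the script and the data file concrete, here is a minimal sketch of the same lookup written as a shell function. The name lookup_rack and the temporary demo file are hypothetical and exist only for illustration. As in the sample script, the first column must match the argument exactly, the second column is returned, and unlisted nodes fall back to /default/rack.

```shell
#!/bin/bash
# Hypothetical helper (not part of Hadoop): look up one host in a
# topology data file.
# Usage: lookup_rack DATA_FILE HOST
lookup_rack() {
  local data_file=$1 host=$2 name rack result=""
  # Each line holds "<host-name-or-IP> <rack-path>".
  while read -r name rack _; do
    if [ "$name" = "$host" ]; then
      result=$rack
    fi
  done < "$data_file"
  # Fall back to the default rack, as the sample script does.
  echo "${result:-/default/rack}"
}

# Demo on a throwaway copy of the sample data, not the real conf directory.
demo_data=$(mktemp)
printf '%s\n' \
  'hadoopdata1.ec.com /dc1/rack1' \
  'hadoopdata1 /dc1/rack1' \
  '10.1.1.1 /dc1/rack2' > "$demo_data"

lookup_rack "$demo_data" hadoopdata1.ec.com   # prints /dc1/rack1
lookup_rack "$demo_data" 10.1.1.1             # prints /dc1/rack2
lookup_rack "$demo_data" unknown-host         # prints /default/rack
rm -f "$demo_data"
```

Because the match is exact, a node is only found under the names that are actually listed, which is presumably why the sample data file lists both a fully qualified and a short host name.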
Copy both of these files to /etc/hadoop/conf.
Run the rack-topology.sh script to ensure that it returns the correct rack information for each host.
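One way to carry out this check is sketched below. It runs the script once for every entry in the data file and compares the output with the rack listed there. The function name verify_topology and the SCRIPT and DATA variables are hypothetical. The sketch assumes the script has been made executable and that each host appears only once in the data file.

```shell
#!/bin/bash
# Hypothetical helper: confirm that a topology script returns, for every
# entry in a data file, the rack listed in that file.
# Usage: verify_topology SCRIPT DATA_FILE
verify_topology() {
  local script=$1 data_file=$2 host expected actual status=0
  # Read the data file on descriptor 3 so the script under test
  # cannot consume it through standard input.
  while read -r host expected _ <&3; do
    [ -z "$host" ] && continue
    # The sample script prints a trailing space; strip it before comparing.
    actual=$("$script" "$host" | tr -d ' \n')
    if [ "$actual" = "$expected" ]; then
      echo "OK   $host -> $actual"
    else
      echo "FAIL $host -> got '$actual', expected '$expected'"
      status=1
    fi
  done 3< "$data_file"
  return "$status"
}

# Default paths follow the steps above; override them through the
# environment if your layout differs.
SCRIPT=${SCRIPT:-/etc/hadoop/conf/rack-topology.sh}
DATA=${DATA:-/etc/hadoop/conf/topology.data}
if [ -x "$SCRIPT" ] && [ -r "$DATA" ]; then
  verify_topology "$SCRIPT" "$DATA"
else
  echo "Skipping check: $SCRIPT or $DATA not found" >&2
fi
```

The function returns a non-zero status if any host resolves to an unexpected rack, so it can also be called from a deployment script.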