This section describes how to start core Hadoop and run simple smoke tests. Use the following instructions to validate the core Hadoop installation:
Format and start HDFS.
Execute these commands on the NameNode:
su $HDFS_USER
/usr/lib/hadoop/bin/hadoop namenode -format
/usr/lib/hadoop/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
Execute these commands on the Secondary NameNode:
su $HDFS_USER
/usr/lib/hadoop/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start secondarynamenode
Execute these commands on all DataNodes:
su $HDFS_USER
/usr/lib/hadoop/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode
where:
$HDFS_USER is the user owning the HDFS services. For example, hdfs.
$HADOOP_CONF_DIR is the directory for storing the Hadoop configuration files. For example, /etc/hadoop/conf.
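After starting the daemons above, a quick sanity sketch (assuming the JDK's jps tool is on PATH) is to confirm each expected HDFS daemon appears in the Java process list on its host. The check_daemon helper below is hypothetical, not part of Hadoop:

```shell
# Hypothetical helper: report whether a named Hadoop daemon shows up in jps output.
check_daemon() {
  # $1 = expected daemon name, $2 = jps output to search
  if echo "$2" | grep -qw "$1"; then
    echo "$1: running"
  else
    echo "$1: NOT running"
  fi
}

JPS_OUT="$(jps 2>/dev/null || true)"
check_daemon NameNode "$JPS_OUT"            # run on the NameNode
check_daemon SecondaryNameNode "$JPS_OUT"   # run on the Secondary NameNode
check_daemon DataNode "$JPS_OUT"            # run on each DataNode
```

The -w flag keeps a search for NameNode from also matching the SecondaryNameNode process name.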
Smoke Test HDFS.
See if you can reach the NameNode server with your browser:
http://$namenode.full.hostname:50070
Try copying a file into HDFS and listing that file:
su $HDFS_USER
/usr/lib/hadoop/bin/hadoop dfs -copyFromLocal /etc/passwd passwd-test
/usr/lib/hadoop/bin/hadoop dfs -ls
Test browsing HDFS:
http://$datanode.full.hostname:50075/browseDirectory.jsp?dir=/
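As a command-line alternative to the browser checks, the standard dfsadmin and fsck tools summarize cluster health; this sketch assumes the client is installed at /usr/lib/hadoop and skips otherwise:

```shell
# Sketch: command-line HDFS health checks (assumes the client path used above).
HADOOP=/usr/lib/hadoop/bin/hadoop
if [ -x "$HADOOP" ]; then
  "$HADOOP" dfsadmin -report   # capacity plus live/dead DataNode counts
  "$HADOOP" fsck /             # overall filesystem health
else
  echo "hadoop client not found at $HADOOP; skipping"
fi
```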
Start MapReduce.
Execute these commands from the JobTracker server:
su $HDFS_USER
/usr/lib/hadoop/bin/hadoop fs -mkdir /mapred
/usr/lib/hadoop/bin/hadoop fs -chown -R mapred /mapred
su $MAPRED_USER
/usr/lib/hadoop/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker
Execute these commands from the JobHistory server:
su $MAPRED_USER
/usr/lib/hadoop/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start historyserver
Execute these commands from all TaskTracker nodes:
su $MAPRED_USER
/usr/lib/hadoop/bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start tasktracker
where:
$HDFS_USER is the user owning the HDFS services. For example, hdfs.
$MAPRED_USER is the user owning the MapReduce services. For example, mapred.
$HADOOP_CONF_DIR is the directory for storing the Hadoop configuration files. For example, /etc/hadoop/conf.
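The same jps-based sanity sketch applies to the MapReduce daemons (assuming jps is on PATH; JobTracker and TaskTracker are the usual Hadoop 1.x process names, but names can vary by distribution). The mr_daemon_running helper is hypothetical:

```shell
# Hypothetical helper: report whether a named MapReduce daemon shows up in jps output.
mr_daemon_running() {
  # $1 = expected daemon name, $2 = jps output to search
  if echo "$2" | grep -qw "$1"; then
    echo "$1: running"
  else
    echo "$1: NOT running"
  fi
}

JPS_OUT="$(jps 2>/dev/null || true)"
mr_daemon_running JobTracker "$JPS_OUT"    # run on the JobTracker server
mr_daemon_running TaskTracker "$JPS_OUT"   # run on each TaskTracker node
```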
Smoke Test MapReduce.
Try browsing to the JobTracker:
http://$jobtracker.full.hostname:50030/
Smoke test by using teragen to generate 10 GB of data (100,000,000 rows of 100 bytes each) and then using terasort to sort that data.
su $HDFS_USER
/usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop/hadoop-examples.jar teragen 100000000 /test/10gsort/input
/usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop/hadoop-examples.jar terasort /test/10gsort/input /test/10gsort/output
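As an optional follow-up, the teravalidate tool in the same examples jar checks that the terasort output is globally sorted, writing error records to a report directory if any keys are out of order. This sketch mirrors the paths used above and skips if the client is not installed:

```shell
# Sketch: validate the terasort output with teravalidate (same examples jar).
HADOOP=/usr/lib/hadoop/bin/hadoop
EXAMPLES=/usr/lib/hadoop/hadoop-examples.jar
if [ -x "$HADOOP" ]; then
  su $HDFS_USER -c "$HADOOP jar $EXAMPLES teravalidate \
      /test/10gsort/output /test/10gsort/validate"
else
  echo "hadoop client not found at $HADOOP; skipping"
fi
```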