Chapter 16. Installing Apache Mahout

Install Apache Mahout only on the machine where you will run it: either a Hadoop node or your client environment. Do not install it on every node in your cluster.

To install the Mahout package, use the command for your operating system:

  • For RHEL/CentOS/Oracle Linux:

    yum install mahout

  • For SLES:

    zypper install mahout

  • For Ubuntu and Debian:

    apt-get install mahout
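
The per-platform commands above can be folded into one helper. The following is a minimal sketch, not part of the official procedure: the function name mahout_install_cmd is hypothetical, and the OS identifiers are assumed to match the ID field that /etc/os-release conventionally reports (rhel, centos, ol, sles, ubuntu, debian). It only prints the command so that you can review it before running it with root privileges.

```shell
#!/bin/sh
# Hypothetical helper: print the Mahout install command for a given OS ID.
# The IDs are assumed to match the ID= field of /etc/os-release.
mahout_install_cmd() {
  case "$1" in
    rhel|centos|ol)  echo "yum install mahout" ;;
    sles)            echo "zypper install mahout" ;;
    ubuntu|debian)   echo "apt-get install mahout" ;;
    *)               echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}
```

On a system that provides /etc/os-release, one way to call it is to source that file and pass "$ID" as the argument. An unrecognized ID produces an error on stderr and a non-zero return status instead of a guess.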

To validate Mahout:

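  1. Create a test user (testuser in these examples), then copy the sample text file /tmp/sample-test.txt into that user's HDFS home directory: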

    hdfs dfs -put /tmp/sample-test.txt /user/testuser

  2. Create a mahout test output directory:

    hdfs dfs -mkdir /user/testuser/mahouttest

  3. Use Mahout to convert the plain text file sample-test.txt into a sequence file in the output directory mahouttest:

    mahout seqdirectory --input /user/testuser/sample-test.txt --output /user/testuser/mahouttest --charset utf-8
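
The validation steps above can be collected into a single script. This is a minimal sketch under stated assumptions: the names run, validate_mahout, and DRY_RUN are hypothetical and are not part of Mahout or HDFS. The final hdfs dfs -ls call is an added check that is not in the documented steps; it lists the output directory so that you can confirm something was written. With DRY_RUN=1 (the default here) the script only prints each command.

```shell
#!/bin/sh
# Sketch of the validation steps as one function.
# DRY_RUN and the function names are illustrative, not part of Mahout or HDFS.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

validate_mahout() {
  user_dir="/user/testuser"
  # Step 1: upload the sample text file to the test user's HDFS directory.
  run hdfs dfs -put /tmp/sample-test.txt "$user_dir" || return 1
  # Step 2: create the Mahout test output directory.
  run hdfs dfs -mkdir "$user_dir/mahouttest" || return 1
  # Step 3: convert the text file into a sequence file.
  run mahout seqdirectory --input "$user_dir/sample-test.txt" \
      --output "$user_dir/mahouttest" --charset utf-8 || return 1
  # Extra check (an addition to the documented steps): list the output.
  run hdfs dfs -ls "$user_dir/mahouttest"
}
```

Call validate_mahout once to review the printed commands, then set DRY_RUN=0 and call it again on a machine where hdfs and mahout are on the PATH. The function stops at the first command that fails.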
