This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Mahout Installation

Apache Mahout is a machine-learning tool. It lets you build machine-learning libraries that scale to "reasonably large" datasets, with the aim of making intelligent applications easier and faster to build.


To see which version of Mahout is shipping in CDH 5, check the Version and Packaging Information. For important information on new and changed components, see the CDH 5 Release Notes.

The main use cases for Mahout are:

  • Recommendation mining, which tries to identify things users will like on the basis of their past behavior (for example, shopping or online-content recommendations)
  • Clustering, which groups similar items (for example, documents on similar topics)
  • Classification, which learns from existing categories what members of each category have in common, and on that basis tries to categorize new items
  • Frequent item-set mining, which takes a set of item-groups (such as terms in a query session, or shopping-cart content) and identifies items that usually appear together
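
To make the last use case concrete, the following is a minimal sketch of frequent item-set mining restricted to pairs, written in plain Python. It does not use Mahout or its API; the function name `frequent_pairs`, the `min_support` threshold, and the sample baskets are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical shopping-cart data, invented for illustration.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "butter"],
]

print(frequent_pairs(baskets, min_support=2))
```

This brute-force count is only practical for small inputs held in memory. Mahout targets the same kind of question at a scale where the data no longer fits on one machine.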

If you have not already done so, install Cloudera's yum, zypper/YaST or apt repository before using the instructions below to install Mahout. For instructions, see Installing CDH 5.

Use the following sections to install, update and use Mahout:
Page generated September 3, 2015.