This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Using MapReduce with HBase

To run MapReduce jobs that use HBase, you need to add the HBase and ZooKeeper JAR files to the Hadoop Java classpath. You can do this by adding the following statement to each job:

TableMapReduceUtil.addDependencyJars(job);

This distributes the JAR files to the cluster along with your job and adds them to the job's classpath, so that you do not need to edit the MapReduce configuration.

You can find more information about addDependencyJars in the documentation listed under Viewing the HBase Documentation.

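When getting a Configuration object for an HBase MapReduce job, instantiate it using the HBaseConfiguration.create() method, which loads the HBase settings (hbase-default.xml and hbase-site.xml) on top of the Hadoop defaults.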
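As a minimal sketch of how these two calls might fit together in a job driver: the class name ExampleHBaseJob, the mapper RowCountMapper, the job name, and the table name example_table are all hypothetical and chosen only for illustration. The sketch assumes the Hadoop 2 and HBase client libraries are on the compile-time classpath and sets up a map-only job that scans a table and discards its output.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ExampleHBaseJob {

  // Hypothetical mapper: increments a job counter per row and emits nothing.
  public static class RowCountMapper
      extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable rowKey, Result row, Context context)
        throws IOException, InterruptedException {
      context.getCounter("example", "rows").increment(1);
    }
  }

  // Builds the job without submitting it, so the setup can be inspected.
  public static Job createJob(String tableName) throws IOException {
    // Use HBaseConfiguration.create() rather than new Configuration(),
    // so that the HBase settings are loaded.
    Configuration conf = HBaseConfiguration.create();

    Job job = Job.getInstance(conf, "example-hbase-job");
    job.setJarByClass(ExampleHBaseJob.class);

    // Read from the (hypothetical) table with a full scan; map-only job.
    TableMapReduceUtil.initTableMapperJob(
        tableName, new Scan(), RowCountMapper.class,
        ImmutableBytesWritable.class, Result.class, job);
    job.setOutputFormatClass(NullOutputFormat.class);
    job.setNumReduceTasks(0);

    // Ship the HBase and ZooKeeper JARs with the job, as described above.
    TableMapReduceUtil.addDependencyJars(job);
    return job;
  }

  public static void main(String[] args) throws Exception {
    Job job = createJob("example_table");
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Keeping job construction in a separate createJob method is only a design choice for this sketch; it lets the configuration be checked before the job is submitted to a cluster.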

Page generated September 3, 2015.