Configuring Accumulo

Accumulo provides example configurations that you can modify. Copy all files from one of the examples folders in /etc/accumulo/conf/examples to /etc/accumulo/conf.

For example, you would use the following command to copy all files in the /etc/accumulo/conf/examples/512MB/standalone folder to the /etc/accumulo/conf folder:

cp /etc/accumulo/conf/examples/512MB/standalone/* /etc/accumulo/conf

Accumulo has the option to use a native library that manages the memory used for newly written data outside of the Java heap for the Tablet Servers. This allows Accumulo to manage its memory more efficiently, improving performance. Use of the native library should be enabled whenever possible. To build the native library for your system, run the following on each host:

JAVA_HOME=path_to_java_home /usr/hdp/current/accumulo-client/bin/build_native_library.sh

Once this is done, various configuration properties must be changed to use the native maps, with examples appearing in the /etc/accumulo/conf/examples/native-standalone folder.

	Note
	If native maps are not enabled, the examples in the standalone folder should be used instead.
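The choice between the two kinds of example folders can be sketched as a small shell helper. This is only an illustration: the function name example_dir is hypothetical, and it assumes the native examples sit next to the standalone ones under the same size folder (for example 512MB), which you should confirm on your own system before copying anything.

```shell
# Hypothetical helper: print the example folder to copy from.
# Assumes a layout of examples/<size>/<standalone|native-standalone>.
example_dir() {
    size="$1"          # memory profile folder, e.g. 512MB
    use_native="$2"    # "yes" if the native library was built
    base="/etc/accumulo/conf/examples"
    if [ "$use_native" = "yes" ]; then
        echo "$base/$size/native-standalone"
    else
        echo "$base/$size/standalone"
    fi
}

# Print the copy command that would be run, without copying anything.
echo "cp $(example_dir 512MB yes)/* /etc/accumulo/conf"
```

Printing the command first makes it easy to check the chosen path before overwriting files in /etc/accumulo/conf.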

Make an Accumulo data directory:

su - hdfs

hadoop fs -mkdir -p /apps/accumulo

The example configuration files include an accumulo-site.xml file. Add the following property to this file to reference the Accumulo data directory:

<property>
     <name>instance.volumes</name>
     <value>hdfs://namenode:port/apps/accumulo</value>
</property>

For example:

<property>
     <name>instance.volumes</name>
     <value>hdfs://node-1.example.com:8020/apps/accumulo</value>
</property>

	Note
	Change the value of instance.secret in the accumulo-site.xml file, and then change the permissions on the file to 700 so that other users cannot read instance.secret.
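As a rough sketch of how the instance.volumes value is assembled from the NameNode host and port, the following shell function prints the property block. The function name volumes_property is hypothetical and is not part of Accumulo; the output still has to be pasted into accumulo-site.xml by hand.

```shell
# Hypothetical helper: print an instance.volumes property for the
# given NameNode host and port. It does not edit accumulo-site.xml.
volumes_property() {
    namenode="$1"
    port="$2"
    cat <<EOF
<property>
     <name>instance.volumes</name>
     <value>hdfs://${namenode}:${port}/apps/accumulo</value>
</property>
EOF
}

# Reproduces the example values used in this section.
volumes_property node-1.example.com 8020
```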

Add the configuration property instance.zookeeper.host to the accumulo-site.xml file. The value of this property should be a comma-separated list of ZooKeeper servers.

In each “host” file, every non-commented line is expected to name a host that should have a process running on it. The “masters” file lists the hosts that should run the Accumulo Master process (only one host is the active master; the rest are hot standbys), and the “slaves” file lists the hosts that should run the Accumulo TabletServer process.

For example, the instance.zookeeper.host property might be set as follows:

<property>
    <name>instance.zookeeper.host</name>
    <value>server1:2181,server2:2181,server3:2181</value>
</property>
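The comma-separated value can be built from a list of host names with a few lines of shell. This is a minimal sketch: zk_host_list is a hypothetical helper, and port 2181 is simply taken from the example in this section, so change it if your ZooKeeper servers listen elsewhere.

```shell
# Hypothetical helper: join ZooKeeper hosts into host:port,host:port,...
# Port 2181 matches the example in this section; adjust if yours differs.
zk_host_list() {
    port=2181
    list=""
    for host in "$@"; do
        # Prefix a comma only when the list already has an entry.
        list="${list:+$list,}${host}:${port}"
    done
    echo "$list"
}

zk_host_list server1 server2 server3
# prints server1:2181,server2:2181,server3:2181
```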

Change permissions to restrict access to the data directory to the Accumulo user:

su - hdfs

hadoop fs -chmod -R 700 /apps/accumulo

Change ownership of the data directory to the Accumulo user and group:

su - hdfs

hadoop fs -chown -R accumulo:accumulo /apps/accumulo
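After both changes, the directory listing should show mode 700 and the Accumulo user and group. The sketch below checks a single listing line of the kind printed by hadoop fs -ls -d /apps/accumulo. The function name check_listing is hypothetical, and the sample line is illustrative rather than captured output, so the column layout should be compared with what your Hadoop version prints.

```shell
# Hypothetical helper: succeed if a listing line shows mode drwx------
# and owner/group accumulo. Assumed columns: permissions, replication,
# owner, group, size, date, time, path.
check_listing() {
    echo "$1" | awk '{ exit !($1 == "drwx------" && $3 == "accumulo" && $4 == "accumulo") }'
}

# Illustrative sample line. On a real cluster you would instead use:
#   hadoop fs -ls -d /apps/accumulo
sample='drwx------   - accumulo accumulo          0 2015-01-01 12:00 /apps/accumulo'

if check_listing "$sample"; then
    echo "ownership and permissions look correct"
else
    echo "unexpected ownership or permissions"
fi
```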

The example configuration files also include an accumulo-env.sh file.

If JAVA_HOME is not defined in the environment, you should specify it by editing the following line of code in the accumulo-env.sh file:

test -z "$JAVA_HOME" && export JAVA_HOME=/path/to/java

If you would like to prevent users from passing JAVA_HOME on the command line, remove the text prior to "export" and add the path to your JAVA_HOME. For example:

export JAVA_HOME=/usr/hadoop-jdk1.7.0_67

If ZOOKEEPER_HOME is not defined in the environment, you should specify it by editing the following line of code in the accumulo-env.sh file:

test -z "$ZOOKEEPER_HOME" && export ZOOKEEPER_HOME=/path/to/zookeeper

If you would like to prevent users from passing ZOOKEEPER_HOME on the command line, remove the text prior to "export" and add the path to your ZOOKEEPER_HOME. For example:

export ZOOKEEPER_HOME=/usr/hdp/current/zookeeper-client/conf
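The difference between the two forms comes down to how test -z behaves. The sketch below uses a placeholder variable, DEMO_HOME, which is not an Accumulo setting, to show that the guarded form only applies its default when the variable is empty or unset.

```shell
# Case 1: the variable is unset, so test -z succeeds and the default applies.
unset DEMO_HOME
test -z "$DEMO_HOME" && export DEMO_HOME=/path/to/default
echo "$DEMO_HOME"    # prints /path/to/default

# Case 2: the variable is already set, so test -z fails and the export
# is skipped, which leaves the caller's value in place.
DEMO_HOME=/opt/custom
test -z "$DEMO_HOME" && export DEMO_HOME=/path/to/default
echo "$DEMO_HOME"    # prints /opt/custom
```

Removing the test -z guard, as described above, makes the export unconditional, so a value passed in from the environment is always overwritten.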
