Configuring Accumulo
Accumulo provides example configurations that you can modify. Copy all files from one of the examples folders in /etc/accumulo/conf/examples to /etc/accumulo/conf.
For example, you would use the following command to copy all files in the /etc/accumulo/conf/examples/512MB/standalone folder to the /etc/accumulo/conf folder:
cp /etc/accumulo/conf/examples/512MB/standalone/* /etc/accumulo/conf
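The copy step can be wrapped in a small shell function that refuses to run when the chosen example folder is missing. This is a minimal sketch, not part of Accumulo: the function name copy_example_conf is hypothetical, and both folders are passed in by the caller.

```shell
# Hypothetical helper: copy one example configuration folder into the
# live configuration folder. Both paths are supplied by the caller.
copy_example_conf() {
    src="$1"    # e.g. /etc/accumulo/conf/examples/512MB/standalone
    dest="$2"   # e.g. /etc/accumulo/conf

    # Stop early if the example folder does not exist.
    if [ ! -d "$src" ]; then
        echo "example folder not found: $src" >&2
        return 1
    fi

    # Copy every file in the example folder, as the cp command does.
    cp "$src"/* "$dest"/
}
```

Calling copy_example_conf /etc/accumulo/conf/examples/512MB/standalone /etc/accumulo/conf would perform the same copy as the cp command, with the added existence check.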
Accumulo has the option to use a native library that manages the memory used for newly written data outside of the Java heap for the Tablet Servers. This allows Accumulo to manage its memory more efficiently, improving performance. Use of the native library should be enabled whenever possible. To build the native library for your system, run the following on each host:
JAVA_HOME=path_to_java_home /usr/hdp/current/accumulo-client/bin/build_native_library.sh
Once this is done, various configuration properties must be changed to use the native maps, with examples appearing in the /etc/accumulo/conf/examples/native-standalone folder.
Note: If native maps are not enabled, the examples in the standalone folder should be used instead.
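Because the native library build has to run on each host, it can help to generate the per-host commands from a list of hosts and review them before running anything. The sketch below assumes a plain-text file with one host per line and SSH access to each host; neither assumption comes from this guide, and the function name print_native_build_commands is hypothetical. It only prints the commands and does not execute them.

```shell
# Hypothetical helper: print one ssh command per host that would run the
# native library build script there. Nothing is executed remotely.
print_native_build_commands() {
    hosts_file="$1"   # assumed format: one host per line, '#' starts a comment
    java_home="$2"    # value to use for JAVA_HOME on the remote host

    # Skip blank lines and comment lines, then emit a command per host.
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$hosts_file" |
    while read -r host; do
        echo "ssh $host JAVA_HOME=$java_home /usr/hdp/current/accumulo-client/bin/build_native_library.sh"
    done
}
```

The printed lines can be inspected and then run by hand or piped to a shell once they look right.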
Make an Accumulo data directory:
su - hdfs
hadoop fs -mkdir -p /apps/accumulo
The example configuration files include an accumulo-site.xml file.
Note: Change the value of the instance.secret in the accumulo-site.xml file, and then change the permissions on the file to 700 to protect the instance.secret from being readable by other users.
Add the following property to the accumulo-site.xml file to reference the Accumulo data directory:
<property> <name>instance.volumes</name> <value>hdfs://namenode:port/apps/accumulo</value> </property>
For example:
<property> <name>instance.volumes</name> <value>hdfs://node-1.example.com:8020/apps/accumulo</value> </property>
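The instance.volumes value is just the NameNode host and port combined with the data directory path, so the property can be generated rather than typed by hand. A minimal sketch, assuming the /apps/accumulo data directory created earlier; the function name instance_volumes_property is hypothetical.

```shell
# Hypothetical helper: print an instance.volumes property for a given
# NameNode host and port, pointing at the /apps/accumulo data directory.
instance_volumes_property() {
    namenode="$1"   # e.g. node-1.example.com
    port="$2"       # e.g. 8020
    printf '<property> <name>instance.volumes</name> <value>hdfs://%s:%s/apps/accumulo</value> </property>\n' \
        "$namenode" "$port"
}
```

Calling instance_volumes_property node-1.example.com 8020 prints the same property as the instance.volumes example.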
Add the configuration property instance.zookeeper.host to the accumulo-site.xml file. The value of this property should be a comma-separated list of ZooKeeper servers.
For example:
<property> <name>instance.zookeeper.host</name> <value>server1:2181,server2:2181,server3:2181</value> </property>
The configuration folder also contains host files. In each host file, every non-commented line is expected to name a host that should have a process running on it. The “masters” file contains the hosts that should run the Accumulo Master process (only one host can be the active master; the rest are hot standbys), and the “slaves” file contains the hosts that should run the Accumulo TabletServer process.
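The comma-separated value for instance.zookeeper.host can be assembled from a list of ZooKeeper host names and a shared client port. This is a minimal sketch that assumes every server uses the same port; the function name zookeeper_host_list is hypothetical.

```shell
# Hypothetical helper: join host names into a host:port,host:port list.
zookeeper_host_list() {
    port="$1"   # client port shared by all servers, e.g. 2181
    shift       # remaining arguments are the host names
    list=""
    for host in "$@"; do
        # Add a comma before every entry except the first.
        list="${list:+$list,}$host:$port"
    done
    echo "$list"
}
```

Calling zookeeper_host_list 2181 server1 server2 server3 prints server1:2181,server2:2181,server3:2181, which is the value used in the instance.zookeeper.host example.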
Change permissions to restrict access to the data directory to the Accumulo user:
su - hdfs
hadoop fs -chmod -R 700 /apps/accumulo
Change ownership of the data directory to the Accumulo user and group:
su - hdfs
hadoop fs -chown -R accumulo:accumulo /apps/accumulo
The example configuration files also include an accumulo-env.sh file.
If JAVA_HOME is not defined in the environment, you should specify it by editing the following line of code in the accumulo-env.sh file:
test -z "$JAVA_HOME" && export JAVA_HOME=/path/to/java
If you would like to prevent users from passing JAVA_HOME on the command line, remove the text prior to "export" and add the path to your JAVA_HOME. For example:
export JAVA_HOME=/usr/hadoop-jdk1.7.0_67
If ZOOKEEPER_HOME is not defined in the environment, you should specify it by editing the following line of code in the accumulo-env.sh file:
test -z "$ZOOKEEPER_HOME" && export ZOOKEEPER_HOME=/path/to/zookeeper
If you would like to prevent users from passing ZOOKEEPER_HOME on the command line, remove the text prior to "export" and add the path to your ZOOKEEPER_HOME. For example:
export ZOOKEEPER_HOME=/usr/hdp/current/zookeeper-client
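Both settings rely on the same shell idiom: test -z succeeds only when the variable is empty or unset, so the export after && runs only in that case. Removing the test makes the export unconditional, which overrides anything inherited from the environment. The sketch below illustrates the difference with JAVA_HOME; the two function names are hypothetical and exist only for this illustration.

```shell
# Guarded form: JAVA_HOME is set only when it is currently empty or unset,
# so a value supplied by the caller's environment is kept.
set_java_home_if_missing() {
    test -z "$JAVA_HOME" && export JAVA_HOME="$1"
    return 0   # do not report failure when the variable was already set
}

# Unguarded form: JAVA_HOME is always overwritten with the given path.
force_java_home() {
    export JAVA_HOME="$1"
}
```

The same reasoning applies to ZOOKEEPER_HOME: keep the guard if users should be able to supply their own value, and drop it if the configured path must always win.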