Configure Atlas to Use HBase
Prerequisites for Switching to HBase as the Storage Backend for the Graph Repository
Supported HBase versions: Currently, only HBase 1.1.x versions are supported.
To run HBase as a distributed cluster, you must have:
3 or 5 ZooKeeper nodes
At least 3 RegionServer nodes. Ideally, run the DataNodes on the same hosts as the RegionServers for data locality.
Clear the data in the indexing backend: Before you switch to HBase as your storage backend, you must clear the data in the indexing backend. If you do not, there might be discrepancies between the storage and indexing backends, which can result in errors during searches.
ElasticSearch runs by default in embedded mode, so you can clear the data by deleting the ATLAS_HOME/data/es directory.
To clear Solr data, delete the collections that are created during installation of Atlas: vertex_index, edge_index, and fulltext_index. This will clean up the indexing data.
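For the embedded ElasticSearch case, the cleanup can be sketched as a small shell helper. The function name clear_embedded_es_index is hypothetical, and the sketch assumes Atlas is stopped and that you pass your own ATLAS_HOME path as the argument; it only removes the data/es directory named above and does not touch Solr collections.

```shell
# Hypothetical helper (not part of Atlas): removes the embedded ElasticSearch
# index data under the given Atlas home directory.
# Assumption: Atlas is stopped before this runs.
clear_embedded_es_index() {
  atlas_home="$1"
  # Refuse to run without an explicit path, so an empty variable can never
  # expand to "/data/es".
  if [ -z "$atlas_home" ]; then
    echo "usage: clear_embedded_es_index <ATLAS_HOME>" >&2
    return 1
  fi
  es_dir="$atlas_home/data/es"
  if [ -d "$es_dir" ]; then
    rm -rf -- "$es_dir"
    echo "removed $es_dir"
  else
    echo "nothing to remove at $es_dir"
  fi
}
```

A call might look like: clear_embedded_es_index "$ATLAS_HOME"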
To configure Atlas to use HBase on an unsecured cluster:
In the graph persistence engine section of the application.properties file, set the following properties:
    #Set the backend graph persistence engine to HBase
    atlas.graph.storage.backend=hbase
    #For standalone mode, specify 'localhost' as the hostname; for distributed mode, specify the ZooKeeper quorum as the hostname
    atlas.graph.storage.hostname=<localhost|ZooKeeper_quorum>
Note: The value for ZooKeeper_quorum can be retrieved from the hbase.zookeeper.quorum property setting in the hbase-site.xml file. For example:
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>c6401.ambari.apache.org,c6402.ambari.apache.org,c6403.ambari.apache.org</value>
    </property>
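If you prefer to look the quorum up from the command line, the following is a minimal sketch. The helper name get_hadoop_property is hypothetical, and the approach is plain text matching rather than real XML parsing: it assumes each <value> element directly follows its <name> element, as in the example above.

```shell
# Hypothetical helper: prints the <value> of a named property from a
# Hadoop-style XML configuration file such as hbase-site.xml.
# Assumption: <value> directly follows <name> inside each <property>.
get_hadoop_property() {
  file="$1"
  # Escape dots so that they match literally in the sed pattern below.
  name_re=$(printf '%s' "$2" | sed 's/\./\\./g')
  # Join the file into one line so that name and value can be matched
  # together, then print only the captured value (nothing if no match).
  { tr -d '\n' < "$file"; echo; } |
    sed -n "s|.*<name>[[:space:]]*$name_re[[:space:]]*</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}
```

For example, get_hadoop_property /etc/hbase/conf/hbase-site.xml hbase.zookeeper.quorum prints the value to use for atlas.graph.storage.hostname in distributed mode, assuming hbase-site.xml is in /etc/hbase/conf.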
Add HBASE_CONF_DIR to the classpath so the Titan graph database can retrieve the client configuration settings from the hbase-site.xml file:
    export HBASE_CONF_DIR=/etc/hbase/conf
Ensure that hbase-site.xml includes a value for the zookeeper.znode.parent property. For example:
    <property>
      <name>zookeeper.znode.parent</name>
      <value>/hbase-unsecure</value>
    </property>
Restart Atlas.
    /usr/hdp/current/atlas-server/bin/atlas_stop.py
    /usr/hdp/current/atlas-server/bin/atlas_start.py
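Before restarting, it can help to confirm that the client configuration described in the steps above is in place. The following is a rough sketch, not an official Atlas tool: the function name check_hbase_client_conf is hypothetical, and the check is a literal text search that assumes the <name> element is written without extra whitespace.

```shell
# Hypothetical pre-restart check: verifies that the HBase client configuration
# directory is known, that hbase-site.xml is readable, and that it defines
# zookeeper.znode.parent.
check_hbase_client_conf() {
  # Use the first argument if given, otherwise fall back to HBASE_CONF_DIR.
  conf_dir="${1:-${HBASE_CONF_DIR:-}}"
  if [ -z "$conf_dir" ]; then
    echo "HBASE_CONF_DIR is not set" >&2
    return 1
  fi
  site="$conf_dir/hbase-site.xml"
  if [ ! -r "$site" ]; then
    echo "cannot read $site" >&2
    return 1
  fi
  # -F treats the pattern as a fixed string, so the dots are literal.
  if ! grep -Fq '<name>zookeeper.znode.parent</name>' "$site"; then
    echo "zookeeper.znode.parent is missing in $site" >&2
    return 1
  fi
  echo "ok: $site"
}
```

Called with no argument, the function uses the exported HBASE_CONF_DIR value; a non-zero exit status means one of the three checks failed.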
To configure Atlas to use HBase on a cluster secured by Kerberos:
Prerequisite: Setting Permissions
When Atlas is configured to use HBase as the storage backend, the Titan graph database
needs sufficient user permissions to create and access an HBase table. In a secure cluster, it might be
necessary to grant permissions to the atlas
user for the Titan table.
You can use Apache Ranger to configure a policy for the Titan database. See Create an HDFS Policy in the Hadoop Security Guide.
Make a note of the name you gave to this repository; you will need to use it again during HDFS plug-in setup.
If you do not use Apache Ranger, the HBase shell can be used to set the permissions:
    su hbase
    kinit -k -t <HBase_keytab> <HBase_principal>
    echo "grant 'atlas', 'RWXCA', 'titan'" | hbase shell
Create an Atlas JAAS (Java Authentication and Authorization Service) configuration file at the following location:
/etc/atlas/conf/atlas-jaas.conf
Add configurations to the atlas-jaas.conf file that point to your keyTab and principal. For example:
    Client {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      useTicketCache=false
      keyTab="/etc/security/keytabs/atlas.service.keytab"
      principal="atlas/c6401.ambari.apache.org@EXAMPLE.COM";
    };
In the Atlas environment variable METADATA_OPTS, set java.security.auth.login.config to point to the JAAS configuration file that you created in Step 1.
Update the hbase-site.xml file to include the following configurations:
    <property>
      <name>zookeeper.znode.parent</name>
      <value>/hbase-secure</value>
    </property>
    <property>
      <name>hbase.security.authentication</name>
      <value>kerberos</value>
    </property>
    <property>
      <name>hbase.security.authorization</name>
      <value>true</value>
    </property>
    <property>
      <name>hbase.master.kerberos.principal</name>
      <value>hbase/_HOST@EXAMPLE.COM</value>
    </property>
    <property>
      <name>hbase.master.keytab.file</name>
      <value>/etc/security/keytabs/hbase.service.keytab</value>
    </property>
    <property>
      <name>hbase.regionserver.kerberos.principal</name>
      <value>hbase/_HOST@EXAMPLE.COM</value>
    </property>
    <property>
      <name>hbase.regionserver.keytab.file</name>
      <value>/etc/security/keytabs/hbase.service.keytab</value>
    </property>
Add HBASE_CONF_DIR to the classpath so the Titan graph database can retrieve the client configuration settings from the hbase-site.xml file:
    export HBASE_CONF_DIR=/etc/hbase/conf
Create hbase_master_jaas.conf and hbase_regionserver_jaas.conf configuration files at the following locations:
    /usr/hdp/current/hbase-regionserver/conf/hbase_master_jaas.conf
    /usr/hdp/current/hbase-regionserver/conf/hbase_regionserver_jaas.conf
Add configurations into these files that point to your keyTab and principal. For example:
    Client {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      storeKey=true
      useTicketCache=false
      keyTab="/etc/security/keytabs/hbase.service.keytab"
      principal="hbase/c6401.ambari.apache.org@EXAMPLE.COM";
    };
Update the hbase-env.sh script so that the environment variables HBASE_MASTER_OPTS and HBASE_REGIONSERVER_OPTS include java.security.auth.login.config for the jaas.conf files that you created in Step 1 and Step 5. For example:
    export HBASE_MASTER_OPTS="-Djava.security.auth.login.config=/usr/hdp/current/hbase-regionserver/conf/hbase_master_jaas.conf …"
    export HBASE_REGIONSERVER_OPTS="-Djava.security.auth.login.config=/usr/hdp/current/hbase-regionserver/conf/hbase_regionserver_jaas.conf …"
Restart the services.