Command Line Installation

Configure Atlas to Use Apache HBase

Prerequisites for Switching to HBase as the Storage Backend for the Graph Repository

  • Supported HBase versions: Currently, only HBase 1.1.x versions are supported.

  • To run HBase as a distributed cluster, you must have:

    • 3 or 5 ZooKeeper nodes

    • At least 3 RegionServer nodes. Ideally, run the DataNodes on the same hosts as the RegionServers for data locality.

  • Clear the data in the indexing backend: Before you switch to HBase as your storage backend, you must clear the data in the indexing backend. If you do not, there might be discrepancies between the storage and indexing backends, which can result in errors during searches.

    Elasticsearch runs in embedded mode by default, so you can clear its data by deleting the ATLAS_HOME/data/es directory.

    To clear Solr data, delete the collections that are created during installation of Atlas: vertex_index, edge_index, and fulltext_index. This cleans up the indexing data.
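
The cleanup described above can be sketched in shell. This is a minimal sketch, not an official Atlas tool: the function names clear_embedded_es_index and solr_delete_urls are hypothetical, it assumes Atlas is stopped before the directory is removed, and it assumes the Solr collections are deleted through the Solr Collections API at a base URL that you supply. The Solr helper only prints the URLs so that you can review them before issuing any request.

```shell
#!/usr/bin/env bash
# Sketch: clear indexing-backend data before switching the storage backend.
# Assumptions (not from the guide): Atlas is stopped, and the function names
# below are illustrative helpers rather than part of Atlas.

# Remove the embedded Elasticsearch data directory, ATLAS_HOME/data/es.
clear_embedded_es_index() {
    local atlas_home="$1"
    local es_dir="${atlas_home}/data/es"

    # Refuse an empty argument so that /data/es is never targeted by accident.
    if [ -z "$atlas_home" ]; then
        echo "usage: clear_embedded_es_index <ATLAS_HOME>" >&2
        return 2
    fi

    if [ -d "$es_dir" ]; then
        rm -rf -- "$es_dir"
        echo "removed ${es_dir}"
    else
        echo "nothing to remove at ${es_dir}"
    fi
}

# Print (do not call) one Collections API DELETE URL per Atlas collection.
# Assumption: solr_base looks like http://<solr_host>:<port>/solr
solr_delete_urls() {
    local solr_base="$1"
    local name
    for name in vertex_index edge_index fulltext_index; do
        echo "${solr_base}/admin/collections?action=DELETE&name=${name}"
    done
}
```

For example, clear_embedded_es_index "$ATLAS_HOME" removes the embedded index data, and solr_delete_urls with your own Solr base URL lists the three requests to review.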

To configure Atlas to use HBase on an unsecured cluster:

  1. In the graph persistence engine section of the application.properties file, set the following properties:

    #Set the backend graph persistence engine to HBase
    atlas.graph.storage.backend=hbase
    
    #For standalone mode, specify 'localhost' as the hostname; for distributed mode, specify the ZooKeeper quorum as the hostname
    atlas.graph.storage.hostname=<localhost|ZooKeeper_quorum>

    Note

    The value for ZooKeeper_quorum can be retrieved from the hbase.zookeeper.quorum property setting in the hbase-site.xml file. For example:

    <property>
         <name>hbase.zookeeper.quorum</name>
         <value>c6401.ambari.apache.org,c6402.ambari.apache.org,c6403.ambari.apache.org</value>
    </property>
  2. Add HBASE_CONF_DIR to the classpath so the Titan graph database can retrieve the client configuration settings from the hbase-site.xml file:

    export HBASE_CONF_DIR=/etc/hbase/conf
  3. Ensure that hbase-site.xml includes a value for the zookeeper.znode.parent property. For example:

    <property>
         <name>zookeeper.znode.parent</name>
         <value>/hbase-unsecure</value>
    </property>
  4. Restart Atlas.

    /usr/hdp/current/atlas-server/bin/atlas_stop.py
    /usr/hdp/current/atlas-server/bin/atlas_start.py
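
As an illustration of step 1 and its note, the following sketch reads hbase.zookeeper.quorum from hbase-site.xml and prints the two property lines for the distributed case. The function names are hypothetical, and the parsing is deliberately simple: it assumes that the name and value elements sit on separate lines, as in the example above, so treat it as a convenience rather than a general XML parser.

```shell
#!/usr/bin/env bash
# Sketch: derive the Atlas storage settings from hbase-site.xml.
# Assumption: <name> and <value> are on separate lines, as in the example.
# The function names are illustrative, not part of Atlas or HBase.

# Print the value of hbase.zookeeper.quorum from the file given as $1.
zk_quorum_from_hbase_site() {
    awk '
        /<name>hbase\.zookeeper\.quorum<\/name>/ { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")      # drop everything up to the opening tag
            sub(/<\/value>.*/, "")    # drop the closing tag and what follows
            print
            exit
        }
    ' "$1"
}

# Print the lines to place in the graph persistence engine section.
print_atlas_hbase_properties() {
    local quorum
    quorum="$(zk_quorum_from_hbase_site "$1")"
    if [ -z "$quorum" ]; then
        echo "hbase.zookeeper.quorum not found in $1" >&2
        return 1
    fi
    echo "atlas.graph.storage.backend=hbase"
    echo "atlas.graph.storage.hostname=${quorum}"
}
```

Running print_atlas_hbase_properties /etc/hbase/conf/hbase-site.xml prints the two lines for review; the sketch does not edit application.properties itself.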

To configure Atlas to use HBase on a cluster secured by Kerberos:

Prerequisite: Setting Permissions

When Atlas is configured to use HBase as the storage backend, the Titan graph database needs sufficient user permissions to create and access an HBase table. In a secure cluster, it might be necessary to grant permissions to the atlas user for the Titan table.

You can use Apache Ranger to configure a policy for the Titan database. See Configuring Resource-Based Services in the Hadoop Security Guide.

If you do not use Apache Ranger, the HBase shell can be used to set the permissions:

    su hbase
    kinit -k -t <HBase_keytab> <HBase_principal>
    echo "grant 'atlas', 'RWXCA', 'titan'" | hbase shell

  1. Create an Atlas JAAS (Java Authentication and Authorization Service) configuration file at the following location:

    /etc/atlas/conf/atlas-jaas.conf

    Add entries to the atlas-jaas.conf file that point to your keytab and principal. For example:

    Client {
       com.sun.security.auth.module.Krb5LoginModule required
       useKeyTab=true
       useTicketCache=false
       keyTab="/etc/security/keytabs/atlas.service.keytab"
       principal="atlas/c6401.ambari.apache.org@EXAMPLE.COM";
    };
  2. In the Atlas environment variable METADATA_OPTS, set java.security.auth.login.config to point to the JAAS configuration file that you created in Step 1.

  3. Update the hbase-site.xml file to include the following configurations:

    <property>
         <name>zookeeper.znode.parent</name>
         <value>/hbase-secure</value>
    </property>
    
    <property>
         <name>hbase.security.authentication</name>
         <value>kerberos</value>
    </property>
    
    <property>
         <name>hbase.security.authorization</name>
         <value>true</value>
    </property>
    
    <property>
         <name>hbase.master.kerberos.principal</name>
         <value>hbase/_HOST@EXAMPLE.COM</value>
    </property>
    
    <property>
         <name>hbase.master.keytab.file</name>
         <value>/etc/security/keytabs/hbase.service.keytab</value>
    </property>
    
    <property>
         <name>hbase.regionserver.kerberos.principal</name>
         <value>hbase/_HOST@EXAMPLE.COM</value>
    </property>
    
    <property>
         <name>hbase.regionserver.keytab.file</name>
         <value>/etc/security/keytabs/hbase.service.keytab</value>
    </property>
  4. Add HBASE_CONF_DIR to the classpath so the Titan graph database can retrieve the client configuration settings from the hbase-site.xml file:

    export HBASE_CONF_DIR=/etc/hbase/conf
  5. Create hbase_master_jaas.conf and hbase_regionserver_jaas.conf configuration files at the following locations:

    /usr/hdp/current/hbase-regionserver/conf/hbase_master_jaas.conf
    /usr/hdp/current/hbase-regionserver/conf/hbase_regionserver_jaas.conf

    Add entries to these files that point to your keytab and principal. For example:

    Client {
       com.sun.security.auth.module.Krb5LoginModule required
       useKeyTab=true
       storeKey=true
       useTicketCache=false
       keyTab="/etc/security/keytabs/hbase.service.keytab"
       principal="hbase/c6401.ambari.apache.org@EXAMPLE.COM";
    };
  6. Update the hbase-env.sh script so that the environment variables HBASE_MASTER_OPTS and HBASE_REGIONSERVER_OPTS set java.security.auth.login.config to the JAAS configuration files that you created in Step 5.

    For example:

    export HBASE_MASTER_OPTS="-Djava.security.auth.login.config=/usr/hdp/current/hbase-regionserver/conf/hbase_master_jaas.conf …"

    export HBASE_REGIONSERVER_OPTS="-Djava.security.auth.login.config=/usr/hdp/current/hbase-regionserver/conf/hbase_regionserver_jaas.conf …"
  7. Restart the HBase and Atlas services.
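
Step 2 of this procedure gives no example, so here is a minimal sketch of one way to set METADATA_OPTS. The guide does not name the file in which Atlas environment variables are set, so where this snippet belongs is an assumption about your installation. The helper name jaas_opt and the variable ATLAS_JAAS_CONF are hypothetical. The JAAS path is the one from Step 1.

```shell
#!/usr/bin/env bash
# Sketch of step 2: point the Atlas JVM at the JAAS file from step 1.
# Assumption: this runs wherever your installation sets Atlas environment
# variables; the guide does not say which file that is.

# Build the JVM option for a given JAAS configuration file path.
jaas_opt() {
    echo "-Djava.security.auth.login.config=$1"
}

# Path taken from step 1 of the procedure.
ATLAS_JAAS_CONF="/etc/atlas/conf/atlas-jaas.conf"

# Append to any existing METADATA_OPTS instead of overwriting them.
METADATA_OPTS="${METADATA_OPTS:+${METADATA_OPTS} }$(jaas_opt "$ATLAS_JAAS_CONF")"
export METADATA_OPTS

# Warn early if the file cannot be read; the option would then point at nothing.
if [ ! -r "$ATLAS_JAAS_CONF" ]; then
    echo "warning: ${ATLAS_JAAS_CONF} is not readable" >&2
fi
```

Appending rather than assigning keeps any JVM options that are already present in METADATA_OPTS.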