Writing to a Secure HBase Cluster

Before you write to a secure HBase cluster, be aware of the following:

  • Both Flume and HBase must be configured to use Kerberos security. Configure HBase as documented in HBase Security Configuration.
  • The hbase-site.xml file must be configured to use Kerberos security and must be on Flume's classpath or in HBASE_HOME/conf.
  • HBase2Sink (org.apache.flume.sink.hbase2.HBase2Sink) supports secure HBase.
  • The Flume HBase sink takes the kerberosPrincipal and kerberosKeytab parameters:
    • kerberosPrincipal – specifies the Kerberos principal to be used
    • kerberosKeytab – specifies the path to the Kerberos keytab
    • Define them in the agent configuration, for example:
      agent.sinks.hbase2Sink.kerberosPrincipal = flume/fully.qualified.domain.name@YOUR-REALM.COM
      agent.sinks.hbase2Sink.kerberosKeytab = /etc/flume-ng/conf/flume.keytab
    • Instead of hard-coding these values, you can use the $KERBEROS_PRINCIPAL and $KERBEROS_KEYTAB substitution variables for the principal name and the keytab file path. For the configuration steps, see Use Substitution Variables for the Kerberos Principal and Keytab.
  • If HBase is running with the AccessController coprocessor, the flume user (or whichever user the agent runs as) must have write permission on the table and column family that the sink is configured to write to. You can grant permissions with the grant command in the HBase shell, as explained in HBase Security Configuration.
  • The Flume HBase sink does not currently support impersonation; it writes to HBase as the user that the agent runs as.
  • To write to both HDFS and HBase from the same agent, using an HDFS sink and an HBase sink, both sinks must use the same principal and keytab. To use different credentials, run the sinks on different agents.
  • Each Flume agent machine that writes to HBase (using a configured HBase sink) needs a Kerberos principal of the form:
    flume/fully.qualified.domain.name@YOUR-REALM.COM

    where fully.qualified.domain.name is the fully qualified domain name of the given Flume agent host machine, and YOUR-REALM.COM is the Kerberos realm.
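
The points above can be pulled together in a single agent configuration. The following is a minimal sketch, not a complete or recommended setup: the channel names (hbaseChannel, hdfsChannel), the table name (flume_events), the column family (cf), and the HDFS path are hypothetical placeholders, and the source definitions are omitted. It shows an HBase sink and an HDFS sink on the same agent sharing one principal and keytab, as required when both run in the same agent.

```properties
# Illustrative sketch only. hbaseChannel, hdfsChannel, flume_events, cf, and
# /flume/events are hypothetical names; sources are intentionally omitted.
agent.channels = hbaseChannel hdfsChannel
agent.sinks = hbase2Sink hdfsSink

agent.channels.hbaseChannel.type = memory
agent.channels.hdfsChannel.type = memory

# HBase sink: authenticates with the agent's Kerberos principal and keytab.
# The agent user needs write permission on this table and column family
# when the AccessController coprocessor is enabled.
agent.sinks.hbase2Sink.type = hbase2
agent.sinks.hbase2Sink.channel = hbaseChannel
agent.sinks.hbase2Sink.table = flume_events
agent.sinks.hbase2Sink.columnFamily = cf
agent.sinks.hbase2Sink.kerberosPrincipal = flume/fully.qualified.domain.name@YOUR-REALM.COM
agent.sinks.hbase2Sink.kerberosKeytab = /etc/flume-ng/conf/flume.keytab

# HDFS sink on the same agent: reuses the same principal and keytab.
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = hdfsChannel
agent.sinks.hdfsSink.hdfs.path = /flume/events
agent.sinks.hdfsSink.hdfs.kerberosPrincipal = flume/fully.qualified.domain.name@YOUR-REALM.COM
agent.sinks.hdfsSink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume.keytab
```

Note that the HDFS sink takes its Kerberos settings under the hdfs. prefix, while the HBase sink takes them directly on the sink. Each sink reads from its own channel in this sketch so that neither sink consumes events intended for the other.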