Configure a Data Engineering Cloudera Data Hub cluster for use with Cloudera Operational Database

You can configure Hive in your Cloudera Data Engineering Cloudera Data Hub cluster to interact with the Cloudera Operational Database. With this integration, you can create and modify HBase tables using Hive. You can also read from and write to existing HBase tables.

Ensure that you have a Cloudera Operational Database database instance and a Data Engineering template-based Cloudera Data Hub cluster in the same Cloudera environment.
  1. Create a Data Engineering Cloudera Data Hub cluster in the same Cloudera environment in which your Cloudera Operational Database is launched.
  2. Download the HBase client configuration zip archive from the Cloudera Operational Database client connectivity page. For more information, see Client connectivity information. Then extract hbase-site.xml from the zip file.
  3. On each node in the Data Engineering Cloudera Data Hub cluster that runs HiveServer2, run the mkdir /etc/hbase/cod-conf command, and copy the hbase-site.xml file from Step 2 into the /etc/hbase/cod-conf folder.
    Do not copy all the files from the client configuration zip archive. Copy only the hbase-site.xml file.
    1. Get the value of ssl.client.truststore.password from the ssl-client.xml file (/etc/hadoop/conf.cloudera.hdfs/ssl-client.xml) on the master node of the Data Engineering Cloudera Data Hub cluster, and set it as the value of the hbase.zookeeper.property.ssl.trustStore.password property in the copied hbase-site.xml file.
    2. Create a symbolic link between /opt/cloudera/parcels/CDH/lib/hbase/conf and /etc/hbase/cod-conf in the Data Engineering Cloudera Data Hub cluster using the following command.
      sudo ln -s /etc/hbase/cod-conf /opt/cloudera/parcels/CDH/lib/hbase/conf
  4. In Cloudera Manager for the Cloudera Data Engineering Cloudera Data Hub cluster, go to the Hive-on-Tez service and, in the HiveServer2 Environment Advanced Configuration Snippet (hive_hs2_env_safety_valve), set HADOOP_CLASSPATH to /etc/hbase/cod-conf.
    1. Click Save Configurations.
    2. Restart the Hive-on-Tez service.
  5. Go to the Ranger service in the Data Lake.
    1. Find the HBase Ranger policy cod_[***Cloudera Operational Database DATABASE NAME***]_hbase. For example, if your Cloudera Operational Database database name is CODDB, you must modify the Ranger policy cod_CODDB_hbase.
    2. Add the user hive to an existing policy, for example, all - table, column-family, column.
    3. Wait a few minutes, because Ranger policy changes are not synced to all resources immediately.
  6. Interact with Hive using the hive command from a node in the Data Engineering Cloudera Data Hub cluster.
    $ hive
    hive> create table test(key int, value string) stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties ("hbase.columns.mapping" = ":key,f1:val");
    hive> insert into test values(1, 'a');
    hive> select * from test;
    
  7. Validate that the data is also present in HBase. You can run the following commands from a Data Engineering Cloudera Data Hub node, where the hbase executable is available. Otherwise, set up the HBase client tarball from the Cloudera Operational Database client connectivity page.
    $ HBASE_CONF_DIR=/etc/hbase/cod-conf hbase shell
    hbase> scan 'test'
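
The truststore password update in step 3 can be scripted instead of edited by hand. The following is a minimal sketch, not an official Cloudera tool: get_prop and set_prop are hypothetical helper names, and the code assumes both files use the usual Hadoop layout, with each <name> element followed by its <value> element inside a <property> block and no self-closing <value/> elements. The value is copied verbatim, so any XML escaping present in ssl-client.xml is preserved.

```shell
# get_prop FILE NAME
# Print the <value> text of the Hadoop-style property NAME in FILE.
get_prop() {
  local file=$1 name=$2
  # Flatten the file to one line so <name> and <value> can sit on separate lines.
  tr -d '\n' < "$file" \
    | grep -o "<name>${name}</name>[[:space:]]*<value>[^<]*</value>" \
    | head -n 1 \
    | sed -e 's/.*<value>//' -e 's/<\/value>//'
}

# set_prop FILE NAME VALUE
# Replace the <value> text of an existing property NAME in FILE with VALUE.
# Name and value are passed through the environment so that awk does not
# interpret backslashes or ampersands in them.
set_prop() {
  local file=$1 name=$2 value=$3 tmp
  tmp=$(mktemp)
  PROP_NAME="$name" PROP_VALUE="$value" awk '
    BEGIN { tag = "<name>" ENVIRON["PROP_NAME"] "</name>" }
    {
      line = $0
      start = 1
      p = index(line, tag)
      if (p > 0) { pending = 1; start = p + length(tag) }
      if (pending) {
        rest = substr(line, start)
        o = index(rest, "<value>")
        c = index(rest, "</value>")
        if (o > 0 && c > o) {
          # Keep everything up to and including <value>, swap in the new text.
          line = substr(line, 1, start - 1) substr(rest, 1, o + 6) ENVIRON["PROP_VALUE"] substr(rest, c)
          pending = 0
        } else if (index(rest, "</property>") > 0) {
          pending = 0
        }
      }
      print line
    }
  ' "$file" > "$tmp"
  # Overwrite in place so the original file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example usage on a HiveServer2 node (paths taken from the steps above;
# run with sufficient privileges to read and write these files):
#   pw=$(get_prop /etc/hadoop/conf.cloudera.hdfs/ssl-client.xml ssl.client.truststore.password)
#   set_prop /etc/hbase/cod-conf/hbase-site.xml hbase.zookeeper.property.ssl.trustStore.password "$pw"
```

set_prop only changes a property that already exists in the file; if hbase.zookeeper.property.ssl.trustStore.password is missing from hbase-site.xml, the file is left unchanged, so it is worth reading the value back with get_prop afterwards.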