Reading and writing Hive tables in Zeppelin

You can read and write Hive ACID tables from a Spark application using Zeppelin, a browser-based GUI for interactive data exploration, modeling, and visualization.

You must be running spark application and have all the appropriate permissions to read the data from the hive warehouse directory for managed (ACID) tables.
  1. Open the livy configuration file on the livy node and configure HWC Spark Direct Reader mode.
    livy.spark.hadoop.hive.metastore.uris thrift://
    livy.spark.jars local:/opt/cloudera/parcels/CDH-7.2.1-1.cdh7.2.1.p0.4847773/jars/hive-warehouse-connector-assembly-<version>.jar
    livy.spark.kryo.registrator com.qubole.spark.hiveacid.util.HiveAcidKyroRegistrator true
    livy.spark.sql.extensions com.hortonworks.spark.sql.rule.Extensions
    livy.spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://;retries=5;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
    livy.spark.sql.hive.hiveserver2.jdbc.url.principal hive/_HOST@ROOT.HWX.SITE
  2. In a Zeppelin notebook, read a Hive ACID table.
    sql("show tables").show
    sql("select * from hwc2").show                   
  3. Perform a batch write from Spark.
    val hive = com.hortonworks.hwc.HiveWarehouseSession.session(spark).build()
    val df = Seq(
            (1, "a", 1.1),
            (2, "b", 2.2),
            (3, "c", 3.3)
        ).toDF("col1", "col2", "col3")
    val tableName = "t3"
    hive.dropTable(tableName, true, true)
    df.write.format("com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector").mode("append").option("table", tableName).save()
  4. Read data from table t3.
    sql("select * from t3").show
    Output is:
    tableName: String = t3
    |   1|   a| 1.1|
    |   2|   b| 2.2|
    |   3|   c| 3.3|