Read and write Hive tables in Zeppelin

You can read and write Hive ACID tables from a Spark application using Zeppelin, a browser-based GUI for interactive data exploration, modeling, and visualization.

You must be running spark application and have all the appropriate permissions to read the data from the hive warehouse directory for managed (ACID) tables.
  1. Open the livy configuration file on the livy node and configure HWC Spark Direct Reader mode.
    livy.spark.hadoop.hive.metastore.uris - thrift://
    livy.spark.jars - local:/opt/cloudera/parcels/CDH-7.2.1-1.cdh7.2.1.p0.4847773/jars/hive-warehouse-connector-assembly-
    livy.spark.kryo.registrator - com.qubole.spark.hiveacid.util.HiveAcidKyroRegistrator	- true
    livy.spark.sql.extensions - com.qubole.spark.hiveacid.HiveAcidAutoConvertExtension
    livy.spark.sql.hive.hiveserver2.jdbc.url - jdbc:hive2://;retries=5;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
    livy.spark.sql.hive.hiveserver2.jdbc.url.principal - hive/_HOST@ROOT.HWX.SITE
  2. In a Zeppelin notebook, read a Hive ACID table.
    sql("show tables").show
    sql("select * from hwc2").show                   
  3. Perform a batch write from Spark.
    val hive = com.hortonworks.hwc.HiveWarehouseSession.session(spark).build()
    val df = Seq(1, "a", 1.1), (2, "b", 2.2), (3, "c", 3.3).toDF("col1", "col2", "col3")
    val tableName = "t3"
    hive.dropTable(tableName, true, true)
  4. Read data from table t3.
    sql("select * from t3").show
    Output is:
    tableName: String = t3
    |   1|   a| 1.1|
    |   2|   b| 2.2|
    |   3|   c| 3.3|