Integrating Apache Hive with Apache Spark and BI

Submit a Scala or Java application

The following procedure shows how to submit an application based on the HiveWarehouseConnector library to run in the Apache Spark shell.

  1. Choose an execution mode for your application, for example the HWC JDBC execution mode, and check that you meet the configuration requirements described earlier.
  2. Configure a Spark-HiveServer connection as described earlier, or include the appropriate --conf options in your app submission in step 4.
  3. Locate the hive-warehouse-connector-assembly jar in the /hive_warehouse_connector/ directory.
    For example, find hive-warehouse-connector-assembly-<version>.jar in the following location:
    /opt/cloudera/parcels/CDH/jars  
  4. Add the connector jar and configurations to the app submission using the --jars option.
    Example syntax:
    spark-shell --jars <path to jars>/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar \
    --conf <configuration properties>
  5. Add the path to the app you wrote based on the HiveWarehouseConnector API.
    Example syntax:
    <path to app>
    For example:
    spark-shell --jars /opt/cloudera/parcels/CDH/jars/hive-warehouse-connector-assembly-<version>.jar \
    --conf spark.sql.hive.hwc.execution.mode=spark \
    --conf spark.datasource.hive.warehouse.read.via.llap=false \
    --conf spark.datasource.hive.warehouse.load.staging.dir=<path to directory> \
    /home/myapps/myapp.jar                        
    PySpark and spark-submit are also supported.
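
The steps above can also be sketched as a small dry-run script for spark-submit. This is a minimal sketch under stated assumptions, not a definitive procedure: it only prints the command for review and does not start Spark. The --conf values are copied from the example above. The main class com.example.MyApp, the use of --class, and the helper names find_hwc_jar and build_submit_cmd are hypothetical and chosen for illustration. The <version> and <path to directory> placeholders are left unresolved on purpose.

```shell
#!/bin/sh
# Sketch only: prints a spark-submit command for review; it does not run Spark.

# Assumption: default directory taken from the example location in step 3.
JARS_DIR="${JARS_DIR:-/opt/cloudera/parcels/CDH/jars}"

# Print the first connector assembly jar found in the given directory.
# If none exists on this machine, print the placeholder name instead.
find_hwc_jar() {
  for candidate in "$1"/hive-warehouse-connector-assembly-*.jar; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  printf '%s\n' "$1/hive-warehouse-connector-assembly-<version>.jar"
}

# Print the command, one option per line, with continuation backslashes.
# Arguments: connector jar, staging directory, main class, application jar.
build_submit_cmd() {
  hwc_jar="$1"
  staging_dir="$2"
  main_class="$3"
  app_jar="$4"
  cat <<EOF
spark-submit --class ${main_class} \\
  --jars ${hwc_jar} \\
  --conf spark.sql.hive.hwc.execution.mode=spark \\
  --conf spark.datasource.hive.warehouse.read.via.llap=false \\
  --conf spark.datasource.hive.warehouse.load.staging.dir=${staging_dir} \\
  ${app_jar}
EOF
}

HWC_JAR=$(find_hwc_jar "$JARS_DIR")
# com.example.MyApp is a hypothetical main class; replace it with your own.
build_submit_cmd "$HWC_JAR" "<path to directory>" "com.example.MyApp" "/home/myapps/myapp.jar"
```

Printing the command first lets you check the jar path and the configuration properties before you submit the application. Replace the placeholders with real values before running the printed command.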