Submitting a Python app

A step-by-step procedure shows you how to submit a Python app based on the HiveWarehouseConnector library by submitting an application, and then adding a Python package.

  1. Choose a read option, for example LLAP, for your application and check that you meet the configuration requirements, described earlier.
  2. Configure a Spark-HiveServer connection, described earlier or, in your app submission include the appropriate --conf in step 4.
  3. Locate the hive-warehouse-connector-assembly jar in the /hive_warehouse_connector/ directory.
    For example, find hive-warehouse-connector-assembly-<version>.jar in the following location:
    /opt/cloudera/parcels/CDH/jars  
  4. Add the connector jar and configurations to the app submission using the --jars option.
    Example syntax:
    pyspark --jars <path to jars>/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar \
    --conf <configuration properties>
  5. Locate the pyspark_hwc zip package in the /hive_warehouse_connector/ directory.
  6. Add the Python package for the connector to the app submission.
    Example syntax:
    --py-files <path>/hive_warehouse_connector/pyspark_hwc-<version>.zip
    Example submission in JDBC execution mode:
    pyspark --jars /opt/cloudera/parcels/CDH/jars/hive-warehouse-connector-assembly-<version>.jar
    --conf spark.sql.extensions=com.hortonworks.spark.sql.rule.Extensions \
    --conf spark.datasource.hive.warehouse.read.mode=JDBC_CLUSTER \
    --conf spark.datasource.hive.warehouse.load.staging.dir=<path to directory> \
    --py-files /opt/cloudera/parcels/CDH/lib/hive_warehouse_connector/pyspark_hwc-<version>.zip