Submitting a Python app
A step-by-step procedure shows you how to submit a Python app based on the HiveWarehouseConnector library by submitting an application, and then adding a Python package.
- Choose a read option, for example LLAP, for your application and check that you meet the configuration requirements, described earlier.
- Configure a Spark-HiveServer connection, described earlier or, in your app submission include the appropriate --conf in step 4.
-
Locate the hive-warehouse-connector-assembly jar in the
/hive_warehouse_connector/
directory.For example, findhive-warehouse-connector-assembly-<version>.jar
in the following location:/opt/cloudera/parcels/CDH/jars
-
Add the connector jar and configurations to the app submission using the
--jars
option.Example syntax:pyspark --jars <path to jars>/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar \ --conf <configuration properties>
-
Locate the pyspark_hwc zip package in the
/hive_warehouse_connector/
directory. -
Add the Python package for the connector to the app
submission.
Example syntax:
Example submission in JDBC execution mode:--py-files <path>/hive_warehouse_connector/pyspark_hwc-<version>.zip
pyspark --jars /opt/cloudera/parcels/CDH/jars/hive-warehouse-connector-assembly-<version>.jar --conf spark.sql.extensions=com.hortonworks.spark.sql.rule.Extensions \ --conf spark.datasource.hive.warehouse.read.mode=JDBC_CLUSTER \ --conf spark.datasource.hive.warehouse.load.staging.dir=<path to directory> \ --py-files /opt/cloudera/parcels/CDH/lib/hive_warehouse_connector/pyspark_hwc-<version>.zip