Writing data through HWC
A step-by-step procedure walks you through connecting to HiveServer (HS2) to perform batch writes from Spark, which is recommended for production. You configure HWC for the managed table write, launch the Spark session, and write ACID managed tables to Apache Hive.
Before you begin:
- Accept the default spark.datasource.hive.warehouse.load.staging.dir for the temporary staging location required by HWC.
- Check that spark.hadoop.hive.zookeeper.quorum is configured.
- Set Kerberos configurations for HWC, or for an unsecured cluster, set spark.security.credentials.hiveserver2.enabled=false (see the configuration sketch after this list).
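The sketch below shows how these properties could look when building the Spark session, assuming an unsecured cluster. In practice these properties are usually supplied at launch time (for example, via --conf on spark-shell or spark-submit) rather than on the builder; the HS2 JDBC URL property, host names, and staging path here are placeholders, and the HWC jar is assumed to be on the classpath so the com.hortonworks.hwc.HiveWarehouseSession entry point resolves.

```scala
import org.apache.spark.sql.SparkSession
import com.hortonworks.hwc.HiveWarehouseSession

// Sketch only: placeholder URL, quorum, and staging path; these
// properties are normally passed at launch time via --conf.
val spark = SparkSession.builder()
  .appName("hwc-write-example")
  // HS2 endpoint that HWC connects to over JDBC.
  .config("spark.sql.hive.hiveserver2.jdbc.url",
          "jdbc:hive2://hs2-host:10000/default")
  // ZooKeeper quorum, as required by the checklist above.
  .config("spark.hadoop.hive.zookeeper.quorum",
          "zk1:2181,zk2:2181,zk3:2181")
  // Temporary staging location required by HWC (default accepted here).
  .config("spark.datasource.hive.warehouse.load.staging.dir", "/tmp")
  // Unsecured cluster: disable HS2 credential handling. On a secured
  // cluster, set the Kerberos configurations for HWC instead.
  .config("spark.security.credentials.hiveserver2.enabled", "false")
  .getOrCreate()

// Build the HWC session used for Hive reads and writes.
val hive = HiveWarehouseSession.session(spark).build()
```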
HWC writes the data to the staging location (spark.datasource.hive.warehouse.load.staging.dir) from Spark, followed by executing a "LOAD DATA" query in Hive via JDBC. Exception: writing to dynamic partitions creates an intermediate temporary external table. Using HWC to write data is recommended for production in CDP.
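A minimal sketch of the batch write itself, continuing from the session above: the DataFrame df and the target table name sales are illustrative, and the target is assumed to be a managed ACID Hive table. HWC stages the rows under the configured staging directory and then issues the LOAD DATA statement to Hive over JDBC, as described above.

```scala
import com.hortonworks.hwc.HiveWarehouseSession

// Illustrative data; "spark" is the session built in the earlier sketch.
val df = spark.range(3).selectExpr("id", "id * 2 AS amount")

// Batch write through the HWC data source.
df.write
  .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
  .mode("append")             // append to the ACID table; "overwrite" also works
  .option("table", "sales")   // illustrative managed Hive table name
  .save()
```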