Integrating Apache Hive with Spark and Kafka
Hive Warehouse Connector for accessing Apache Spark data
Set up
HWC limitations
Reading data through HWC
Direct Reader mode introduction
Using Direct Reader mode
Direct Reader configuration properties
Direct Reader limitations
Secure access mode introduction
Setting up secure access mode in Data Hub
Using secure access mode
Configuring caching for secure access mode
JDBC read mode introduction
Using JDBC read mode
JDBC mode configuration properties
JDBC mode limitations
Kerberos configurations for HWC
Writing data through HWC
Apache Spark executor task statistics
HWC and DataFrame APIs
HWC and DataFrame API limitations
HWC supported types mapping
Catalog operations
Read and write operations
Committing a transaction for Direct Reader
Closing HiveWarehouseSession operations
Using HWC for streaming
HWC API examples
Hive Warehouse Connector interfaces
Submitting a Scala or Java application
Examples of writing data in various file formats
HWC integration with pyspark, sparklyr, and Zeppelin
Submitting a Python app
Reading and writing Hive tables in R
Livy interpreter configuration
Reading and writing Hive tables in Zeppelin
Apache Hive-Kafka integration
Creating a table for a Kafka stream
Querying Kafka data
Querying live data from Kafka
Performing ETL by ingesting data from Kafka into Hive
Writing data to Kafka
Writing transformed Hive data to Kafka
Setting consumer and producer table properties
Kafka storage handler and table properties