HWC changes from HDP to CDP
You need to understand the Hive Warehouse Connector (HWC) changes from HDP to CDP before updating your HWC code to run on CDP. Extensive HWC documentation can prepare you for this update. In CDP, the methods and the configuration of HWC connections differ from HDP.
The HWC interface is simplified in CDP, resulting in the convergence of the executeQuery methods into the sql method. The executeQuery methods are deprecated and will be removed from CDP in a future release. Historical calls to executeQuery used the JDBC connection and were limited to 1000 records. The 1000-record limitation does not apply to the sql method, although JDBC cluster mode is recommended in production only for workloads having a data size of 1 GB or less. Larger workloads are not recommended for JDBC reads in production due to slow performance.
Although the old methods are still supported in CDP for backward compatibility, refactoring your code to use the sql method for all configurations (JDBC client, Direct Reader V1 or V2, and Secure Access modes) is highly recommended.
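The recommended refactoring can be sketched as follows. This is a minimal illustration, not a complete program: the table name is a placeholder, `spark` is the session provided by spark-shell, and running it requires a cluster with the HWC library on the classpath.

```scala
import com.hortonworks.hwc.HiveWarehouseSession

// Build an HWC session from the active SparkSession
// (assumes the HWC configurations are already set on `spark`).
val hive = HiveWarehouseSession.session(spark).build()

// HDP-era call: deprecated in CDP, used the JDBC connection,
// and was limited to 1000 records.
// val df = hive.executeQuery("SELECT * FROM sales")

// CDP call: the converged sql method, with no 1000-record cap.
val df = hive.sql("SELECT * FROM sales")
df.show()
```

The same one-line change applies in each read mode; only the connection configuration differs.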
Recommended method refactoring
| API | From HDP | To CDP | HDP Example | CDP Example |
| --- | --- | --- | --- | --- |
| HWC sql API | executeQuery() | sql() | hive.executeQuery("select * from t") | hive.sql("select * from t") |
| Spark sql API | sql() | sql() | spark.sql("select * from t") | spark.sql("select * from t") |
Deprecated and changed configurations
HWC read configuration is simplified in CDP. You use a common configuration for Spark Direct Reader, JDBC Cluster, or Secure Access mode.
Recommended configuration refactoring
Refactor configuration code to remove unsupported configurations and use the common read configuration instead. You can then transparently read data from Spark with HWC in different modes by changing just the read mode setting.
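As a sketch of the common configuration, the CDP read mode is selected with a single property. The property name and values below reflect the CDP HWC settings as best recalled here; verify them against the documentation for your release.

```
# One common property selects the HWC read mode in CDP:
spark.datasource.hive.warehouse.read.mode=DIRECT_READER_V2   # Spark Direct Reader
# spark.datasource.hive.warehouse.read.mode=JDBC_CLUSTER     # JDBC cluster mode
# spark.datasource.hive.warehouse.read.mode=SECURE_ACCESS    # Secure Access mode
```

Switching modes then requires no changes to the read code itself.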
Secured cluster configurations
The JDBC URL must not contain the JDBC URL principal; the principal must be passed in a separate configuration property.
- Catalog browsing
- JDBC client mode configuration
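The secured-cluster split between the JDBC URL and the principal can be sketched as follows. The host name and Kerberos principal are placeholders, and the property names assume the standard HWC settings (spark.sql.hive.hiveserver2.jdbc.url and spark.sql.hive.hiveserver2.jdbc.url.principal); confirm them for your release.

```
spark-shell \
  --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://hs2-host:10000/" \
  --conf spark.sql.hive.hiveserver2.jdbc.url.principal="hive/_HOST@EXAMPLE.COM"
```

Note that the URL itself carries no principal segment; the principal travels only in its own property.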