Hive warehouse directory
By default, Hive stores table files in the Hive warehouse directory.
Hive 1 and 2
The warehouse directory path is /user/hive/warehouse (CDH) or /apps/hive/warehouse (HDP).
Hive 3
The warehouse directory path for managed tables is /warehouse/tablespace/managed/hive, and the default location for external tables is /warehouse/tablespace/external/hive.
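The following is a minimal sketch of how an application might check whether a table location falls under one of the Hive 3 default directories listed above. The function and constant names are hypothetical, and the check assumes the cluster uses the default paths; a cluster configured with different warehouse locations would need different roots.

```python
from typing import Optional
from urllib.parse import urlparse

# Hive 3 default warehouse roots, as stated in this section.
HIVE3_MANAGED_ROOT = "/warehouse/tablespace/managed/hive"
HIVE3_EXTERNAL_ROOT = "/warehouse/tablespace/external/hive"


def _is_under(path: str, root: str) -> bool:
    """Return True if path is the root itself or a descendant of it."""
    return path == root or path.startswith(root + "/")


def default_hive3_table_kind(location: str) -> Optional[str]:
    """Classify a table location against the Hive 3 default directories.

    Accepts a plain path or a URI such as hdfs://host/path (hypothetical
    helper, not part of any Hive or Spark API). Returns "managed",
    "external", or None when the location is outside both default roots.
    """
    path = urlparse(location).path.rstrip("/")
    if _is_under(path, HIVE3_MANAGED_ROOT):
        return "managed"
    if _is_under(path, HIVE3_EXTERNAL_ROOT):
        return "external"
    return None
```

For example, `default_hive3_table_kind("/warehouse/tablespace/managed/hive/sales.db/orders")` returns `"managed"`, while a Hive 1/2 style location such as `/user/hive/warehouse/orders` returns `None`.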
Action Required
ACID tables reside by default in /warehouse/tablespace/managed/hive, and only the Hive service can own and interact with files in this directory. Applications that access managed tables directly therefore need to change their behavior in Hive 3.
As an example, consider the changes required for a Spark application. As shown in the table
below, if a Hive 1/2 managed table is converted to an external table in Hive 3 during the upgrade,
the Spark application requires no refactoring. If the Hive 1/2 managed table is converted to a
managed table in Hive 3, the Spark application must be refactored to use the Hive Warehouse
Connector (HWC) or the HWC Spark Direct Reader.
Hive Table before upgrade in Hive1/2 | Hive Table after upgrade in Hive3 | Spark Refactoring |
---|---|---|
Managed | External | None |
Managed | Managed | Use HWC or HWC Spark Direct Reader |
External | External | None |
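The table above can be encoded as a small lookup, which may be useful when auditing many tables during upgrade planning. This is only an illustrative sketch: the names `REFACTORING` and `spark_refactoring` are hypothetical, and combinations that the table does not list are rejected rather than guessed.

```python
from typing import Dict, Optional, Tuple

# (table type before upgrade, table type after upgrade) -> Spark refactoring.
# None means no refactoring is required, matching "None" in the table.
REFACTORING: Dict[Tuple[str, str], Optional[str]] = {
    ("managed", "external"): None,
    ("managed", "managed"): "Use HWC or HWC Spark Direct Reader",
    ("external", "external"): None,
}


def spark_refactoring(before: str, after: str) -> Optional[str]:
    """Return the Spark refactoring needed for a table type transition.

    Raises ValueError for a transition that the table does not cover.
    """
    key = (before.strip().lower(), after.strip().lower())
    if key not in REFACTORING:
        raise ValueError(f"transition not covered by the table: {before} -> {after}")
    return REFACTORING[key]
```

Calling `spark_refactoring("Managed", "Managed")` returns the HWC recommendation, and `spark_refactoring("Managed", "External")` returns `None`.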
Distribution Affected
CDH5, CDH6, HDP2