Automate partition discovery and repair
Hive automatically and periodically discovers discrepancies between partition metadata in the Hive metastore and the corresponding directories on the file system, and then synchronizes the two. Automating this operation is especially helpful for log data and for data in Spark and Hive catalogs.
The discover.partitions table property enables and disables synchronization of the file system with partitions. In external partitioned tables, this property is enabled (true) by default when you create the table using Hive in HDP 3.1.4 and later. To enable partition discovery on a legacy external table (created using an earlier version of Hive), add discover.partitions to the table properties. By default, the discovery and synchronization of partitions occurs every 5 minutes, but you can configure the frequency as shown in this task.
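As a minimal sketch of enabling discovery on a legacy external table, the statements below set the property and then read it back. The table name legacy_logs is hypothetical and stands in for your own external partitioned table.

```sql
-- legacy_logs is a hypothetical external partitioned table created
-- with an earlier version of Hive.
ALTER TABLE legacy_logs SET TBLPROPERTIES ('discover.partitions' = 'true');

-- Read the property back to confirm that it is set.
SHOW TBLPROPERTIES legacy_logs ('discover.partitions');
```

Setting the value to 'false' in the same statement would disable discovery for that table.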