Automate partition discovery and repair
Hive automatically and periodically discovers discrepancies between the partition metadata in the Hive metastore and the corresponding directories or objects on the file system, and then synchronizes the two. This operation is often automated when processing log data or other data in Spark and Hive catalogs.
The discover.partitions table property enables and disables synchronization of the file system with partitions. In external partitioned tables, this property is enabled (true) by default when you create the table. For a legacy external table (created using a version of Hive that does not support this feature), you need to add discover.partitions to the table properties to enable partition discovery.
By default, the discovery and synchronization of partitions occurs every 5 minutes, but you can configure the frequency as shown in this task.
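To see whether discovery is enabled for a given table, you can inspect the property directly. The following is a minimal sketch: the table name web_logs, its columns, and the location /data/web_logs are hypothetical and used only for illustration. The value reported depends on your Hive version and distribution.

```sql
-- Hypothetical external partitioned table, for illustration only.
CREATE EXTERNAL TABLE web_logs (line STRING)
PARTITIONED BY (dt STRING)
LOCATION '/data/web_logs';

-- Show the current value of the discover.partitions property for this table.
SHOW TBLPROPERTIES web_logs('discover.partitions');
```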
Assuming you have an external table created using a version of Hive that does not support partition discovery, enable partition discovery for the table.
ALTER TABLE exttbl SET TBLPROPERTIES ('discover.partitions' = 'true');
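If you do not want to wait for the next scheduled run, a one-time manual synchronization is a possible alternative. This sketch assumes a Hive version that supports the SYNC PARTITIONS option of MSCK REPAIR TABLE (Hive 3.0 and later). It is not part of the automated discovery described in this task.

```sql
-- One-time manual repair: add partitions found on the file system and
-- drop metastore partitions whose directories no longer exist.
MSCK REPAIR TABLE exttbl SYNC PARTITIONS;

-- List the partitions now registered in the metastore.
SHOW PARTITIONS exttbl;
```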
Set synchronization of partitions to occur every 10 minutes expressed in