Automate partition discovery and repair
Automated partition discovery and repair is useful for processing log data, and other data, in Spark and Hive catalogs. You learn how to set the partition discovery parameter to suit your use case. An aggressive partition discovery and repair configuration can delay the upgrade process.
Apache Hive can automatically and periodically discover discrepancies between partition metadata in the Hive metastore and the corresponding directories, or objects, on the file system. After discovering discrepancies, Hive performs synchronization.
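One way to picture the discrepancy check is as a set comparison between the partitions recorded in the metastore and the partition directories found on the file system. The following is a minimal conceptual sketch, not Hive's actual implementation; the function name plan_partition_sync and the sample partition names are hypothetical and used for illustration only.

```python
def plan_partition_sync(metastore_partitions, filesystem_partitions):
    """Return (to_add, to_drop): the changes that would make the
    metastore partition list match the directories on the file system."""
    metastore = set(metastore_partitions)
    filesystem = set(filesystem_partitions)
    # Directories that exist on the file system but are unknown to the metastore.
    to_add = sorted(filesystem - metastore)
    # Metastore entries whose directories no longer exist on the file system.
    to_drop = sorted(metastore - filesystem)
    return to_add, to_drop


# Hypothetical sample data: one stale partition, one new directory.
metastore = ["dt=2024-01-01", "dt=2024-01-02"]
filesystem = ["dt=2024-01-02", "dt=2024-01-03"]
to_add, to_drop = plan_partition_sync(metastore, filesystem)
print("add:", to_add)
print("drop:", to_drop)
```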
The discover.partitions table property enables and disables synchronization of the file system with partitions. In external partitioned tables, this property is enabled (true) by default when you create the table. For a legacy external table (created using a version of Hive that does not support this feature), you need to add discover.partitions to the table properties to enable partition discovery.
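Adding the property to a legacy table can be done with a standard HiveQL ALTER TABLE ... SET TBLPROPERTIES statement. The sketch below only builds that statement as a string so that it can be reviewed or passed to a Hive client of your choice; the helper name discover_partitions_statement and the table name legacy_logs are hypothetical, and the identifier check is a simplifying assumption rather than Hive's full naming rules.

```python
import re

# Simplified check: a plain table name or a db.table name (an assumption
# for this sketch, not the complete set of identifiers Hive accepts).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def discover_partitions_statement(table, enabled=True):
    """Build the HiveQL statement that sets discover.partitions on a table."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    value = "true" if enabled else "false"
    return (
        f"ALTER TABLE {table} "
        f"SET TBLPROPERTIES ('discover.partitions' = '{value}');"
    )


# Hypothetical legacy external table.
statement = discover_partitions_statement("legacy_logs")
print(statement)
```

Passing enabled=False produces the statement that turns partition discovery off for the same table.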