Repair partitions using MSCK repair

The MSCK REPAIR TABLE command was designed to manually add partitions that are added to or removed from the file system, such S3, but are not present in the Hive metastore.

This task assumes you created a partitioned external table named emp_part that stores partitions outside the warehouse. You remove one of the partition directories on the file system. This action renders the metastore inconsistent with the file system. You repair the discrepancy manually to synchronize the metastore with the file system, HDFS for example.

  1. Remove the dept=sales object from S3.
  2. In Data Analytics Studio (DAS) or in the Hive shell, look at the emp_part table partitions.
    SHOW PARTITIONS emp_part;
    The list of partitions is stale; it still includes the dept=sales directory.
    +----------------+
    |   partition    |
    +----------------+
    | dept=finance   |
    | dept=sales     |
    | dept=service   |
    +----------------+
  3. Repair the partition manually.
    MSCK REPAIR TABLE emp_part DROP PARTITIONS;