How can partition metadata replication be improved when the Hive tables use several
Hive partitions?
Hive metadata replication process takes a long time to complete when the Hive
tables use several Hive partitions. This is because the Hive partition
parameters are compared during the import stage of the partition metadata
replication process and if the exported and existing partition parameters do not
match, the partition is dropped and recreated. You can configure a key-value
pair to support partition metadata replication.
-
Go to the tab.
-
Search for the Hive Replication Environment Advanced
Configuration Snippet (Safety Valve) property.
-
Enter the
HIVE_IGNORED_PARTITION_PARAMETERS=[***COMMA
SEPARATED LIST OF HIVE PARTITION
PARAMETERS***] key-value pair.
For example,
HIVE_IGNORED_PARTITION_PARAMETERS=transient_lastDdlTime,totalSize,numRows,COLUMN_STATS_ACCURATE,numFiles
The
partition parameter names you provide are not compared during the
import stage of the partition metadata replication process.
Therefore, even if the partition parameters do not match between the
exported and existing partitions, the partition is not dropped or
recreated. After you configure this key-value pair, the import stage
of the partition metadata replication process completes
faster.
-
Save the changes, and restart the Hive service.