Automatic metadata management
The Impala automatic metadata management feature uses Hive metastore notifications to refresh metadata after changes.
When tools such as Hive and Spark are used to process raw data
ingested into Hive tables, new Hive metastore metadata (in the form of
databases, tables, and partitions) and file system metadata (in the form of
new files in existing partitions and tables) are generated. In previous
versions of Impala, users had to manually issue an INVALIDATE METADATA or
REFRESH command to pick up this new information. Automatic metadata
management is a new feature that removes this manual step for changes made
through the Hive metastore.
Automatic metadata management works by polling Hive metastore notification
events. Depending on the event it receives, it performs one of the following actions:
- Invalidates tables when it receives an ALTER TABLE event.
- Refreshes the partition when it receives an ALTER TABLE ADD | DROP PARTITION event.
- Adds the tables or databases when it receives a CREATE TABLE or CREATE DATABASE event.
- Removes tables from the Catalog (catalogd) when it receives a DROP TABLE or DROP DATABASE event.
- Refreshes the table and its partitions when it receives INSERT events.
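The event handling in the list above can be pictured as a simple dispatch on event type. The following is a minimal Python sketch of that idea, not catalogd's actual implementation: the class `CatalogSketch`, the state strings, and the event labels (`CREATE_TABLE`, `ALTER_TABLE`, and so on) are hypothetical names chosen for illustration, and the mapping of labels to actions is an assumption based on the list.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CatalogSketch:
    """Toy stand-in for a metadata cache: table name -> state string."""

    tables: Dict[str, str] = field(default_factory=dict)

    def apply(self, event_type: str, table: str) -> None:
        # Event labels are hypothetical; they only mirror the actions
        # described in the list (add, remove, invalidate, refresh).
        if event_type == "CREATE_TABLE":
            self.tables[table] = "loaded"
        elif event_type == "DROP_TABLE":
            self.tables.pop(table, None)
        elif event_type == "ALTER_TABLE":
            if table in self.tables:
                self.tables[table] = "invalidated"
        elif event_type in ("ADD_PARTITION", "DROP_PARTITION", "INSERT"):
            if table in self.tables:
                self.tables[table] = "refreshed"
        # Any other event type is ignored in this sketch.
```

For example, applying `CREATE_TABLE` and then `ALTER_TABLE` to the same table name leaves that entry in the `invalidated` state, and a later `ADD_PARTITION` moves it to `refreshed`.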
When there are database-level changes, the following types of changes are supported:
- Database properties
- Comments on the database
- Owner of the database
- Default location of the database
To control this feature, use the --hms_event_polling_interval_s flag on
catalogd. Setting the flag to a positive value enables Hive metastore event
polling, with the value giving the polling interval in seconds. The
recommended value for this flag is under 5 seconds.
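To make the "positive value enables polling" rule concrete, here is a minimal Python sketch of an interval-driven polling loop. It is an illustration under stated assumptions, not how catalogd is implemented: the function `poll_events` and its parameters `fetch`, `handle`, `max_rounds`, and `sleep` are hypothetical, and the loop is bounded only so that the example terminates.

```python
import time
from typing import Callable, Iterable


def poll_events(
    fetch: Callable[[], Iterable[str]],
    handle: Callable[[str], None],
    interval_s: int,
    max_rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for events every interval_s seconds; return how many were handled."""
    if interval_s <= 0:
        # A non-positive interval means polling is disabled.
        return 0
    handled = 0
    for _ in range(max_rounds):
        for event in fetch():  # hypothetical source of new notifications
            handle(event)
            handled += 1
        sleep(interval_s)
    return handled
```

A shorter interval means changes are picked up sooner at the cost of more frequent requests to the metastore, which is the trade-off behind keeping the value small.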
The automatic metadata management feature does not support files or partitions that have been manually added to HDFS by using Spark, DistCp, or HDFS Put because no Hive metastore notification is generated for these operations. To handle these scenarios, use one of the following methods:
- Use the LOAD DATA command to handle the metadata changes, or
- Run a REFRESH [db_name.]table_name [PARTITION… command or an ALTER TABLE table_name RECOVER PARTITIONS command.
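When files are added outside the Hive metastore, a job can issue the follow-up statement itself once the copy finishes. The Python sketch below only builds the statement text; the helper names `refresh_statement` and `recover_partitions_statement` are hypothetical, the `PARTITION (col=value, ...)` form is an assumption about the partition clause, and values must be passed as ready-made SQL literals because the sketch does no quoting or escaping.

```python
from typing import Mapping, Optional


def refresh_statement(
    table: str,
    db: Optional[str] = None,
    partition: Optional[Mapping[str, str]] = None,
) -> str:
    """Build a REFRESH statement, optionally limited to one partition."""
    name = f"{db}.{table}" if db else table
    stmt = f"REFRESH {name}"
    if partition:
        # Values are inserted verbatim, so pass SQL literals such as "'emea'".
        spec = ", ".join(f"{col}={lit}" for col, lit in partition.items())
        stmt += f" PARTITION ({spec})"
    return stmt + ";"


def recover_partitions_statement(table: str) -> str:
    """Build an ALTER TABLE ... RECOVER PARTITIONS statement."""
    return f"ALTER TABLE {table} RECOVER PARTITIONS;"
```

The resulting strings can then be submitted through whichever Impala client the job already uses. Because the helpers concatenate their inputs directly, they should only be given trusted table and column names.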
If you need to disable automatic metadata management for certain tables or databases, set the following properties:
CREATE DATABASE <db_name> DBPROPERTIES ('impala.disableHmsSync'='true');
CREATE TABLE <tab_name> WITH TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');