On-demand Metadata

With the on-demand metadata feature, the Impala coordinators pull metadata as needed from catalogd and cache it locally. The cached metadata gets evicted automatically under memory pressure.

Coordinators fetch on-demand metadata from catalogd at partition granularity. As a result, common operations such as adding or dropping partitions do not trigger unnecessary serialization and deserialization of large metadata.

The feature can be used in either of the following modes.
Metadata on-demand mode
In this mode, all coordinators use on-demand metadata.
Set the following on catalogd:
--catalog_topic_mode=minimal
Set the following on all impalad coordinators:
--use_local_catalog=true
Mixed mode
In this mode, only some coordinators are enabled to use on-demand metadata.
We recommend that you use mixed mode only for testing the local catalog’s impact on heap usage.
Set the following on catalogd:
--catalog_topic_mode=mixed
Set the following on the impalad coordinators that use on-demand metadata:
--use_local_catalog=true
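The two modes above can be summarized as a small lookup. The following is a minimal sketch, not part of Impala: the function `startup_flags`, the mode names `on_demand` and `mixed`, and the role names are hypothetical and chosen for illustration. Only the flag strings themselves come from the text above.

```python
# Hypothetical helper: maps the modes described above to startup flags.
# Only the flag strings come from the documentation; all names are illustrative.

CATALOGD_FLAGS = {
    "on_demand": ["--catalog_topic_mode=minimal"],
    "mixed": ["--catalog_topic_mode=mixed"],
}


def startup_flags(mode, role, on_demand=True):
    """Return the flags named in the text for a daemon role in a given mode."""
    if mode not in CATALOGD_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    if role == "catalogd":
        return list(CATALOGD_FLAGS[mode])
    if role == "coordinator":
        if mode == "on_demand" and not on_demand:
            # In metadata on-demand mode, every coordinator uses on-demand metadata.
            raise ValueError("all coordinators must use on-demand metadata in this mode")
        # The text names no extra flag for mixed-mode coordinators that do not
        # use on-demand metadata, so none is returned for them here.
        return ["--use_local_catalog=true"] if on_demand else []
    raise ValueError(f"unknown role: {role!r}")


if __name__ == "__main__":
    for mode in ("on_demand", "mixed"):
        print(mode, "catalogd:", startup_flags(mode, "catalogd"))
        print(mode, "coordinator:", startup_flags(mode, "coordinator"))
```

A deployment script could call such a helper once per daemon and append the result to that daemon's existing startup options.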
Limitation:

HDFS caching is not supported on coordinators that run in on-demand metadata mode.

INVALIDATE METADATA Usage Notes:

Through "automatic invalidation" or "HMS event polling" support, Impala automatically picks up most changes in metadata from the underlying systems. However, there are some scenarios where you might need to run INVALIDATE METADATA or REFRESH:
  • when HMS event polling does not detect changes,
  • if you manually disable "HMS event polling",
  • for the list of other cases when a Global INVALIDATE METADATA is recommended.
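For reference, the statements mentioned above take the forms `INVALIDATE METADATA` (global), `INVALIDATE METADATA db.table`, and `REFRESH db.table`. The sketch below only builds the statement text. The function `metadata_statement` is hypothetical, and its identifier check is a deliberately conservative assumption rather than Impala's full identifier rules.

```python
import re

# Conservative, illustrative identifier check (an assumption, not Impala's full rules).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def metadata_statement(kind, table=None, database=None):
    """Build the text of an INVALIDATE METADATA or REFRESH statement."""
    for name in (table, database):
        if name is not None and not _IDENT.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if database is not None and table is None:
        raise ValueError("a database name requires a table name")
    target = f"{database}.{table}" if database else table
    if kind == "invalidate":
        # With no table, this is the global form of the statement.
        return f"INVALIDATE METADATA {target}" if target else "INVALIDATE METADATA"
    if kind == "refresh":
        if target is None:
            raise ValueError("REFRESH requires a table name")
        return f"REFRESH {target}"
    raise ValueError(f"unknown kind: {kind!r}")


if __name__ == "__main__":
    print(metadata_statement("invalidate"))
    print(metadata_statement("invalidate", "sales", "warehouse"))
    print(metadata_statement("refresh", "sales", "warehouse"))
```

The table and database names `sales` and `warehouse` are placeholders. The resulting text could be submitted through any Impala client.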