Configuring Iceberg manifest caching in Impala Virtual Warehouse
Apache Iceberg provides a mechanism to cache the contents of Iceberg manifest files in memory. In Cloudera Data Warehouse, you can enable or disable Iceberg manifest caching for Impala Coordinators and Catalogd, and set a few other properties, in your Impala Virtual Warehouse.
The following default properties are set in Cloudera Data Warehouse for manifest caching:
iceberg.io.manifest.cache-enabled=true;
iceberg.io.manifest.cache.max-total-bytes=104857600;
iceberg.io.manifest.cache.expiration-interval-ms=3600000;
iceberg.io.manifest.cache.max-content-length=8388608;
The following list describes each property:
- iceberg.io.manifest.cache-enabled: Enables or disables the manifest caching feature.
- iceberg.io.manifest.cache.max-total-bytes: Maximum total number of bytes to cache in the manifest cache. Must be a positive value.
- iceberg.io.manifest.cache.expiration-interval-ms: Maximum duration, in milliseconds, for which an entry stays in the manifest cache. Must be a non-negative value. A value of zero means that a cache entry expires only when it is evicted due to memory pressure from iceberg.io.manifest.cache.max-total-bytes.
- iceberg.io.manifest.cache.max-content-length: Maximum length, in bytes, of a manifest file to be considered for caching. Manifest files whose length exceeds this value are not cached. Must be a positive value that is lower than iceberg.io.manifest.cache.max-total-bytes.
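The constraints described for each property can be checked before you apply a change. The following is a minimal sketch, not part of Cloudera Data Warehouse or Iceberg: the `validate_manifest_cache` helper and the `DEFAULTS` dictionary are hypothetical names introduced for illustration, and only the property names, default values, and rules are taken from the text above.

```python
# Hypothetical helper: checks a set of manifest cache properties against
# the rules stated in the property descriptions above.
PREFIX = "iceberg.io.manifest."

# Default values as listed in this topic.
DEFAULTS = {
    PREFIX + "cache-enabled": "true",
    PREFIX + "cache.max-total-bytes": "104857600",
    PREFIX + "cache.expiration-interval-ms": "3600000",
    PREFIX + "cache.max-content-length": "8388608",
}


def validate_manifest_cache(props):
    """Return a list of rule violations; an empty list means none were found."""
    errors = []

    enabled = props[PREFIX + "cache-enabled"].strip().lower()
    if enabled not in ("true", "false"):
        errors.append("cache-enabled must be true or false")

    total_bytes = int(props[PREFIX + "cache.max-total-bytes"])
    if total_bytes <= 0:
        errors.append("cache.max-total-bytes must be positive")

    expiration_ms = int(props[PREFIX + "cache.expiration-interval-ms"])
    if expiration_ms < 0:
        errors.append("cache.expiration-interval-ms must be non-negative")

    content_length = int(props[PREFIX + "cache.max-content-length"])
    if content_length <= 0:
        errors.append("cache.max-content-length must be positive")
    elif content_length >= total_bytes:
        errors.append(
            "cache.max-content-length must be lower than cache.max-total-bytes"
        )

    return errors


if __name__ == "__main__":
    # The documented defaults satisfy every rule, so this prints an empty list.
    print(validate_manifest_cache(DEFAULTS))
```

With the defaults, the total cache size is 100 MiB and the per-file limit is 8 MiB, so the last rule holds with a wide margin. Raising the per-file limit to or above the total size is the kind of mistake this check is meant to catch.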
Generally, you set different expiration intervals for catalogd and the coordinator. The expiration interval is longer in catalogd, for example, 1 week. Catalogd needs to cache manifests for a longer period because the catalogd service serves table metadata.
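Because the expiration interval is expressed in milliseconds, it can help to derive the value from a readable duration rather than typing it by hand. The sketch below assumes nothing beyond the Python standard library; the `to_millis` helper and the variable names are hypothetical, and the one-week figure is only the example mentioned above, not a recommended setting.

```python
from datetime import timedelta


def to_millis(duration):
    """Convert a timedelta to a whole number of milliseconds."""
    return duration // timedelta(milliseconds=1)


# Default expiration interval listed in this topic: 1 hour.
coordinator_interval_ms = to_millis(timedelta(hours=1))

# Example of a longer interval for catalogd: 1 week.
catalogd_interval_ms = to_millis(timedelta(weeks=1))

print("coordinator:", coordinator_interval_ms)
print("catalogd:", catalogd_interval_ms)
```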
Changing configuration parameters in Cloudera Data Warehouse is recommended only when following Cloudera instructions.