Iceberg

You must be aware of the known issues and limitations, the areas of impact, and the workarounds for Iceberg in 7.3.1.100.

CDPD-78134: CBO fails when a materialized view is dropped but its pre-compiled plan remains in the registry.

Consider a cluster with two HiveServer2 (HS2) instances. Each HS2 instance maintains its own Materialized View (MV) registry, which holds the pre-compiled plans of MVs that are enabled for query rewriting. Without the registries, MVs would have to be loaded and compiled during each query compilation, which slows query performance.

When MVs are created or dropped, they are added to or removed from the registry of the HS2 instance that issues the create or drop statement. The other HS2 instance is not immediately notified of the change. A background process is scheduled to refresh the registry; however, this process does not handle the removal of dropped MVs.

When an MV is dropped by one of the HS2 instances, it remains in the registry of the other HS2 instance. Now, if a query is processed in the second HS2 instance, the rewrite algorithm still attempts to use the dropped MV. If this MV is stored in an Iceberg table, the storage handler tries to refresh the MV metadata from the metastore but throws an exception because the MV no longer exists, resulting in a CBO failure.
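The sequence described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class and method names (`Metastore`, `HiveServer`, `background_refresh`, `rewrite_query`) and the MV name `mv_sales` are hypothetical and do not correspond to Hive APIs. The model also assumes the MV is stored in an Iceberg table, so any stale registry entry fails when its metadata is refreshed.

```python
# Toy model only: all names here are hypothetical, not Hive classes or APIs.

class Metastore:
    """Stands in for the Hive Metastore, the source of truth for existing MVs."""
    def __init__(self):
        self.views = set()


class HiveServer:
    """Stands in for one HS2 instance with its own MV registry."""
    def __init__(self, name, metastore):
        self.name = name
        self.metastore = metastore
        self.registry = {}  # MV name -> pre-compiled plan (a plain string here)

    def create_mv(self, mv):
        # Only the issuing instance updates its registry immediately.
        self.metastore.views.add(mv)
        self.registry[mv] = f"plan({mv})"

    def drop_mv(self, mv):
        # Only the issuing instance removes the MV from its registry.
        self.metastore.views.discard(mv)
        self.registry.pop(mv, None)

    def background_refresh(self):
        # Mirrors the described gap: new MVs are added, dropped MVs are not removed.
        for mv in self.metastore.views:
            self.registry.setdefault(mv, f"plan({mv})")

    def rewrite_query(self):
        # Assumes every MV is Iceberg-backed: a cached MV that is missing from
        # the metastore fails on metadata refresh, modeled as LookupError.
        for mv in self.registry:
            if mv not in self.metastore.views:
                raise LookupError(f"{self.name}: MV {mv!r} no longer exists")
        return sorted(self.registry)


metastore = Metastore()
hs2_a = HiveServer("hs2_a", metastore)
hs2_b = HiveServer("hs2_b", metastore)

hs2_a.create_mv("mv_sales")
hs2_b.background_refresh()   # hs2_b picks up mv_sales
hs2_a.drop_mv("mv_sales")    # removed from the metastore and from hs2_a only
hs2_b.background_refresh()   # does not evict the dropped MV from hs2_b

try:
    hs2_b.rewrite_query()
    outcome = "ok"
except LookupError as err:
    outcome = f"failed: {err}"
print(outcome)
```

In this model, the instance that issued the drop continues to compile queries normally, while the other instance keeps failing until its registry is rebuilt. Rebuilding the registry is what restarting the HS2 instances achieves.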

Perform one of the following workarounds to address the issue:

Restart all the HS2 instances after dropping the MV.
From Cloudera Manager, go to Clusters > Hive > Configuration and add the hive.server2.materializedviews.registry.impl=DUMMY property in the HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml. The DUMMY value indicates that MVs should not be cached and requests should be forwarded to Hive Metastore.
Note: The DUMMY value is intended for testing purposes, and setting it can greatly increase the query compilation time.
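If the safety valve is edited as XML rather than as a name and value pair, the property from the second workaround would typically take the standard hive-site.xml form shown below. This is a sketch: only the property name and the DUMMY value come from the workaround above, and the surrounding layout is the generic Hadoop configuration property format.

```xml
<property>
  <name>hive.server2.materializedviews.registry.impl</name>
  <value>DUMMY</value>
</property>
```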

Apache JIRA: HIVE-28773

CDPD-78381: Performance degradation noticed in some Hive Iceberg TPC-DS queries

During Hive TPC-DS (Parquet + Iceberg) performance benchmarking of Cloudera Runtime 7.3.1.100, the overall performance of Iceberg tables improved by 15.68% compared to Iceberg tables in Cloudera Runtime 7.3.1.0. However, some individual queries ran slower than before.

Workaround: None.