Iceberg-related known issues in CDW Private Cloud

This topic describes the Iceberg-related known issues in Cloudera Data Warehouse (CDW) Private Cloud.

Known issues identified in 1.5.3

No new known issues identified in 1.5.3.

Known issues identified in 1.5.2

DWX-16591: Concurrent merge and update Iceberg queries are failing
You may see concurrent Iceberg MERGE and UPDATE queries fail with the following error in the Hive application logs: "Base metadata location hdfs://<Location-A> is not same as the current table metadata location '<Location-B>' for default.merge_insert_target_iceberg org.apache.iceberg.exceptions.CommitFailedException". This happens when two concurrent queries, say Query A and Query B, have overlapping updates. For example, if Query A commits its data and delete files first, Query B fails with a validation failure due to the conflicting writes. In this case, Query B should invalidate the commit files that were already generated and re-execute the full query on the latest snapshot.
Workaround: None.
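For illustration, overlapping writes of the following kind can trigger this error. This is a hypothetical sketch: the connection URL, the staging table, and the column names are assumptions, and only the target table name is taken from the error message above.

    # Session 1: a MERGE that commits its data and delete files first
    beeline -u "jdbc:hive2://<hs2-host>:10000/default" -e \
      "MERGE INTO merge_insert_target_iceberg t USING staging s ON t.id = s.id
       WHEN MATCHED THEN UPDATE SET val = s.val
       WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.val);"

    # Session 2, running concurrently: an UPDATE that touches the same rows.
    # If Session 1 commits first, this query fails with CommitFailedException
    # and must be re-executed against the latest snapshot.
    beeline -u "jdbc:hive2://<hs2-host>:10000/default" -e \
      "UPDATE merge_insert_target_iceberg SET val = 0 WHERE id < 100;"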
CDPD-59413: Unable to view Iceberg table metadata in Atlas
You may see the following exception in the Atlas application logs when you create an Iceberg table from the CDW data service associated with a CDP Private Cloud Base 7.1.8 or 7.1.7 SP2 cluster: Type ENTITY with name iceberg_table does not exist. This happens because the Atlas server on CDP Private Cloud Base 7.1.8 and 7.1.7 SP2 does not contain the compatible functionality needed to support Iceberg tables. This does not affect creating, querying, or modifying Iceberg tables using CDW, nor does it affect creating policies in Ranger.

On CDP Private Cloud Base 7.1.9, Iceberg table entities are not created in Atlas. You can ignore the following error appearing in the Atlas application logs: ERROR - [NotificationHookConsumer thread-1:] ~ graph rollback due to exception (GraphTransactionInterceptor:200) org.apache.atlas.exception.AtlasBaseException: invalid relationshipDef: hive_table_storagedesc: end type 1: hive_storagedesc, end type 2: iceberg_table
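Before applying the workaround below, you can check whether your Atlas server already has the Iceberg typedef by querying the Atlas REST API. This is a sketch; the host, port, and credentials are placeholders for your environment:

    # Returns the typedef if it is registered; an error response indicates
    # that the iceberg_table model is missing
    curl -u <atlas-admin>:<password> \
      "http://<atlas-host>:21000/api/atlas/v2/types/typedef/name/iceberg_table"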

Workaround: If you are on CDP Private Cloud Base 7.1.7 SP2 or 7.1.8, you can manually add the Iceberg model file 1130-iceberg_table_model.json to the /opt/cloudera/parcels/CDH/lib/atlas/models/1000-Hadoop directory as follows:
  1. SSH into the Atlas server host as an Administrator.
  2. Change directory to the following:
    cd /opt/cloudera/parcels/CDH/lib/atlas/models/1000-Hadoop
  3. Create a file called 1130-iceberg_table_model.json with the following content:
    {
      "enumDefs": [],
      "structDefs": [],
      "classificationDefs": [],
      "entityDefs": [
        {
          "name": "iceberg_table",
          "superTypes": [
            "hive_table"
          ],
          "serviceType": "hive",
          "typeVersion": "1.0",
          "attributeDefs": [
            {
              "name": "partitionSpec",
              "typeName": "array<string>",
              "cardinality": "SET",
              "isIndexable": false,
              "isOptional": true,
              "isUnique": false
            }
          ]
        },
        {
          "name": "iceberg_column",
          "superTypes": [
            "hive_column"
          ],
          "serviceType": "hive",
          "typeVersion": "1.0"
        }
      ],
      "relationshipDefs": [
        {
          "name": "iceberg_table_columns",
          "serviceType": "hive",
          "typeVersion": "1.0",
          "relationshipCategory": "COMPOSITION",
          "relationshipLabel": "__iceberg_table.columns",
          "endDef1": {
            "type": "iceberg_table",
            "name": "columns",
            "isContainer": true,
            "cardinality": "SET",
            "isLegacyAttribute": true
          },
          "endDef2": {
            "type": "iceberg_column",
            "name": "table",
            "isContainer": false,
            "cardinality": "SINGLE",
            "isLegacyAttribute": true
          },
          "propagateTags": "NONE"
        }
      ]
    }
  4. Save the file and exit.
  5. Restart the Atlas service using Cloudera Manager.
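Optionally, before restarting, you can verify that the model file is well-formed JSON, and you can restart Atlas through the Cloudera Manager REST API instead of the UI. This is a sketch; the Cloudera Manager host, API version, cluster name, and credentials are assumptions to adapt to your deployment:

    # Verify that the new model file parses as valid JSON
    python -m json.tool \
      /opt/cloudera/parcels/CDH/lib/atlas/models/1000-Hadoop/1130-iceberg_table_model.json

    # Restart the Atlas service through the Cloudera Manager REST API
    curl -X POST -u <cm-admin>:<password> \
      "https://<cm-host>:7183/api/v41/clusters/<cluster-name>/services/atlas/commands/restart"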