Learn about the known issues in Iceberg, the impact or changes to the functionality, and the workarounds.
Known issues in Cloudera Runtime 7.3.1
- CDPD-75667: Querying an Iceberg table with a TIMESTAMP_LTZ column can result in data loss
- Affected versions: 7.3.1
- Fixed in: 7.3.1.100
- When you query an Iceberg table that has a TIMESTAMP_LTZ column, the query could result in data loss.
- Workaround: When creating Iceberg tables from Spark, set the following Spark configuration to avoid creating columns with the TIMESTAMP_LTZ type: spark.sql.timestampType=TIMESTAMP_NTZ (a sketch follows this entry).
- Apache JIRA: IMPALA-13484
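For illustration, a minimal Spark SQL sketch of the workaround; the table and column names (ts_events, created_at) are hypothetical, and the property can equally be passed as --conf on the Spark command line:
SET spark.sql.timestampType=TIMESTAMP_NTZ;
-- With the setting above, TIMESTAMP resolves to TIMESTAMP_NTZ, so no
-- TIMESTAMP_LTZ column is created (hypothetical table and column names).
CREATE TABLE ts_events (id INT, created_at TIMESTAMP) USING iceberg;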
- CDPD-75411: SELECT COUNT query on an Iceberg table in AWS times out
- Affected versions: 7.3.1, 7.3.1.100, 7.3.1.200
- Fixed in: 7.3.1.300
- In an AWS environment, a SELECT COUNT query that is run on an Iceberg table times out because some 4 KB ORC file parts cannot be downloaded. This issue occurs because Iceberg uses the positional delete index only if the number of positional deletes is less than a threshold value, which is 100000 by default (the affected query shape is sketched after this entry).
- Workaround: None.
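For illustration, a hedged Spark SQL sketch of the pattern involved; the table name db.events is hypothetical:
-- Row-level deletes on a merge-on-read Iceberg table produce
-- positional delete files (hypothetical table name).
DELETE FROM db.events WHERE id < 100000;
-- Once the number of positional deletes reaches the threshold, this
-- count query can time out in an AWS environment.
SELECT COUNT(*) FROM db.events;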
- CDPD-75088: Iceberg tables in Azure cannot be partitioned by strings ending in '.'
- Affected versions: 7.3.1, 7.3.1.100, 7.3.1.200, 7.3.1.300
- In an Azure environment, you cannot create Iceberg tables from Spark that are partitioned by a string column when a partition value ends with the period (.) character. The query fails with the following error:
24/10/08 18:14:12 WARN scheduler.TaskSetManager: [task-result-getter-2]: Lost task 0.0 in stage 2.0 (TID 2) (spark-sfvq0t-compute0.spark-r9.l2ov-m7vs.int.cldr.work executor 1): java.lang.IllegalArgumentException: ABFS does not allow files or directories to end with a dot.
- Workaround: None. A sketch of the failing pattern follows this entry.
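For illustration, a minimal Spark SQL sketch of the failing pattern; the table, column, and partition value are hypothetical:
-- Hypothetical Iceberg table partitioned by a string column.
CREATE TABLE events (id INT, category STRING) USING iceberg PARTITIONED BY (category);
-- Fails on ABFS: the partition directory name would end with a dot.
INSERT INTO events VALUES (1, 'v1.');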
- CDPD-72942: Unable to read Iceberg table from Hive after writing data through Apache Flink
- Affected versions: 7.3.1, 7.3.1.100, 7.3.1.200, 7.3.1.300
- If you create an Iceberg table with default values using Hive and then insert data into the table through Apache Flink, you cannot read the Iceberg table from Hive using the Beeline client; the query fails with the following error:
Error while compiling statement: java.io.IOException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.FileInputFormat as specified in mapredWork!
The issue persists even after you use the ALTER TABLE statement to set the engine.hive.enabled table property to "true" (a sketch of this statement follows this entry).
- Workaround: None.
- Apache JIRA: HIVE-28525
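For reference, a minimal HiveQL sketch of the ALTER TABLE statement mentioned above; the table name flink_sink is hypothetical, and running the statement does not resolve the issue:
-- Hypothetical table name; the read failure persists after this statement.
ALTER TABLE flink_sink SET TBLPROPERTIES ('engine.hive.enabled'='true');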
- CDPD-71962: Hive cannot write to a Spark Iceberg table bucketed by a date column
- Affected versions: 7.3.1, 7.3.1.100, 7.3.1.200, 7.3.1.300
- If you have used Spark to create an Iceberg table that is bucketed by a DATE column and then try inserting into or updating this Iceberg table using Hive, the query fails with the following error:
Error: Error while compiling statement: FAILED: RuntimeException org.apache.hadoop.hive.ql.exec.UDFArgumentException: ICEBERG_BUCKET() only takes STRING/CHAR/VARCHAR/BINARY/INT/LONG/DECIMAL/FLOAT/DOUBLE types as first argument, got DATE (state=42000,code=40000)
This issue does not occur if the Iceberg table is created through Hive. A sketch of the triggering pattern follows this entry.
- Workaround: None.
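For illustration, a hedged Spark SQL sketch of the pattern that triggers the error; the table and column names are hypothetical:
-- Created from Spark: Iceberg bucket transform on a DATE column
-- (hypothetical table and column names).
CREATE TABLE orders (id INT, order_date DATE) USING iceberg PARTITIONED BY (bucket(16, order_date));
-- A later INSERT or UPDATE on this table from Hive fails with the
-- ICEBERG_BUCKET() error shown above.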