Iceberg data types
This reference lists the Iceberg data types and the equivalent SQL data types for the Hive and Impala SQL engines.
Iceberg supported data types
Iceberg data type | SQL data type | Hive | Impala |
---|---|---|---|
binary | BINARY | BINARY | |
boolean | BOOLEAN | BOOLEAN | BOOLEAN |
date | DATE | DATE | DATE |
decimal(P, S) | DECIMAL(P, S) | DECIMAL(P, S) | DECIMAL(P, S) |
double | DOUBLE | DOUBLE | DOUBLE |
fixed(L) | BINARY | Not supported | |
float | FLOAT | FLOAT | FLOAT |
int | TINYINT, SMALLINT, INT | INTEGER | INTEGER |
list | ARRAY | ARRAY | Read only |
long | BIGINT | BIGINT | BIGINT |
map | MAP | MAP | Read only |
string | VARCHAR, CHAR | STRING | STRING |
struct | STRUCT | STRUCT | Read only |
time | STRING | Not supported | |
timestamp | TIMESTAMP | TIMESTAMP | TIMESTAMP (see limitation below) |
timestamptz | TIMESTAMP WITH LOCAL TIME ZONE | Use TIMESTAMP WITH LOCAL TIME ZONE for handling these in queries | Read timestamptz into TIMESTAMP values. Writing not supported |
uuid | none | STRING. Writing to Parquet is not supported | Not supported |
Data type limitations
An implicit conversion to an Iceberg type occurs only if there is an exact match; otherwise, a cast is needed. For example, to insert a VARCHAR(N) column into an Iceberg table, you need a cast to the VARCHAR type because Iceberg does not support the VARCHAR(N) type. To insert a SMALLINT or TINYINT column into an Iceberg table, you need a cast to the INT type because Iceberg does not support these types.
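The exact-match rule above can be sketched as a small lookup. This is a minimal illustration in Python, not part of Hive, Impala, or Iceberg. The names CAST_TARGETS and cast_target are hypothetical, and the sketch covers only the two cases named in the paragraph; any other type is assumed to need no cast.

```python
import re

# Cast targets taken from the two examples in the paragraph above.
# Types not listed here are assumed (for this sketch only) to match exactly.
CAST_TARGETS = {
    "VARCHAR(N)": "VARCHAR",
    "SMALLINT": "INT",
    "TINYINT": "INT",
}


def cast_target(sql_type):
    """Return the type to cast to before inserting, or None if no cast is listed."""
    normalized = sql_type.strip().upper()
    # Collapse a concrete length such as VARCHAR(20) to the generic VARCHAR(N).
    normalized = re.sub(r"^VARCHAR\s*\(\s*\d+\s*\)$", "VARCHAR(N)", normalized)
    return CAST_TARGETS.get(normalized)
```

For example, cast_target("VARCHAR(20)") returns "VARCHAR" and cast_target("TINYINT") returns "INT", while cast_target("BIGINT") returns None because the paragraph names no cast for that type.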
Iceberg has two timestamp types:
- timestamp (without timezone)
- timestamptz (with timezone)
The following Spark properties relate to handling the timestamp type (without timezone):
- spark.sql.iceberg.handle-timestamp-without-timezone
- spark.sql.iceberg.use-timestamp-without-timezone-in-new-tables
Configure these properties only on Spark 3.3 and earlier.
Spark still handles the timestamp column as a timestamp with local timezone, so inconsistent results occur unless Spark is running in UTC.
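As a rough sketch, the two properties listed above could be passed to spark-submit as --conf flags. The property names come from this page. The "true" values, the helper name spark_submit_args, and the application file job.py are assumptions made for illustration only.

```python
# Property names come from the list above; the "true" values are assumed.
TIMESTAMP_PROPERTIES = {
    "spark.sql.iceberg.handle-timestamp-without-timezone": "true",
    "spark.sql.iceberg.use-timestamp-without-timezone-in-new-tables": "true",
}


def spark_submit_args(properties, app="job.py"):
    """Build a spark-submit argument list with one --conf flag per property."""
    args = ["spark-submit"]
    for key, value in properties.items():
        args += ["--conf", f"{key}={value}"]
    # The application file goes last; job.py is a hypothetical placeholder.
    args.append(app)
    return args
```

Calling spark_submit_args(TIMESTAMP_PROPERTIES) yields a list that can be handed to a process launcher. Use it only on Spark 3.3 and earlier, as noted above.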
Unsupported data types
- TIMESTAMPTZ (only read support)
- TIMESTAMP in tables in AVRO format
- FIXED
- UUID