Iceberg table properties

The CDP environment for querying tables from Hive overrides some Iceberg table properties. You learn which table properties are supported for querying tables from Impala.

Iceberg documentation describes all the properties for configuring tables. This documentation focuses on key properties for working with Iceberg tables in CDP.

Iceberg supports concurrent writes by default. You can tune Iceberg v2 table properties for concurrent writes. You set the following properties if you plan to have concurrent writers on Iceberg v2 tables:

  • commit.retry.min-wait-ms

  • commit.retry.num-retries

CDP supports adding the Parquet compression type using table properties. For more information, see Iceberg documentation about Compression Types.

You can use the Alter Table feature to set a property. From Hive, the following Iceberg table property overrides are in effect:
  • iceberg.mr.split.size overrides read.split.target-size.
  • read.split.open-file-cost is overridden.
You can tune Iceberg v2 table properties for concurrent writes. From Impala, the following subset of Iceberg table properties are supported:
  • history.expire.min-snapshots-to-keep

    Valid values: integers. Default = 1

  • write.format.default

    Valid value: Parquet

  • write.metadata.delete-after-commit.enabled

    Valid values: true or false.

  • write.metadata.previous-versions-max

    Valid values: integers. Default = 100.

  • write.parquet.compression-codec

    Valid values: GZIP, LZ4, NONE, SNAPPY (default value), ZSTD

  • write.parquet.compression-level

    Valid values: 1 - 22. Default = 3

  • write.parquet.row-group-size-bytes

    Valid values: 8388608 (or 8 MB) - 2146435072 (or 2047MB). Overiden by PARQUET_FILE_SIZE.

  • write.parquet.page-size-bytes

    Valid values: 65536 (or 64KB) - 1073741824 (or 1GB).

  • write.parquet.dict-size-bytes

    Valid values: 65536 (or 64KB) - 1073741824 (or 1GB)