You see how to update Iceberg table partitioning in an existing table and then how to
change the partitioning to be more granular.
Partition information is stored logically, and only in table metadata. When you
update a partition spec, the old data written with an earlier spec remains unchanged.
New data is written using the new spec in a new layout. Metadata for each of the
partition versions is separate.
-
Create a table partitioned by year.
Hive
CREATE EXTERNAL TABLE ice_t (i int, j int, ts timestamp)
PARTITIONED BY SPEC (truncate(5, j), year(ts))
STORED BY ICEBERG;
Impala:
CREATE TABLE ice_t (i int, j int, ts timestmap)
PARTITIONED BY SPEC (truncate(5, j), year(ts))
STORED BY ICEBERG;
-
Split the data into manageable files using buckets.
ALTER TABLE ice_t SET PARTITION SPEC (bucket(13, i));
-
Partition the table by month.
ALTER TABLE ice_t SET PARTITION SPEC (truncate(5, j), month(ts));