Migrating Hive tables to Iceberg tables

Using Iceberg tables facilitates multi-cloud open lakehouse implementations. You can move Iceberg-based workloads in Cloudera across deployment environments on AWS and Azure. You can migrate existing external Hive tables from Hive to Iceberg in Cloudera Data Warehouse or from Spark to Iceberg in Cloudera Data Engineering.
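As a brief sketch of what in-place migration can look like (the table name `db.sample_table` and the catalog name `spark_catalog` are placeholders; adjust them to your environment), Hive can convert an external table by switching its storage handler, and Spark exposes an Iceberg `migrate` procedure:

```sql
-- In Hive (Cloudera Data Warehouse): migrate an external Hive table
-- in place by setting the Iceberg storage handler.
ALTER TABLE db.sample_table
SET TBLPROPERTIES (
  'storage_handler' = 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
);

-- In Spark SQL (Cloudera Data Engineering): migrate the table using
-- the Iceberg migrate procedure registered in the session catalog.
CALL spark_catalog.system.migrate('db.sample_table');
```

In both cases the existing data files are left in place and an Iceberg metadata layer is created over them, so the migration does not rewrite the underlying data.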

Cloudera has chosen Apache Iceberg as the foundation for its open lakehouse. Any compute engine can insert, update, and delete data in Iceberg tables, and any compute engine can read them.

The following Cloudera data services support Iceberg for performing multi-function analytics for the open lakehouse with Cloudera SDX shared security and governance:
  • Cloudera Data Warehouse: Batch ETL, SQL and BI analytic workloads, row-level database operations such as updates, deletes, and merges
  • Cloudera Data Engineering: Batch ETL, row-level database operations, table maintenance
  • Cloudera AI: Data Science through Python, R, and other languages, ML model training and inferencing, table maintenance
  • Cloudera DataFlow: NiFi streaming ingestion
  • Cloudera Stream Processing (CSP): Unified streaming ingestion with SQL