Migrating Hive data from HDP 2.x or HDP 3.x to CDP One
The recommended way to migrate Hive data from HDP to CDP One depends on the types of tables you are migrating: external, legacy-managed, or ACID (managed) tables.
Scenario 1: Migrating Non-ACID tables (SCHEMA_ONLY + distcp)
This scenario combines two tools:
- HMS Mirror: run in SCHEMA_ONLY mode to transfer the metadata
- DistCp: copy the underlying data files
If direct Hive connectivity from on-prem to CDP is not available, you can export the schema as a dump and then replay it manually on the target cluster.
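As a rough illustration of Scenario 1, the following Python sketch only assembles the two command lines and prints them; it does not run anything. The flags `-d` (data strategy) and `-db` (database) are assumed from the HMS Mirror CLI, and `-update` is a standard DistCp option. Confirm all of them against the versions you have installed. The database name and the source and target paths are hypothetical placeholders.

```python
import shlex
from typing import List


def schema_only_command(database: str) -> List[str]:
    """Build an HMS Mirror call that transfers metadata only.

    Assumption: -d selects the data strategy and -db names the database.
    """
    return ["hms-mirror", "-d", "SCHEMA_ONLY", "-db", database]


def distcp_command(source: str, target: str) -> List[str]:
    """Build a DistCp call that copies the table data files.

    -update copies only files that are missing or different at the target.
    """
    return ["hadoop", "distcp", "-update", source, target]


if __name__ == "__main__":
    # Hypothetical database and paths, for illustration only.
    commands = [
        schema_only_command("sales"),
        distcp_command(
            "hdfs://onprem-nn:8020/warehouse/external/sales.db",
            "s3a://example-bucket/warehouse/external/sales.db",
        ),
    ]
    for command in commands:
        print(shlex.join(command))
```

Keeping the commands as argument lists makes it straightforward to review them before handing them to a scheduler or to `subprocess.run`.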
Scenario 2: Migrating ACID tables (HYBRID + MIGRATE_ACID)
All managed ACID source tables (applicable to HDP) are migrated to ACID tables in CDP using HMS Mirror in HYBRID mode with the migrate-acid and intermediate-storage options. This approach migrates both data and metadata using Hive queries, and stages the data in an intermediate storage location on S3. Because the data passes through S3, you do not need to link the target cloud environment with the on-prem source.
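The Scenario 2 invocation can be sketched in the same way. This Python snippet only builds the argument list. The flags `-d`, `-ma` (migrate-acid), `-is` (intermediate-storage), and `-db` are assumed from the HMS Mirror CLI and should be checked against `hms-mirror --help` for your release. The bucket path and database name are hypothetical.

```python
import shlex
from typing import List


def hybrid_acid_command(database: str, intermediate_storage: str) -> List[str]:
    """Build an HMS Mirror call for ACID tables in HYBRID mode.

    Assumptions: -ma enables ACID table migration and -is points at the
    intermediate storage location where the data is staged.
    """
    if not intermediate_storage.startswith("s3a://"):
        # The text stages data on S3, so reject other schemes early.
        raise ValueError("intermediate storage must be an s3a:// location")
    return [
        "hms-mirror",
        "-d", "HYBRID",
        "-ma",
        "-is", intermediate_storage,
        "-db", database,
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    command = hybrid_acid_command("finance", "s3a://example-bucket/hms-staging")
    print(shlex.join(command))
```

The early check on the storage scheme is a design choice for this sketch, not a requirement of HMS Mirror itself.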