Migrating Hive workloads from CDP Private Cloud Base to CDW Private Cloud

To migrate the Hive workloads to Cloudera Data Warehouse (CDW) Data Service, you must have upgraded from your legacy platform to CDP Private Cloud Base. Learn how to set up your Virtual Warehouse instance and migrate your workloads to Hive LLAP (Low-Latency Analytical Processing) in CDW.

The CDW service provides data warehouses that can be configured and isolated. You scale resources up and down to meet your business demands, and save costs by suspending and resuming resources automatically.

Hive LLAP along with the query isolation feature is best suited for data-intensive queries, such as ETL queries, that require auto-scaling based on the total scan size of the query.

Migrating your Hive workloads to CDW helps you leverage the auto-scaling, workload optimization, isolation, data caching, and many other powerful capabilities that CDW offers.

This document aims to help you understand the process of migrating Hive workloads from CDP Private Cloud Base to CDW and assumes that you have already upgraded to CDP Private Cloud Base. Review the following migration scenarios if you want to migrate to CDW from legacy platforms or CDP:
  • Migrating from HDP to CDW
    If you are migrating from HDP and running workloads using LLAP or Hive on Tez, you must first upgrade to CDP Private Cloud either through an In-Place upgrade or Sidecar migration. You can then follow the Migrating from CDP Private Cloud Base to CDW document to migrate your Hive workloads to CDW.

    To understand how LLAP in HDP is different from LLAP in CDW, see Migrate from HDP (LLAP) to CDW (LLAP).

  • Migrating from CDH to CDW
    If you are migrating from CDH and running workloads using Hive on Map Reduce, you must first upgrade to CDP Private Cloud either through an In-Place upgrade or Sidecar migration. You can then follow the Migrating from CDP PvC Base to CDW document to migrate your Hive workloads to CDW.
  • Migrating from CDP Private Cloud Base to CDW
    If you are migrating from CDP PvC Base and running workloads in the Tez container mode (Hive on Tez), learn how you can migrate to CDW PvC and run Hive workloads on LLAP mode. For more information, see Migrate from CDP Private Cloud (Hive on Tez) to CDW (LLAP).

After you have upgraded to CDP Private Cloud, you can install CDP Private Cloud Data Services and migrate your workloads to CDW. Use the guidance provided in this document to plan your Virtual Warehouse instances based on the workloads that you are running. For more information, see Planning a Virtual Warehouse instance.

You must also understand how Hive queries are processed in CDW using the LLAP execution mode and be aware of the differences between LLAP in CDW and Hive on Tez in CDP. For more information, see Apache Tez processing of Hive jobs.