Data Migration Versus Upgrade
Recommendations on whether to upgrade to Cloudera Base on premises or migrate workloads to Cloudera on cloud.
- Data migration refers to moving existing CDH or HDP workloads to Cloudera on cloud or to a new installation of Cloudera Base on premises.
- Upgrade refers to a full in-place upgrade of CDH or HDP to Cloudera Base on premises.
The path to Cloudera that works best for you depends on the size of your clusters, the types of workloads you are running, and whether you want to move workloads to the Cloud, stay exclusively on-prem, or use a combination of on-prem and cloud.
On-prem cluster less than 50 hosts with Hive or Impala
If you are running Hive or Impala workloads without HBase on an on-prem cluster with less than 50 hosts, and less than 5 services running on the cluster:
-
Migrate workloads to Cloudera Data Warehouse on Cloudera.
On-prem cluster less than 50 hosts with HBase
If you are running HBase workloads without Hive or Impala on an on-prem cluster with less than 50 hosts, and less than 5 services running on the cluster:
-
Migrate workloads to Cloudera Data Hub on Cloudera on cloud and use the Cloudera Operational Database cluster template.
On-prem cluster less than 50 hosts with Spark
If you are running Spark workloads without Kafka, NiFi, or Storm on an on-prem cluster with less than 50 hosts, and less than 5 services running on the cluster:
-
Migrate workloads to Cloudera Data Hub on Cloudera on cloud and use the Cloudera Data Engineering cluster template.
On-prem cluster more than 800 hosts
If you are running workloads on an on-prem cluster with more than 800 hosts:
-
Split the cluster up into multiple 100-300 node clusters and upgrade to Cloudera Base on premises.
Multiple on-prem clusters with more than 100 hosts
If you are running workloads on multiple on-prem clusters with a combined total of more than 100 hosts, and less than 50 services in total:
-
Consolidate the clusters into one 100-300 node cluster and upgrade to Cloudera Base on premises.