Disaster recovery checklist and resources
A suitable disaster recovery plan covers several items: high availability mitigations, backup and recovery, and data and workload replication.
A CDP environment consists of several services, components, data storage mechanisms, and dependencies. Some of these live outside the envelope of the cluster, such as the database systems used by services like Hue and Hive. When considering whether and how to replicate components within the stack, take into account the capabilities of each service. In some cases, you may want to use the native replication capability within the service, as with HBase. In other cases, you may need to use an external tool, such as Replication Manager, to coordinate the movement of HDFS data to your disaster recovery clusters.
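One way to keep track of these per-component decisions is a small inventory that records, for each component, whether it lives inside the cluster and how it will be replicated. The following Python sketch is only an illustration: the `Component` class, the `Method` values, and the helper functions are hypothetical names, not part of any CDP API, and the entries are limited to the examples mentioned above. The database entries are marked as undecided because the right approach depends on the database system in use.

```python
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    """How a component is replicated (hypothetical categories)."""
    NATIVE = "native replication within the service"
    EXTERNAL_TOOL = "external tool"
    UNDECIDED = "not yet decided"


@dataclass(frozen=True)
class Component:
    name: str
    inside_cluster: bool   # False for dependencies outside the cluster envelope
    method: Method
    tool: str = ""         # name of the external tool, if any


# Illustrative entries drawn only from the examples in the text above.
INVENTORY = [
    Component("HBase", True, Method.NATIVE),
    Component("HDFS", True, Method.EXTERNAL_TOOL, tool="Replication Manager"),
    Component("Hue database", False, Method.UNDECIDED),
    Component("Hive database", False, Method.UNDECIDED),
]


def unplanned(inventory):
    """Return the names of components that still lack a replication decision."""
    return [c.name for c in inventory if c.method is Method.UNDECIDED]


def outside_cluster(inventory):
    """Return the names of components that live outside the cluster envelope."""
    return [c.name for c in inventory if not c.inside_cluster]


if __name__ == "__main__":
    print("No decision yet:", unplanned(INVENTORY))
    print("Outside the cluster:", outside_cluster(INVENTORY))
```

Extending the list to cover every service in your own stack, and reviewing the components that `unplanned` returns, gives a simple starting point for the checklist.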