Abstract

The Cloudera Public Cloud disaster-recovery reference architecture is a high-level design and best-practices guide for deploying Cloudera Public Cloud and implementing disaster-recovery use cases.

This document extends the disaster-recovery reference for CDP Private Cloud Base to cover Cloudera Public Cloud disaster-recovery use cases. It discusses how you can use Cloudera tools, cloud providers’ native capabilities, or both to implement disaster recovery for your cloud workloads.

This document focuses primarily on the following two tiers:

  • Tier 4: Point-in-time recovery - This tier applies between two or more separate clusters in different regions. HDFS is usually a fit for this tier.
  • Tier 5: Two-site commit/transaction integrity - In this tier, data is continuously transmitted to both the primary and the alternate backup sites. Cloud storage is often a fit for this tier.

To understand the tiers, see the seven tiers of disaster recovery.

The document covers data and metadata for HDFS, Hive, and HBase, access policies managed with Ranger, and lineage tracked with Atlas.