Hive replication policy introduction

You can set up Hive replication policies to copy managed (transactional) tables and external table metadata between CDP Private Cloud Base clusters for backup, load balancing, and other purposes.

Hive replication policies support the following operations:
  • Replicate ACID tables
  • Perform incremental replication based on metastore events
  • Replicate external table metadata

    You must set up Cloudera Backup and Disaster Recovery (BDR) HDFS policies separately to replicate external table metadata.

To replicate external table data, or the contents of a directory that holds a mixture of external data and ACID data, see the Replication Manager documentation.

Hive replication uses the Hive scheduler to schedule replication policies.

Prerequisites

To replicate data and metadata between CDP Private Cloud Base clusters using Hive replication, you must meet the following prerequisites:
  • Minimum supported Cloudera Manager version: 7.3.1
  • Minimum supported CDP Private Cloud Base version: 7.1.6
  • Configure trust between the CDP Private Cloud Base clusters before you create a Hive replication policy.
  • Set the hive.repl.cm.enabled property to true on the source cluster to turn on the ChangeManager.
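
The prerequisites above can be turned into a small pre-flight check. The following is a minimal sketch, not a Cloudera tool or API: the function names, the way version strings are supplied, and the dictionary of Hive properties are all assumptions made for illustration. It covers only the two version minimums and the hive.repl.cm.enabled setting; trust between the clusters cannot be verified this way and must be confirmed separately.

```python
import re

# Minimums taken from the prerequisites list.
MIN_CM_VERSION = (7, 3, 1)    # Cloudera Manager
MIN_CDP_VERSION = (7, 1, 6)   # CDP Private Cloud Base


def parse_version(text):
    """Return the leading 'X.Y.Z' of a version string as a tuple of ints.

    Hypothetical helper: any suffix after the third number is ignored.
    """
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def check_prerequisites(cm_version, cdp_version, source_hive_conf):
    """Return a list of unmet prerequisites; an empty list means all checks passed.

    source_hive_conf is assumed to be a dict of Hive properties
    collected from the source cluster by some other means.
    """
    problems = []
    if parse_version(cm_version) < MIN_CM_VERSION:
        problems.append("Cloudera Manager %s is older than 7.3.1" % cm_version)
    if parse_version(cdp_version) < MIN_CDP_VERSION:
        problems.append("CDP Private Cloud Base %s is older than 7.1.6" % cdp_version)
    enabled = str(source_hive_conf.get("hive.repl.cm.enabled", "")).strip().lower()
    if enabled != "true":
        problems.append("hive.repl.cm.enabled is not true on the source cluster")
    return problems


if __name__ == "__main__":
    # Example values only; substitute the values from your own clusters.
    for problem in check_prerequisites("7.3.1", "7.1.6", {"hive.repl.cm.enabled": "true"}):
        print(problem)
```

Versions are compared as tuples of integers rather than as strings, so that a version such as 7.10.0 is correctly treated as newer than 7.9.9.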

Limitations

Replication to CDP Public Cloud (AWS, Azure, GCP) is not supported.