Replication policy considerations
You should take into consideration certain guidelines when creating or modifying a replication policy. It is important for you to understand the security restrictions and encryption policies within Cloudera Replication Manager.
The guidelines you need to consider before you create or modify a replication policy includes:
Data security
To use an S3 cluster for your policy, register your credentials on the Cloud Credentials page.
A user with access to the Replication Manager user interface has the ability to browse, within the Cloudera Replication Manager UI, the folder structure of any clusters enabled for Cloudera Replication Manager.
Therefore, users can view folders, files, and databases in the Replication Manager user interface, that they might not have access to in HDFS. Users cannot view from the Cloudera Replication Manager UI the content of files on the source or destination clusters. Nor do these administrators have the ability to modify or delete folders or files that are viewable from the Cloudera Replication Manager UI.
Replication Manager policy properties and settings
Consider the recovery point objective (RPO) of the data being replicated when you schedule a replication policy. The RPO is the acceptable lag time between the active site and the replicated data on the destination. Ensure that the frequency is set so that a job finishes before the next job starts.
Jobs based on the same policy cannot overlap. If a job is not completed before another job starts, the second job does not execute and is given the status Skipped. If a job is consistently skipped, you might need to modify the frequency of the job.
Specify bandwidth per map, in MBps. Each map is restricted to consume only the specified bandwidth. This is not always exact. The map throttles back its bandwidth consumption during a copy in such a way that the net bandwidth used tends towards the specified value.
Cluster requirements
-
Pair the clusters before you include them in a replication policy.
-
With a single cluster, you can replicate data on-premises to the cloud.
-
With a single cluster, you cannot replicate data on-premises to on-premises.
-
If the clusters are Replication-Manager enabled, it appears in the Source Cluster or Destination in the Create Policy wizard. You must ensure that the clusters you select are healthy before you start a policy instance (job).