Troubleshooting replication policies between on-premises clusters

How can replication policy performance be optimized when there are a large number of files to replicate?

You can configure the heap size to 16 GB using the extra Java runtime option. To accomplish this task, perform the following steps:

Go to the source Cloudera Manager > HDFS service > Configuration tab.
Locate the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) property.
Enter the HADOOP_OPTS="-Xmx16G" key-value pair, and save the changes.
Restart the HDFS service.
Go to the target Cloudera Manager > HDFS service > Configuration tab.
Locate the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) property.
Enter the HADOOP_OPTS="-Xmx16G" key-value pair, and save the changes.
Restart the HDFS service.

How can file replication tasks be equitably distributed to all mappers?

The Replication Strategy option, which you can configure during the policy creation process, controls how file replication tasks are distributed. By default, this option is set to Dynamic; that is, Replication Manager distributes the file replication tasks in small sets to the mappers, and as each mapper completes its tasks, it dynamically acquires and processes the next unallocated set of tasks.

Alternatively, you can set the option to Static. In this case, the file replication tasks are assigned to the mappers upfront to achieve a uniform distribution based on the file sizes.
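The difference between the two strategies can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Replication Manager's actual implementation: the function names, the greedy largest-first assignment used for Static, the fixed chunk size used for Dynamic, and the sample file sizes are all hypothetical, and bytes processed stand in for elapsed time.

```python
import heapq


def static_assign(file_sizes, num_mappers):
    """Static sketch: assign every file upfront, largest first,
    to the mapper that currently has the fewest bytes."""
    heap = [(0, m) for m in range(num_mappers)]  # (bytes assigned, mapper id)
    loads = [0] * num_mappers
    for size in sorted(file_sizes, reverse=True):
        load, mapper = heapq.heappop(heap)
        loads[mapper] = load + size
        heapq.heappush(heap, (loads[mapper], mapper))
    return loads


def dynamic_assign(file_sizes, num_mappers, chunk_size):
    """Dynamic sketch: split the file list into small chunks; the mapper
    that has processed the fewest bytes picks up the next chunk."""
    chunks = [file_sizes[i:i + chunk_size]
              for i in range(0, len(file_sizes), chunk_size)]
    heap = [(0, m) for m in range(num_mappers)]  # (bytes processed, mapper id)
    loads = [0] * num_mappers
    for chunk in chunks:
        load, mapper = heapq.heappop(heap)
        loads[mapper] = load + sum(chunk)
        heapq.heappush(heap, (loads[mapper], mapper))
    return loads


# Hypothetical file sizes in MB.
sizes = [900, 10, 10, 10, 500, 20, 20, 400, 30, 100]
static_loads = static_assign(sizes, 3)
dynamic_loads = dynamic_assign(sizes, 3, chunk_size=2)
print("static :", static_loads)
print("dynamic:", dynamic_loads)
```

In both cases every file is assigned exactly once. The Static sketch needs the file sizes before the job starts, whereas the Dynamic sketch only needs to know which mapper is free next.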

How to determine the number of mappers and the bandwidth per mapper required for a replication policy?

In addition to copying files, mappers perform several other tasks, which include creating directories, preserving permissions and other metadata, calculating checksums, and identifying files to skip during replication. The mappers might also get throttled by the network. The following example describes a typical scenario and ways to resolve issues that might arise.

Example: A replication policy incrementally copies ~100K new or modified files and skips ~10M files every few hours. To optimize the policy performance between on-premises clusters, you can:

Configure the number of mappers based on your requirements using the Maximum Map Slots option during the policy creation process. By default, this option is set to 20.
Choose Skip Checksum Checks during the policy creation process because the number of skipped files is high. This option skips checksum checks on the copied files.
Check the Throughput column for the replication policy on the Replication Policies page to see the average throughput per mapper/file of all the files written. You can use more mappers with less bandwidth per mapper, if required.
Configure Maximum Bandwidth during the policy creation process to limit the bandwidth per mapper. By default, this option is set to 100 MB/s.
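The trade-off between the number of mappers and the bandwidth per mapper is simple arithmetic. The following sketch assumes that the aggregate network budget of a policy is the number of mappers multiplied by the per-mapper limit; the function name is hypothetical, and the calculation ignores the time mappers spend on metadata work.

```python
def per_mapper_bandwidth(aggregate_mb_s, num_mappers):
    """Per-mapper limit (MB/s) that keeps a policy within an aggregate budget."""
    if num_mappers <= 0:
        raise ValueError("num_mappers must be positive")
    return aggregate_mb_s / num_mappers


# Defaults from the text above: 20 mappers at 100 MB/s each.
default_aggregate = 20 * 100

# Doubling the mappers within the same budget halves the per-mapper limit.
cap_with_40 = per_mapper_bandwidth(default_aggregate, 40)
print(cap_with_40)
```

When most files are skipped rather than copied, as in the example, more mappers with a lower limit each can share the metadata work without raising the total network usage.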

Why should you consider creating multiple replication policies instead of one replication policy?

Consider creating multiple replication policies, instead of a single replication policy that replicates all the directories and files in a cluster, because:

performance improves when you run multiple replication policies in parallel.
reliability is preserved even if one replication policy fails.
you have the flexibility to run the replication policies with fewer resources and at different intervals.

How many replication policies can be run in parallel?

You can run several replication policies in parallel depending on the following factors:

Number of available mappers
Available network bandwidth
Load on source and target NameNodes
Read bandwidth on source DataNodes and write bandwidth of target DataNodes

It is recommended that you stay at the lower end of these limits so that other applications can also access these resources. You can decide the number of concurrent replication policies based on the available number of mappers and network bandwidth. For example, if you have a 10 GBps network, you might want to run five replication policies with 20 mappers each in parallel rather than one replication policy with 100 mappers and 100 MBps bandwidth per mapper.

You might want to monitor the write speed on the target cluster if the total bandwidth is more than 100 GBps and you are utilizing all the available bandwidth for the replication policy jobs. This is because the target DataNodes require 3x (or the configured replication factor) write bandwidth for write operations.
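The numbers in this example can be checked with a short calculation. The sketch below assumes that each policy is described by its number of mappers and its per-mapper limit in MB/s, that 1 GB equals 1000 MB, and that every replicated byte is written once per replica on the target; the function names are hypothetical.

```python
def aggregate_bandwidth_mb_s(policies):
    """Sum of mappers x per-mapper limit over concurrently running policies."""
    return sum(mappers * mb_s for mappers, mb_s in policies)


def target_write_demand_mb_s(ingest_mb_s, replication_factor=3):
    """Write bandwidth the target DataNodes need for a given ingest rate."""
    return ingest_mb_s * replication_factor


five_policies = [(20, 100)] * 5   # five policies, 20 mappers each, 100 MB/s per mapper
one_policy = [(100, 100)]         # one policy, 100 mappers, 100 MB/s per mapper

ingest = aggregate_bandwidth_mb_s(five_policies)
print(ingest, aggregate_bandwidth_mb_s(one_policy))
print(target_write_demand_mb_s(ingest))
```

Both layouts have the same upper bound on network usage. The write demand on the target DataNodes scales with the configured replication factor, which is why the write speed is worth monitoring when the network is fully utilized.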

Why use the YARN resource pool for replication policy jobs?

Replication Manager uses the MapReduce or YARN framework for its replication jobs. By default, the jobs use 20 mappers and a maximum of 100 MB/s network bandwidth. You can change these values based on the size of the clusters and the total data or resources that you want to assign to the replication policy jobs.

It is recommended that you use a YARN resource pool to configure the percentage of resources you want to assign to the replication jobs. This ensures that the replication policy jobs do not consume more than the assigned percentage of resources. You can also configure isolation of resources by specifying which users can use certain resources.

To configure a new YARN resource pool, go to Cloudera Manager > Clusters > YARN service > Resource Pools (Tab) > Configuration > Create Resource Pool tab.

To use the configured resource pool in a replication policy, go to Cloudera Manager > Replication Policies > Actions > Edit Configuration > Resources (Tab) > Scheduler Pool field, and enter the YARN resource pool name.

What happens to the replication policies when an active Cloudera Manager instance fails over to the passive Cloudera Manager instance?

While Cloudera Manager fails over to a passive instance, the previously active Cloudera Manager instance is down, and the local temporary folder (on the previously active Cloudera Manager host) used by replication policies is inaccessible to the currently active Cloudera Manager instance. Therefore, the replication policies that have a Cloudera Manager peer associated with them (Hive external replication policies and HDFS replication policies between on-premises to on-premises clusters) fail if they are initiated during the failover. Subsequent runs of the same policy, in the absence of a failover event, eventually succeed.

To avoid these issues, you can implement the following solutions based on the scenarios:

Controlled or planned Cloudera Manager failover - In this scenario, you can stop or pause the existing replication policy job runs. You might also want to postpone creating replication policies until the failover is complete.
Unplanned failover - In this scenario, you can use one of the following methods:
- Re-run the failed replication policies.
- Wait for the next planned replication policy run.
- Restore the replicated content to a previous snapshot and re-run the replication policy.

When the HDFS incremental replication fails for an HDFS replication policy, the next policy run starts a full bootstrap replication. How can this issue be mitigated?

When an HDFS replication policy (incremental replication) fails, the last successfully replicated snapshot gets deleted. Therefore, the next policy run starts a full bootstrap replication. For large datasets, the bootstrap replication takes a long time to complete.

To mitigate this issue, set the deleteLatestSourceSnapshotOnJobFailure flag to false using REST APIs for the replication policy. After you set the flag to false, the last replicated snapshot is not deleted even when the replication fails. Therefore, the next policy run is an incremental run.
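As a rough illustration, the update could be prepared as follows. This is a sketch under several assumptions that you should verify against the Cloudera Manager API reference for your version: that the policy is updated with a PUT request to the replication schedule resource, that the flag is located under hdfsArguments in the schedule JSON, and that the host, API version, cluster name, service name, and schedule ID shown here are placeholders. The function only builds the request; it does not send it, and authentication is omitted.

```python
import json
import urllib.parse
import urllib.request


def build_update_request(base_url, api_version, cluster, service,
                         schedule_id, schedule):
    """Return an unsent PUT request that sets
    deleteLatestSourceSnapshotOnJobFailure to false for a schedule.

    The URL layout and the hdfsArguments nesting are assumptions.
    """
    updated = dict(schedule)
    hdfs_args = dict(updated.get("hdfsArguments", {}))
    hdfs_args["deleteLatestSourceSnapshotOnJobFailure"] = False
    updated["hdfsArguments"] = hdfs_args

    url = "{}/api/{}/clusters/{}/services/{}/replications/{}".format(
        base_url,
        api_version,
        urllib.parse.quote(cluster, safe=""),
        urllib.parse.quote(service, safe=""),
        schedule_id,
    )
    return urllib.request.Request(
        url,
        data=json.dumps(updated).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder values for illustration only.
current_schedule = {"id": 42, "hdfsArguments": {"sourcePath": "/data"}}
request = build_update_request(
    "https://cm.example.com:7183", "v51", "Cluster 1", "hdfs", 42,
    current_schedule,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The sketch copies the existing schedule definition and changes only the flag, so the other policy settings are sent back unchanged.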

How to resolve replication policies that fail with the “Custom keytab configuration is required for this service” error?

This error appears for replication policies that use Kerberos-enabled clusters on Isilon storage.

To mitigate this issue, perform the following steps:

Create a custom Kerberos keytab and Kerberos principal that the replication jobs can use to authenticate to storage and other CDP services.
Go to the target Cloudera Manager > Administration > Settings page.
Search for the following properties, and enter the required values:
- Custom Kerberos Keytab Location – Enter the location of the custom Kerberos keytab.
- Custom Kerberos Principal Name – Enter the principal name to use for replication between secure clusters.
For more information about the parameters, see Cloudera Manager Server Properties for replication.
Important:
To replicate data using replication policies that use Kerberos-enabled clusters on Isilon storage, you must:
- ensure that the source and target clusters have the same set of users and groups. When you set the ownership of files (or when maintaining ownership), if a user or group does not exist, the chown command fails on Isilon. For more information, see Performance and Scalability Limitations.
- enter the Custom Kerberos Principal Name value in the Run As Username field during the replication policy creation process.
Cloudera recommends that you do not select the Replicate Impala Metadata option for Hive/Impala replication policies. To use this feature, create a custom principal in the hdfs/hostname@realm or impala/hostname@realm format.
Add the hadoop.security.token.service.use_ip = false key-value pair to the HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml and Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml properties.

Tip:

If the replication MapReduce job fails and the following error appears, set the Isilon cluster-wide time-to-live setting to a higher value on the target cluster:

java.io.IOException: Failed on local exception: java.io.IOException:
  org.apache.hadoop.security.AccessControlException:
  Client cannot authenticate via:[TOKEN, KERBEROS];
  Host Details : local host is: "foo.mycompany.com/172.1.2.3";
  destination host is: "myisilon-1.mycompany.com":8020;

A higher value might cause workloads to be less evenly distributed, which might affect the load balancing in the Isilon cluster. A value of 60 is a good starting point. For example, the isi networks modify pool subnet4:nn4 --ttl=60 command configures the Isilon cluster-wide time-to-live setting to 60.

To view the settings for a subnet, you can run the isi networks list pools --subnet subnet3 -v command.