Viewing replication policies

The Replication Policies page displays a row of information about each scheduled replication job. Each row also displays recent messages about the last run of the replication job.

Figure 1. Replication Policies Table

Only one job for a given replication policy can run at a time; if another job for the same policy starts before the previous one has finished, the second job is canceled.
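The one-job-per-policy rule can be pictured with a minimal Python sketch. It is only an illustration of the behavior described above, not how the product implements it; the `PolicyJobTracker` class, its method names, and the policy IDs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyJobTracker:
    """Hypothetical tracker of policies that currently have a job in progress."""

    running: set = field(default_factory=set)

    def try_start(self, policy_id: int) -> str:
        # A second job for the same policy is canceled, not queued.
        if policy_id in self.running:
            return "canceled"
        self.running.add(policy_id)
        return "started"

    def finish(self, policy_id: int) -> None:
        # Mark the policy's job as finished so a later job can start.
        self.running.discard(policy_id)


tracker = PolicyJobTracker()
first = tracker.try_start(7)   # no job is running for policy 7
second = tracker.try_start(7)  # the previous job has not finished yet
other = tracker.try_start(8)   # a different policy is unaffected
tracker.finish(7)
third = tracker.try_start(7)   # the previous job finished, so this one starts
```

In this sketch the overlapping job is rejected outright rather than queued, which matches the description that the second job is canceled.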

You can limit the replication jobs that are displayed by selecting filters on the left. If you do not see an expected policy, adjust or clear the filters. Use the search box to search the list of policies for path, database, or table names.

The Replication Policies columns are described in the following table.
Table 1. Replication Policies Table
Column - Description
ID - An internally generated ID number that identifies the policy.

Click the ID column label to sort the Replication Policies table by ID.

Name - The unique name you specify when you create a policy.
Type - The type of replication policy, either HDFS or Hive.
Source - The source cluster for the replication.
Destination - The destination cluster for the replication.
Throughput - The average throughput per mapper/file of all the files written. Throughput does not include the combined throughput of all mappers or the time taken to perform a checksum on a file after the file is written.
Progress - The progress of the replication.
Completed - The time when the replication job completed.

Click the Completed column label to sort the Replication Policies table by completion time.

Next Run - The date and time when the next replication is scheduled, based on the schedule parameters specified for the policy. Hover over the date to view additional details about the scheduled replication.

Click the Last Run column label to sort the Replication Policies table by the last run date.

Actions - The following items are available from the Actions button:
  • Show History - Opens the Replication History page for a replication.
  • Edit Configuration - Opens the Edit Replication Policy page.
  • Dry Run - Simulates a run of the replication task without copying any files or tables. After a Dry Run, you can select Show History to open the Replication History page, where you can view any error messages and the number and size of the files or tables that an actual replication would copy.
  • Run Now - Runs the replication task immediately.
  • Collect Diagnostic Data - Opens the Send Diagnostic Data screen, where you can collect replication-specific diagnostic data for the last 10 runs of the policy:
    1. Select Send Diagnostic Data to Cloudera to automatically send the bundle to Cloudera Support. You can also enter a ticket number and comments when sending the bundle.
    2. Click Collect and Send Diagnostic Data to generate the bundle and open the Replications Diagnostics Command screen.
    3. When the command finishes, click Download Result Data to download a zip file containing the bundle.
  • Disable | Enable - Disables or enables the replication policy. No further replications are scheduled for disabled replication policies.
  • Delete - Deletes the policy. Deleting a replication policy does not delete copied files or tables.

  • While a job is in progress, the Last Run column displays a spinner and progress bar, and each stage of the replication task is indicated in the message beneath the job's row. Click the Command Details link to view details about the execution of the command.
  • If the job is successful, the number of files copied is indicated. A file that has not changed at the source since the previous job is not copied. As a result, after the initial job, only a subset of the files may actually be copied, and this is indicated in the success message.
  • If the job fails, a failure icon displays.
  • To view more information about a completed job, select Actions > Show History.
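
The reason later jobs may copy only a subset of files can be sketched in a few lines of Python. This is an illustrative sketch, not the product's implementation: the `files_to_copy` function is hypothetical, and so is the per-file change marker (the text above does not say how a change is detected, so the `"v1"`/`"v2"` strings stand in for whatever the job compares).

```python
def files_to_copy(source: dict, previous: dict) -> list:
    """Return the paths whose marker differs from the previous job's record.

    Both arguments map a file path to a hypothetical change marker.
    A path missing from `previous` counts as changed.
    """
    return sorted(
        path for path, marker in source.items()
        if previous.get(path) != marker
    )


# State recorded after the previous job, and the source as it looks now.
previous_run = {"/data/a.csv": "v1", "/data/b.csv": "v1"}
current_source = {"/data/a.csv": "v1", "/data/b.csv": "v2", "/data/c.csv": "v1"}

changed = files_to_copy(current_source, previous_run)
print(f"{len(changed)} of {len(current_source)} files copied: {changed}")
```

With an empty `previous` record, as in an initial job, the same function returns every file at the source.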