Manage, monitor, and troubleshoot Atlas replication policies

After you create a replication policy, you can run the replication job, disable or delete the job, edit the policy configuration, or view the replication job history in Replication Manager.

Replication policy details on the Replication Policies page

The Replication Policies page (Cloudera Manager > Replication > Replication Policies) lists every replication policy and the actions you can take on it. The page:

  • Shows a row of information for each replication policy, with the following columns:

    • Internally generated ID for the replication policy. Click the column label to sort the replication policies.
    • Replication policy Name that you provide during replication policy creation.
    • Replication policy Type.
    • Source cluster in the replication policy.
    • Destination cluster in the replication policy.
    • Average Throughput per mapper/file for all the files written.
    • Replication job Progress.
    • Timestamp when the replication job Completed.
    • Replication policy job’s Next Run.
  • Provides the following options under the Actions menu:

    • Show History opens the Replication History page for the replication policy.
    • Edit Configuration enables you to change the replication policy options.
    • Dry Run simulates a run of the replication job without copying any files or tables. After the dry run completes, select Actions > Show History to view any error messages. The Replication History page shows the number and size of the files or tables that an actual replication would copy.
    • Run Now initiates a replication job.
    • Collect Diagnostic Data opens the Send Diagnostic Data dialog box where you can:
      • Collect Diagnostic Data for the last 10 runs of the replication policy, and Download it as a ZIP file to your machine.
      • Select Send Diagnostic Data to Cloudera (optionally, add a Cloudera support ticket number and comments) and click Collect Diagnostic Data to automatically send the bundle to Cloudera Support for further assistance.
    • Disable an active replication policy. You can Enable it later, as necessary.
    • Delete the replication policy permanently. Deleting a replication policy does not delete copied files or tables.

Replication History page

Click Actions > Show History for a replication policy on the Replication Policies page to view the Replication History page.

On the Replication History page, you can view the following run details about a replication policy job:

  • Shows the replication policy Name; replication policy Type; Source cluster name; Destination cluster name; and Next Run of the replication policy.

  • Shows a row of information for each replication policy job run, with the following columns:

    Column                     Description
    Start Time                 Shows the timestamp when the replication job started.
    Duration                   Shows the time taken to complete the replication job.
    Outcome                    Shows the replication job status as Running, Successful, or Failed.
    Atlas Entities Replicated  Shows the number of tables for which the Atlas metadata and lineage are replicated.
    Export Status              Shows the status (Running, Successful, or Failed) of the export of Atlas metadata and data lineage from the source cluster to the staging directory on the target cluster.
    Import Status              Shows the status (Running, Successful, or Failed) of the import of the Atlas metadata and data lineage into the required directory on the target cluster.

  • Expand a job to view the following information on the All Recent Commands window:
    • Status of the replication job.
    • Atlas Replication in the Context field opens the Clusters > Atlas Replication window, where more details about the replication policy job appear.
    • Replication job Started At timestamp.
    • Duration to complete the job.
    • Download the results to your machine.
    • Expand to Show All Steps, Show Only Failed Steps, or Show Only Running Steps for the commands used by the Atlas replication policy.
    • Show Command Timing shows the timeline for the commands used by the Atlas replication policy.
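
When you read a job run row by hand, the Outcome, Export Status, and Import Status columns together suggest which phase to look at first. The following is a minimal sketch of that reasoning, assuming you transcribe the values from the Replication History page yourself. The names HistoryRow and phase_to_investigate are hypothetical and are not part of Cloudera Manager.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HistoryRow:
    """One job run row, transcribed by hand from the Replication History page."""
    outcome: str        # "Running", "Successful", or "Failed"
    export_status: str  # same three values, for the export phase
    import_status: str  # same three values, for the import phase


def phase_to_investigate(row: HistoryRow) -> Optional[str]:
    """Return the first phase that reports Failed, or None if nothing failed."""
    # Export runs before import, so a failed export is checked first.
    if row.export_status == "Failed":
        return "export"
    if row.import_status == "Failed":
        return "import"
    # The job failed but neither phase reports it; inspect the job itself.
    if row.outcome == "Failed":
        return "job"
    return None
```

For example, a row with Outcome Failed, Export Status Successful, and Import Status Failed points to the import phase on the target cluster.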

Error appears during Atlas replication policy run

You can diagnose the errors that appear after you initiate an Atlas replication policy run or during the replication policy run.

Solution

  • If the error appears after you initiate the Atlas replication policy run, you can perform the following steps to diagnose the error:
    1. Go to the target Cloudera Manager > Replication > Replication Policies page.
    2. Click Actions > Show History for the required Atlas replication policy.
    3. Expand the section in the Start Time column.
    4. Click Command Details, and then review the stdout and stderr tabs to diagnose the error.
    5. If you require more details to diagnose the issue, open the cloudera-scm-server.log file in the /var/log/cloudera-scm-server/ directory.
  • If the error appears during the Atlas replication policy run, you can perform the following steps to diagnose the error:
    1. Go to the Cloudera Manager > Running Commands page.
    2. Click the Atlas Server link on the given node.
    3. Open the application.log file or Role logs to diagnose the error.
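
The log review in the steps above can be narrowed with a small filter. The following is a minimal sketch, not a Cloudera tool: it assumes the log is plain text with the level (ERROR or WARN) written as a separate word on each line, and that relevant lines mention the word "replication". The function names and the keyword are illustrative assumptions; only the log path comes from the steps above.

```python
import re
from pathlib import Path

# Path taken from the troubleshooting steps above.
LOG_PATH = Path("/var/log/cloudera-scm-server/cloudera-scm-server.log")


def find_replication_errors(lines, keyword="replication", levels=("ERROR", "WARN")):
    """Return (line_number, text) pairs for lines at the given levels that mention keyword.

    Assumption: the level appears as a standalone word in the line.
    """
    level_pattern = re.compile(r"\b(" + "|".join(map(re.escape, levels)) + r")\b")
    needle = keyword.lower()
    matches = []
    for number, line in enumerate(lines, start=1):
        # Keep the line only if it has a matching level and mentions the keyword.
        if level_pattern.search(line) and needle in line.lower():
            matches.append((number, line.rstrip("\n")))
    return matches


def scan_log(path=LOG_PATH):
    """Scan a log file line by line, tolerating undecodable bytes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_replication_errors(handle)


if __name__ == "__main__":
    if LOG_PATH.exists():
        for number, text in scan_log():
            print(f"{number}: {text}")
    else:
        print(f"Log file not found: {LOG_PATH}")
```

Run the script on the Cloudera Manager server host with an account that can read the log file. The same function can be pointed at a downloaded application.log or Role log by passing a different path to scan_log.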