Viewing replication history

You can view historical details about replication jobs on the Replication History page.

To view the history of a replication job:

  1. From Cloudera Manager, select Replication > Replication Policies.

    The list of available replication policies appears.

  2. Locate and select the policy, and then click Actions > Show History.

    The Replication History page appears with the job information.

Figure 1. Replication History Screen (HDFS)

Replication History Table

The Replication History page displays a table of previously run replication jobs with the following columns:

Column Description
Start Time Shows the details about the job.
You can expand the section to view the following job details:
  • Started At - Displays the time the replication job started.
  • Duration - Displays the time duration for the job to complete.
  • Command Details - Displays the command details in a new tab after you click View.

    The Command Details page displays the details and messages for each step of the command run. On this page, click Context to view the service status page relevant to the command, and click Download to download the summary as a JSON file.

    To view the command details, expand the Step section and then choose Show All Steps, Show Only Failed Steps, or Show Only Running Steps. In this section, you can perform the following tasks:
    • View the actual command string.
    • View the start time and duration for the command run.
    • View the host status page for the command by clicking the host link.
    • View the full log file for the command by selecting the stdout or stderr tab.

    See Viewing Running and Recent Commands.

  • MapReduce Job - Click the link to view the job details.
  • HDFS Replication Report - Click Download CSV to view the following options:
    • Listing - Click to download the CSV file that contains the replication report. The file lists the files and directories copied during the replication job.
    • Status - Click to download the CSV file that contains the complete status report. The report covers the files whose replication status is one of the following:
      • ERROR – An error occurred and the file was not copied.
      • DELETED – A deleted file.
      • SKIPPED – A file where the replication was skipped because it was up-to-date.
    • Error Status Only - Click to download the CSV file that contains the status report of all copied files with errors. The file lists the status, path, and message for the copied files with errors.
    • Deleted Status Only - Click to download the CSV file that contains the status report of all deleted files. The file lists the status, path, and message for the databases and tables that were deleted.
    • Skipped Status Only - Click to download the CSV file that contains the status report of all skipped files. The file lists the status, path, and message for the databases and tables that were skipped.
    • Performance - Click to download the CSV file that contains a performance summary of the running replication job. The summary includes the last performance sample for each mapper that is working on the replication job.
    • Full Performance - Click to download the CSV file that contains the performance report of the job. The performance report shows the samples taken for all the mappers during the full execution of the replication job.
  • (Dry Run only) View the number of Replicable Files. Displays the number of files that would be replicated during an actual replication.
  • (Dry Run only) View the number of Replicable Bytes. Displays the number of bytes that would be replicated during an actual replication.
  • View the number of Impala UDFs replicated. (Displays only for Hive/Impala replications where Replicate Impala Metadata is selected.)
  • If a user was specified in the Run As Username field when creating the replication job, the selected user displays.
  • View messages returned from the replication job.
Duration Time taken for the replication job to complete.
Outcome Indicates the status of the replication job as Successful or Failed.
Files Expected Number of files expected to be copied, and their total size, based on the parameters of the replication policy.
Files Copied Number of files copied, and their total size, for the replication job.
Files Failed Number of files that failed to be copied, and their total size, for the replication job.
Files Deleted Number of files that were deleted, and their total size, for the replication job.
Files Skipped Number of files skipped, and their total size, for the replication job. The replication process skips files that already exist in the destination and have not changed.
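
As an illustration of working with a downloaded Status report, the following Python sketch counts rows by status and collects the paths of files that were not copied. It assumes the CSV has a header row with columns named status, path, and message. The actual header names and column order in your download may differ, so adjust the keys accordingly. The summarize_status_report function and the sample rows are hypothetical and are not part of Cloudera Manager.

```python
import csv
import io
from collections import Counter


def summarize_status_report(csv_text):
    """Count rows per status and collect the paths of ERROR rows.

    Assumes a header row with 'status', 'path', and 'message' columns
    (an assumption; verify against the file you actually downloaded).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    failed_paths = []
    for row in reader:
        # Normalize the status so ' error ' and 'ERROR' are counted together.
        status = (row.get("status") or "").strip().upper()
        counts[status] += 1
        if status == "ERROR":
            failed_paths.append(row.get("path") or "")
    return counts, failed_paths


# Hypothetical sample content, for illustration only.
SAMPLE = """status,path,message
ERROR,/data/a.txt,example error message
SKIPPED,/data/b.txt,example skipped message
DELETED,/data/c.txt,example deleted message
SKIPPED,/data/d.txt,example skipped message
"""

counts, failed_paths = summarize_status_report(SAMPLE)
print(dict(counts))
print(failed_paths)
```

To use the sketch with a real report, read the downloaded file into a string (for example, with open(path, newline="").read()) and pass it to summarize_status_report.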