View historical details for an HDFS replication policy

You can view the historical details about the replication jobs on the Replication History page.

The following table lists the columns that appear on the Replication History page when you click Actions > Show History to view the previously run replication jobs:

Column Description
Start Time Shows the job details.
Expand the section to view the following job details:
  • Started At timestamp is when the replication job started.
  • Duration to complete the job.
  • Command Details appear in a new tab after you click View.

    The Command Details page shows the details and messages about each step during the command run. Click Context to view the service status page relevant to the command, and click Download to download the summary as a JSON file.

    Expand Step to choose Show All Steps, Show Only Failed Steps, or Show Only Running Steps. You can perform the following tasks in this section:
    • View the actual command string.
    • View the start time and duration for the command run.
    • View the host status page for the command by clicking the host link.
    • View the full log file for the command by selecting the stdout or stderr tab.

    For more information, see Viewing Running and Recent Commands.

  • MapReduce Job details appear after you click the job link.
  • Download the following HDS Replication Reports in CSV format after you click Download CSV:
    • Listing report contains the list of files and directories copied during the replication job.
    • Status report contains the full status report of the files where the replication status is shown as:
      • ERROR occurred during replication, therefore the file was not copied.
      • DELETED for deleted files.
      • SKIPPED for up-to-date files that were not replicated.
    • Error Status Only report contains the status report of all copied files with errors. The file lists the status, path, and message for the copied files with errors.
    • Deleted Status Only report contains the status report of all deleted files. The file lists the status, path, and message for the databases and tables that were deleted.
    • Skipped Status Only report contains the status report of all skipped files. The file lists the status, path, and message for the databases and tables that were skipped.
    • Performance report contains a summary report about the performance of the running replication job. The report includes the last performance sample for each mapper that is working on the replication job.
    • Full Performance report contains the performance report of the job. The report shows the samples taken for all the mappers during the full execution of the replication job.
  • (Dry Run only) Replicable Files shows the number of files that would be replicated during an actual replication.
  • (Dry Run only) Replicable Bytes shows the number of bytes that would be replicated during an actual replication.
  • View the number of Impala UDFs replicated. (Displays only for Hive/Impala replications where Replicate Impala Metadata is selected.)
  • If a user was specified in the Run As Username field when creating the replication job, the selected user appears.
  • View messages returned from the replication job.
Duration Time taken for the replication job to complete.
Outcome Status of the replication job as Successful or Failed.
Files Expected Number of files expected to be copied and its file size based on the parameters of the replication policy.
Files Copied Number of files copied and its file size for the replication job.
Files Failed Number of files that failed to be copied and its file size for the replication job.
Files Deleted Number of files that were deleted and its file size for the replication job
Files Skipped Number of files skipped and its file size for the replication job. The replication process skips files that already exist in the destination and have not changed.

The following sample image shows the historical details about an HDFS replication policy which includes the replication policy name, policy type, source and target cluster details, and the next scheduled run: