View historical details for an HDFS replication policy
You can view the historical details about the replication jobs on the Replication
History page.
The following table lists the columns that appear on the Replication
History page when you click Actions > Show History to view the previously run replication jobs:
Column
Description
Start Time
Shows the job details.
Expand the section to view the following job details:
Started At timestamp is when the replication job
started.
Duration to complete the job.
Command Details appear in a new tab after you click
View.
The Command Details
page shows the details and messages about each step during the command run.
Click Context to view the service status page relevant
to the command, and click Download to download the
summary as a JSON file.
Expand Step to choose
Show All Steps, Show Only Failed
Steps, or Show Only Running Steps. You
can perform the following tasks in this section:
View the actual command string.
View the start time and duration for the command run.
View the host status page for the command by clicking the host
link.
View the full log file for the command by selecting the
stdout or stderr tab.
MapReduce Job details appear after
you click the job link.
Download the following HDS Replication Reports in CSV
format after you click Download CSV:
Listing report contains the list of files and
directories copied during the replication job.
Status report contains the full status report of
the files where the replication status is shown as:
ERROR occurred during replication, therefore the
file was not copied.
DELETED for deleted files.
SKIPPED for up-to-date files that were not
replicated.
Error Status Only report contains the status report
of all copied files with errors. The file lists the status, path, and
message for the copied files with errors.
Deleted Status Only report contains the status
report of all deleted files. The file lists the status, path, and message
for the databases and tables that were deleted.
Skipped Status Only report contains the status
report of all skipped files. The file lists the status, path, and message
for the databases and tables that were skipped.
Performance report contains a summary report about
the performance of the running replication job. The report includes the last
performance sample for each mapper that is working on the replication job.
Full Performance report contains the performance
report of the job. The report shows the samples taken for all the mappers
during the full execution of the replication job.
(Dry Run only) Replicable Files shows the number of
files that would be replicated during an actual replication.
(Dry Run only) Replicable Bytes shows the number of
bytes that would be replicated during an actual replication.
View the number of Impala UDFs replicated. (Displays
only for Hive/Impala replications where Replicate Impala
Metadata is selected.)
If a user was specified in the Run As Username field
when creating the replication job, the selected user appears.
View messages returned from the replication job.
Duration
Time taken for the replication job to complete.
Outcome
Status of the replication job as Successful or
Failed.
Files Expected
Number of files expected to be copied and its file size based on the parameters
of the replication policy.
Files Copied
Number of files copied and its file size for the replication job.
Files Failed
Number of files that failed to be copied and its file size for the replication
job.
Files Deleted
Number of files that were deleted and its file size for the replication
job
Files Skipped
Number of files skipped and its file size for the replication job. The
replication process skips files that already exist in the destination and have not
changed.
The following sample image shows the historical details about an HDFS replication policy
which includes the replication policy name, policy type, source and target cluster details,
and the next scheduled run: