Policy Name |
The policy name that will display in the UI |
Maximum length of 64 characters. Spaces, dashes, and underscores are
the only special characters allowed. |
Description |
Any useful information to identify the policy or its use |
|
Service |
Hive or HDFS replication |
For Hive replication, a corresponding Hive database structure must
exist on the destination. For HDFS, the corresponding file system
structure is created when the first replication job executes. |
Source Cluster |
The cluster that contains the data to be replicated |
If the cluster you want is not listed, you need to enable the cluster
for DLM. |
Destination Cluster |
The cluster to which the source data will be replicated |
If the cluster you want is not listed, you need to enable the cluster
for DLM. |
Select a Folder Path (Only if HDFS is selected) |
The HDFS directories available to browse and to select for
replication |
The Infra Admin role has read privileges, in the DLM UI only, for all
HDFS directories on the source and destination clusters. Clusters must
be paired before you can browse HDFS directories in DLM. |
Select Database (Only if Hive is selected) |
The internal or external databases available to browse and to select
for replication |
The Infra Admin role has read privileges, in the DLM UI only, for all
databases on the source and destination clusters. |
Enable snapshot based replication |
Enables snapshot replication on the selected folder if you have the
required permissions |
When the job runs, snapshots are automatically created on the
destination cluster and managed by DLM. The HDFS Admin role is required to
enable snapshots. |
Repeat |
How often you want the job to run |
Choices are weeks, days, hours, or minutes. For a Hive replication
policy, set the frequency so that changes are replicated often enough to
avoid overly large copies. |
Start and End Dates |
The dates you want the job to start (required) and end
(optional) |
If you do not set an end date, the job runs at the set time and
frequency until the job is manually cancelled. |
Start Time |
The time of day the job starts, entered in 24-hour clock format |
|
Queue Name (Optional) |
The YARN queue you want to use to prioritize job scheduling across
multiple tenants |
If no queue is entered, DLM defaults to the YARN queue identified in
the Ambari View for YARN Capacity Scheduler. You can enter one queue
name per policy. |
Maximum Bandwidth (Optional) |
The maximum bandwidth to be used when running a job based on this
policy |
Enables you to restrict the amount of data throughput to the
specified value. Enter a number in megabytes per second (MBps). |
Maximum Maps |
Sets the maximum number of map tasks (simultaneous copies) per
replication job. |
|
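
The constraints in the table above can be checked before a policy is submitted. The following is a minimal sketch only, not part of DLM: the `PolicyForm` class, its field names, and the `validate` function are hypothetical names introduced for illustration. It assumes that "special characters" means anything other than ASCII letters and digits, that an end date should not precede the start date, and that bandwidth and map counts should be positive; the table does not state these last points explicitly.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Assumption: letters and digits are allowed in addition to the spaces,
# dashes, and underscores listed in the table; maximum length is 64.
_NAME_RE = re.compile(r"[A-Za-z0-9 _-]{1,64}")

SERVICES = {"HDFS", "Hive"}
REPEAT_UNITS = {"weeks", "days", "hours", "minutes"}


@dataclass
class PolicyForm:
    """Hypothetical container mirroring the fields in the table above."""
    name: str
    service: str
    repeat_every: int
    repeat_unit: str
    start_date: date
    end_date: Optional[date] = None            # optional per the table
    queue_name: Optional[str] = None           # one YARN queue per policy
    max_bandwidth_mbps: Optional[int] = None   # megabytes per second
    max_maps: Optional[int] = None


def validate(form: PolicyForm) -> List[str]:
    """Return a list of problems; an empty list means the form looks valid."""
    errors = []
    if not _NAME_RE.fullmatch(form.name):
        errors.append("name: 1-64 characters; only letters, digits, "
                      "spaces, dashes, and underscores")
    if form.service not in SERVICES:
        errors.append("service: must be HDFS or Hive")
    if form.repeat_unit not in REPEAT_UNITS or form.repeat_every < 1:
        errors.append("repeat: a positive count of weeks, days, hours, "
                      "or minutes")
    # Assumption: an end date earlier than the start date is rejected.
    if form.end_date is not None and form.end_date < form.start_date:
        errors.append("end_date: must not precede start_date")
    # Assumption: numeric limits, when given, must be positive.
    for label, value in (("max_bandwidth_mbps", form.max_bandwidth_mbps),
                         ("max_maps", form.max_maps)):
        if value is not None and value <= 0:
            errors.append(f"{label}: must be a positive number")
    return errors
```

For example, `validate(PolicyForm("daily-sales_backup 01", "HDFS", 1, "days", date(2024, 1, 1)))` returns an empty list, while a name containing a slash or exceeding 64 characters produces a `name` error.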