Mirroring Data (Falcon)
Mirroring data produces an exact copy of the data and keeps both copies synchronized. You can use Falcon to mirror HDFS directories or Hive tables, and you can mirror between HDFS and Amazon S3 or Microsoft Azure. With Hive, you can also replicate an entire database.
To mirror data with the Falcon web UI:
Launch the Falcon web UI. If you are using Ambari:
On the Services tab, select Falcon in the services list.
At the top of the Falcon service page, click Quick Links, and then click Falcon Web UI.
At the top of the Falcon web UI page, click Mirror.
On the New Mirror page, specify the following values:
Table 2.4. Mirror Configuration Values

| Value | Description |
|---|---|
| Mirror Name | Name of the mirror entity. |
| Tags | Metadata tagging. An example is provided in the UI. |
| Mirror Type | Select the mirror type: File System or Hive catalog. |
| Source | Specify the location, name, and path of the cluster or Hive table to be mirrored, and specify whether the mirroring job runs on the source cluster. |
| Target | Specify the location, name, and path where the mirrored data is stored, and specify whether the mirroring job runs on the target cluster. |
| Validity | Specify the validity interval. |
| Advanced Options | Expand the Advanced Options section of the page to configure how often the target cluster is updated, throttle distcp operations, set a retry policy, and specify the ACL for the mirror entity. |
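Before filling in the form, it can help to collect the values from Table 2.4 in one place and check them for obvious mistakes. The following Python sketch does this with a hypothetical `MirrorSpec` class. The class, its field names, the accepted mirror type strings, and the example values are illustrative assumptions; none of them are part of Falcon or its API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class MirrorSpec:
    """Hypothetical checklist of the New Mirror form values (not a Falcon API)."""
    name: str
    mirror_type: str            # assumed labels: "File System" or "Hive"
    source: str                 # location, cluster name, and path (or Hive table)
    target: str                 # where the mirrored copy is stored
    validity_start: datetime
    validity_end: datetime
    tags: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the values look consistent."""
        found = []
        if not self.name.strip():
            found.append("Mirror Name is empty")
        if self.mirror_type not in ("File System", "Hive"):
            found.append("Mirror Type must be 'File System' or 'Hive'")
        if not self.source.strip() or not self.target.strip():
            found.append("Source and Target are both required")
        if self.validity_start >= self.validity_end:
            found.append("Validity start must be earlier than validity end")
        return found


# Example values are made up for illustration.
spec = MirrorSpec(
    name="sales-mirror",
    mirror_type="File System",
    source="primaryCluster:/data/sales",
    target="backupCluster:/data/sales",
    validity_start=datetime(2016, 1, 1),
    validity_end=datetime(2016, 12, 31),
)
print(spec.problems())  # prints []
```

The check only catches inconsistencies that are visible from the values themselves, such as a validity interval that ends before it starts. It does not confirm that the clusters or paths exist.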
Click Next to view a summary of your mirror entity definition.
If you are satisfied with the mirror entity definition, click Save.
To verify that the mirror entity was created, enter its name in the Search field of the Falcon web UI and press Enter. If the name appears in the search results, the entity was created successfully. See Search For and Manage Data Pipeline Entities.
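The same check can in principle be scripted against the Falcon REST API instead of the web UI. The Python sketch below builds a status URL for a named entity and, optionally, fetches it. It rests on several assumptions that you should confirm against the REST documentation for your Falcon version: that the server exposes `api/entities/status/<type>/<name>`, that simple authentication accepts a `user.name` query parameter, and that the mirror is stored as a `process` entity. The host, port, entity name, and user shown are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def entity_status_url(base_url: str, entity_name: str, user: str,
                      entity_type: str = "process") -> str:
    """Build the status URL for one entity.

    Assumptions: the path layout api/entities/status/<type>/<name> and the
    user.name parameter match your Falcon version, and the mirror was
    saved as a "process" entity.
    """
    path = "/api/entities/status/{}/{}".format(
        quote(entity_type, safe=""), quote(entity_name, safe="")
    )
    return base_url.rstrip("/") + path + "?" + urlencode({"user.name": user})


def fetch_status(url: str, timeout: float = 10.0) -> str:
    """Return the raw response body.

    urlopen raises urllib.error.HTTPError when the server answers with an
    HTTP error status, for example when the entity is not known.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical host, entity name, and user; no request is sent here.
    url = entity_status_url("http://falcon.example.com:15000", "sales-mirror", "ambari-qa")
    print(url)
```

Calling `fetch_status(url)` sends the request. It is kept separate so that the URL can be inspected first, and so that a secured cluster can substitute its own authentication.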