Enable Snapshots on Directories
You can use snapshot-enabled (snapshottable) HDFS directories for replication on the source and destination clusters. You must manually enable the directories for snapshots. Note that you do not need to create these directories for Apache Hive.
See Replication Concepts for an overview of how snapshots work and considerations when implementing snapshots.
About This Task
You must have HDFS superuser access on the source and destination clusters to perform this task.
After you create the snapshot-enabled directories and begin replication jobs, snapshots are created on the source and destination clusters. The snapshots are maintained in directories named .snapshot.
The three most recent snapshots are retained by default. Older snapshots are automatically deleted.
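To check which snapshots currently exist, you can list the .snapshot directory under a snapshot-enabled directory. The following is a minimal sketch; the function names snapshot_path and list_snapshots are illustrative helpers, not part of HDFS or DLM, and the path in the usage comment is hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names, not part of HDFS or DLM).

# Print the location of the .snapshot directory for a snapshot-enabled
# HDFS directory. A trailing slash on the argument is dropped first.
snapshot_path() {
  printf '%s/.snapshot\n' "${1%/}"
}

# List the snapshots stored under a snapshot-enabled directory.
# Requires the hdfs client and read access to the directory.
list_snapshots() {
  hdfs dfs -ls "$(snapshot_path "$1")"
}

# Usage (hypothetical path):
#   list_snapshots /data/replication
```

With the default retention described above, the listing should show at most three snapshot entries once replication jobs have run several times.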
Steps
As HDFS superuser, log in to a terminal on the source cluster.
Create a directory in which to store snapshots on the source:
$ hdfs dfs -mkdir <source directory path>
Set the source directory permissions to read/write/execute (777) for everyone:
$ hdfs dfs -chmod -R 777 <source directory path>
Enable snapshots on the source directory:
$ hdfs dfsadmin -allowSnapshot <source directory path>
If the Beacon (DLM Engine) user is not configured as an HDFS superuser, give ownership of the directory to the beacon user:
$ hdfs dfs -chown -R beacon <source directory path>
As HDFS superuser, log in to a console on the destination cluster and repeat Step 2 through Step 5 to create a target directory to store snapshots on the destination.
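Because the same commands run on both clusters, the steps above can be collected into one small script. This is a sketch only: the names run, prepare_snapshot_dir, and DRY_RUN are illustrative and not part of HDFS or DLM, and the path in the usage comment is hypothetical. Setting DRY_RUN=1 prints each command instead of running it, which is useful for reviewing what will be executed.

```shell
#!/usr/bin/env bash
# Sketch: prepare a snapshot-enabled HDFS directory for replication.
# Run as the HDFS superuser on the source cluster, then on the destination.
# run, prepare_snapshot_dir, and DRY_RUN are hypothetical names.

# Execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# $1: HDFS directory path
# $2: optional owner (for example beacon, when the DLM Engine user
#     is not an HDFS superuser)
prepare_snapshot_dir() {
  local dir="$1"
  local owner="${2:-}"

  run hdfs dfs -mkdir "$dir" || return 1              # create the directory
  run hdfs dfs -chmod -R 777 "$dir" || return 1       # read/write/execute for everyone
  run hdfs dfsadmin -allowSnapshot "$dir" || return 1 # enable snapshots
  if [ -n "$owner" ]; then
    run hdfs dfs -chown -R "$owner" "$dir" || return 1 # hand ownership to the given user
  fi
}

# Usage (hypothetical path):
#   DRY_RUN=1 prepare_snapshot_dir /data/replication beacon
#   prepare_snapshot_dir /data/replication beacon
```

Omit the second argument when the Beacon user is already configured as an HDFS superuser, in which case the ownership change is skipped.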