HDFS cloud replication

Replication Manager supports replication of HDFS data from a cluster to cloud storage and vice versa. The replication policy runs on the cluster and either pushes the data to cloud storage or pulls it from cloud storage.

The cluster can be an on-premises or IaaS cluster with data on local HDFS. The cluster requires the HDFS, YARN, Ranger, Knox, and DLM Engine services.
  • Atlas entities related to the HDFS directory are replicated. If no HDFS path entities are present in Atlas, they are first created and then exported.