3.1. HDFS Behavioral Changes

HDFS-6376: distcp can now copy data between HA clusters. Use the new configuration property dfs.internal.nameservices to explicitly specify the name services that belong to the local cluster, and continue to use dfs.nameservices to specify all the name services in both the local and remote clusters.

For example, to run distcp between two clusters, c1 and c2, that both run in HA mode, make the following configuration changes on the cluster that runs the distcp job:

  1. Set dfs.nameservices to both c1 and c2.

  2. Modify the configuration to include the settings for both clusters. See the comments in HDFS-6376 for more details.

  3. Set dfs.internal.nameservices to the name service ID of the cluster that runs the distcp job.
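
The three steps above can be sketched as the following hdfs-site.xml fragment. This is an illustration only, not a complete configuration: it assumes the distcp job runs on c1, and the NameNode IDs (nn1, nn2), host names (c2-nn1.example.com, c2-nn2.example.com), and port 8020 are hypothetical placeholders. Only the settings for the remote cluster c2 are shown; the equivalent HA settings for c1 are assumed to be present already.

```xml
<!-- Step 1: list the name services of both the local and the remote cluster. -->
<property>
  <name>dfs.nameservices</name>
  <value>c1,c2</value>
</property>

<!-- Step 3: mark c1 as the name service of the local cluster
     (assumption: the distcp job runs on c1). -->
<property>
  <name>dfs.internal.nameservices</name>
  <value>c1</value>
</property>

<!-- Step 2: HA client settings for the remote cluster c2.
     NameNode IDs, host names, and ports below are hypothetical. -->
<property>
  <name>dfs.ha.namenodes.c2</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.c2.nn1</name>
  <value>c2-nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.c2.nn2</name>
  <value>c2-nn2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.c2</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

With settings along these lines, a copy between the two clusters can refer to each one by its name service ID rather than by a specific NameNode host, for example: hadoop distcp hdfs://c1/<source-path> hdfs://c2/<target-path>, where the paths are placeholders.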