8. DistCp Data Copy Matrix: HDP1/HDP2 to HDP2

The following table summarizes the configurations, settings, and results of using DistCp to copy data from HDP1 and HDP2 clusters to HDP2 clusters.

From     To       Source Configuration  Destination Configuration  DistCp Should Be Run On                 Result
-------  -------  --------------------  -------------------------  --------------------------------------  -----------------------------------
HDP 1.3  HDP 2.x  insecure + hdfs       insecure + webhdfs         HDP 1.3 (source)                        success
HDP 1.3  HDP 2.x  secure + hdfs         secure + webhdfs           HDP 1.3 (source)                        success
HDP 1.3  HDP 2.x  secure + hftp         secure + hdfs              HDP 2.x (destination)                   success
HDP 1.3  HDP 2.1  secure + hftp         secure + swebhdfs          HDP 2.1 (destination)                   success
HDP 1.3  HDP 2.x  secure + hdfs         insecure + webhdfs         HDP 1.3 (source)                        Possible issues are discussed here.
HDP 2.x  HDP 2.x  secure + hdfs         insecure + hdfs            secure HDP 2.x (source)                 success
HDP 2.x  HDP 2.x  secure + hdfs         secure + hdfs              either HDP 2.x (source or destination)  success
HDP 2.x  HDP 2.x  secure + hdfs         secure + webhdfs           HDP 2.x (source)                        success
HDP 2.x  HDP 2.x  secure + hftp         secure + hdfs              HDP 2.x (destination)                   success

For the above table:

  • The term "secure" means that Kerberos security is set up.

  • HDP 2.x means HDP 2.0 and HDP 2.1.

  • hsftp is available in both HDP 1.x and HDP 2.x. It adds HTTPS support to hftp.
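
As an illustration only, the matrix can be encoded as a lookup that reports which cluster should run DistCp and assembles the corresponding "hadoop distcp <source> <destination>" command line. This is a minimal sketch, not part of HDP: the names MATRIX, scheme_of, and plan_copy are hypothetical, the row labels are transcribed from the table, and the host names and ports in the usage note are placeholders that must be replaced with the values for your clusters.

```python
# Sketch only: rows transcribed from the matrix above. All names here
# (MATRIX, scheme_of, plan_copy) are hypothetical helpers, not HDP APIs.

# (from, to, source config, destination config) -> (where to run DistCp, result)
MATRIX = {
    ("HDP 1.3", "HDP 2.x", "insecure + hdfs", "insecure + webhdfs"): ("source", "success"),
    ("HDP 1.3", "HDP 2.x", "secure + hdfs", "secure + webhdfs"): ("source", "success"),
    ("HDP 1.3", "HDP 2.x", "secure + hftp", "secure + hdfs"): ("destination", "success"),
    ("HDP 1.3", "HDP 2.1", "secure + hftp", "secure + swebhdfs"): ("destination", "success"),
    ("HDP 1.3", "HDP 2.x", "secure + hdfs", "insecure + webhdfs"): ("source", "possible issues"),
    ("HDP 2.x", "HDP 2.x", "secure + hdfs", "insecure + hdfs"): ("source", "success"),
    ("HDP 2.x", "HDP 2.x", "secure + hdfs", "secure + hdfs"): ("either", "success"),
    ("HDP 2.x", "HDP 2.x", "secure + hdfs", "secure + webhdfs"): ("source", "success"),
    ("HDP 2.x", "HDP 2.x", "secure + hftp", "secure + hdfs"): ("destination", "success"),
}


def scheme_of(config):
    """Return the filesystem scheme from a label such as 'secure + hftp'."""
    return config.split(" + ")[1]


def plan_copy(src_ver, dst_ver, src_conf, dst_conf,
              src_authority, dst_authority, path):
    """Look up a matrix row and build the DistCp argument list.

    src_authority and dst_authority are 'host:port' strings supplied by the
    caller; this sketch does not assume any default ports.
    """
    key = (src_ver, dst_ver, src_conf, dst_conf)
    if key not in MATRIX:
        raise ValueError("combination not listed in the matrix: %r" % (key,))
    run_on, result = MATRIX[key]
    src_uri = "%s://%s%s" % (scheme_of(src_conf), src_authority, path)
    dst_uri = "%s://%s%s" % (scheme_of(dst_conf), dst_authority, path)
    return run_on, result, ["hadoop", "distcp", src_uri, dst_uri]
```

For example, plan_copy("HDP 1.3", "HDP 2.x", "secure + hftp", "secure + hdfs", "nn1.example.com:50070", "nn2.example.com:8020", "/data") looks up the third row of the table, so the returned command should be run on the destination cluster. A combination that does not appear in the table raises ValueError rather than guessing.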