HDFS data migration from HDP to Cloudera
Use the DistCp tool to migrate the data.
An example DistCp command:
$ sudo -u hdfs hadoop distcp \
    -Dfs.s3a.access.key=########### \
    -Dfs.s3a.secret.key=######### \
    -Dfs.s3a.server-side-encryption-algorithm=SSE-KMS \
    -Dfs.s3a.server-side-encryption.key=arn:aws:kms:<region>:########### \
    /data/datafiles/ \
    s3a://demo2-cdp-private-default-saas/cdp-saas/hdp-data/
Additional configuration:
In addition to the default DistCp settings, such as the number of mappers and the bandwidth per mapper, the following AWS S3-specific configurations are recommended:
- fs.s3a.fast.upload=true: speeds up the data transfer.
- fs.s3a.proxy.*: sets up the S3 proxy server configuration.
- fs.s3a.connection.ssl.enabled=true: enables SSL connections to the S3 endpoint.
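
As a rough sketch, the settings above might be combined into a single DistCp invocation as shown here. The mapper count, bandwidth, and proxy host and port are placeholder values assumed for illustration and do not come from this guide. The script only prints the assembled command so that it can be reviewed before it is run.

```shell
#!/usr/bin/env bash
# Sketch: assemble a DistCp command with the recommended S3A settings.
# Values marked "hypothetical" are assumptions for illustration only.
set -euo pipefail

SRC="/data/datafiles/"
DEST="s3a://demo2-cdp-private-default-saas/cdp-saas/hdp-data/"
MAPPERS=20                      # hypothetical: maximum number of map tasks (-m)
BANDWIDTH_MB=100                # hypothetical: MB/s per mapper (-bandwidth)
PROXY_HOST="proxy.example.com"  # hypothetical proxy host
PROXY_PORT=8080                 # hypothetical proxy port

# Build the command as an array so each argument stays a single word.
DISTCP_CMD=(
  hadoop distcp
  -Dfs.s3a.fast.upload=true
  -Dfs.s3a.connection.ssl.enabled=true
  -Dfs.s3a.proxy.host="${PROXY_HOST}"
  -Dfs.s3a.proxy.port="${PROXY_PORT}"
  -m "${MAPPERS}"
  -bandwidth "${BANDWIDTH_MB}"
  "${SRC}" "${DEST}"
)

# Dry run: print the command instead of executing it.
printf '%s ' "${DISTCP_CMD[@]}"; echo
```

To run the copy after reviewing the output, execute the array as the hdfs user, for example with sudo -u hdfs "${DISTCP_CMD[@]}". The credential and encryption options from the example command can be added to the array in the same way.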