Kerberos setup guidelines for Distcp between secure clusters (without cross-realm authentication)
- You have two clusters with the realms SOURCE and DESTINATION.
- You have data that needs to be copied from SOURCE to DESTINATION.
- Trust exists between SOURCE and Active Directory, and between DESTINATION and Active Directory.
- Both the SOURCE and DESTINATION clusters are running CDH 5.3.4 or higher.
If your environment matches the one described above, use the following table to configure
Kerberos delegation tokens on your cluster so that distcp jobs can run successfully across
the two secure clusters. Depending on the direction of the trust between the SOURCE and
DESTINATION clusters, you can use the mapreduce.job.hdfs-servers.token-renewal.exclude
property to instruct the ResourceManagers on either cluster to skip or perform delegation
token renewal for NameNode hosts.
Environment Type | Job Location | Kerberos Delegation Token Setting |
---|---|---|
SOURCE trusts DESTINATION | Distcp job runs on the DESTINATION cluster | You do not need to set the mapreduce.job.hdfs-servers.token-renewal.exclude property. |
SOURCE trusts DESTINATION | Distcp job runs on the SOURCE cluster | Set the mapreduce.job.hdfs-servers.token-renewal.exclude property to a comma-separated list of the hostnames of the NameNodes of the DESTINATION cluster. |
DESTINATION trusts SOURCE | Distcp job runs on the DESTINATION cluster | Set the mapreduce.job.hdfs-servers.token-renewal.exclude property to a comma-separated list of the hostnames of the NameNodes of the SOURCE cluster. |
DESTINATION trusts SOURCE | Distcp job runs on the SOURCE cluster | You do not need to set the mapreduce.job.hdfs-servers.token-renewal.exclude property. |
Both SOURCE and DESTINATION trust each other | Distcp job runs on either cluster | You do not need to set the mapreduce.job.hdfs-servers.token-renewal.exclude property. |
Neither SOURCE nor DESTINATION trusts the other | Distcp job runs on either cluster | If a common realm is usable (such as Active Directory), set the mapreduce.job.hdfs-servers.token-renewal.exclude property to a comma-separated list of the hostnames of the NameNodes of the cluster that is not running the distcp job. For example, if you are running the job on the DESTINATION cluster, list the hostnames of the NameNodes of the SOURCE cluster. |
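
The rules in the table above can be sketched as a small Python helper that picks the NameNode hostnames to exclude and builds a matching distcp command line. This is only an illustration, not part of the product: the function names, the trust labels (`source_trusts_destination`, `destination_trusts_source`, `mutual`, `none`), and the `example.com` hostnames and paths are all hypothetical. It also assumes the property is passed to the job as a `-D` generic option; setting it in the cluster's MapReduce configuration is an alternative that is not shown here.

```python
from typing import List

PROPERTY = "mapreduce.job.hdfs-servers.token-renewal.exclude"

# Hypothetical labels for the four trust situations listed in the table.
TRUST_TYPES = {
    "source_trusts_destination",
    "destination_trusts_source",
    "mutual",
    "none",
}


def namenodes_to_exclude(trust: str, job_cluster: str,
                         source_namenodes: List[str],
                         destination_namenodes: List[str]) -> List[str]:
    """Return the NameNode hostnames to put in the exclude property.

    An empty list means the property does not need to be set.
    """
    if trust not in TRUST_TYPES:
        raise ValueError(f"unknown trust type: {trust}")
    if job_cluster not in ("SOURCE", "DESTINATION"):
        raise ValueError(f"unknown cluster: {job_cluster}")

    # NameNodes of the cluster that is NOT running the distcp job.
    remote = destination_namenodes if job_cluster == "SOURCE" else source_namenodes

    if trust == "mutual":
        return []
    if trust == "source_trusts_destination":
        return list(remote) if job_cluster == "SOURCE" else []
    if trust == "destination_trusts_source":
        return list(remote) if job_cluster == "DESTINATION" else []
    # trust == "none": assumes a common realm (such as Active Directory) is usable.
    return list(remote)


def distcp_command(exclude: List[str], source_uri: str,
                   destination_uri: str) -> List[str]:
    """Build a distcp argument list, adding the property only when needed."""
    cmd = ["hadoop", "distcp"]
    if exclude:
        cmd.append(f"-D{PROPERTY}={','.join(exclude)}")
    cmd += [source_uri, destination_uri]
    return cmd


if __name__ == "__main__":
    # Hypothetical hostnames and paths, for illustration only.
    source_nns = ["nn1.source.example.com", "nn2.source.example.com"]
    destination_nns = ["nn1.destination.example.com"]

    # Neither cluster trusts the other; the job runs on DESTINATION.
    exclude = namenodes_to_exclude("none", "DESTINATION", source_nns, destination_nns)
    print(" ".join(distcp_command(
        exclude,
        "hdfs://nn1.source.example.com:8020/data",
        "hdfs://nn1.destination.example.com:8020/data",
    )))
```

In this sketch, the last case prints a command whose exclude list contains the two SOURCE NameNodes, which matches the final row of the table for a job running on the DESTINATION cluster.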