Enabling Replication Between Clusters in Different Kerberos Realms

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

To enable replication between clusters that reside in different Kerberos realms, additional setup steps are required to ensure that the source and destination clusters can communicate.

Ports

When Kerberos authentication is enabled, BDR requires all the ports listed in Port Requirements for Backup and Disaster Recovery.

Additionally, the port used for the Kerberos KDC Server and KRB5 services must be open to all hosts on the destination cluster. By default, this is port 88.
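
The port requirement above can be spot-checked from a destination host before you run a replication job. The following is a minimal sketch, not part of Cloudera Manager: the function name kdc_port_reachable is hypothetical, and it tests TCP connectivity only, so it does not prove that UDP traffic to the KDC is also allowed.

```python
import socket


def kdc_port_reachable(host: str, port: int = 88, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration. Port 88 is the default KDC port
    named in this topic; pass a different port if your KDC uses one.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling kdc_port_reachable("src-kdc-1.src.myco.com") on each destination host returns False if a firewall blocks port 88 or the host name does not resolve.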

Considerations for Realm Names

If the source and destination clusters each use Kerberos for authentication, use one of the following configurations to prevent conflicts when running replication jobs:
  • If the clusters do not use the same KDC (Kerberos Key Distribution Center), Cloudera recommends that you use a different realm name for each cluster. Additionally, if you are replicating across clusters in two different realms, see the steps for HDFS and Hive replication later in this topic to set up trust between those clusters.
  • You can use the same realm name if the clusters use the same KDC or different KDCs that are part of a unified realm, for example where one KDC is the master and the other is a slave KDC.

HDFS Replication

  1. On the hosts in the destination cluster, ensure that the krb5.conf file (typically located at /etc/krb5.conf) on each host has the following information:
    • The kdc information for the source cluster's Kerberos realm. For example:
      [realms]
       SRC.MYCO.COM = {
        kdc = src-kdc-1.src.myco.com:88
        admin_server = src-kdc-1.src.myco.com:749
        default_domain = src.myco.com
       }
       DEST.MYCO.COM = {
        kdc = dest-kdc-1.dest.myco.com:88
        admin_server = dest-kdc-1.dest.myco.com:749
        default_domain = dest.myco.com
       }
    • Domain/host-to-realm mapping for the source cluster NameNode hosts. You configure these mappings in the [domain_realm] section. For example, to map the realms SRC.MYCO.COM and DEST.MYCO.COM to the domains of hosts named hostname.src.myco.com and hostname.dest.myco.com, add the following mappings to the krb5.conf file:
      [domain_realm]
       .src.myco.com = SRC.MYCO.COM
       src.myco.com = SRC.MYCO.COM
       .dest.myco.com = DEST.MYCO.COM
       dest.myco.com = DEST.MYCO.COM
  2. On the destination cluster, use Cloudera Manager to add the realm of the source cluster to the Trusted Kerberos Realms configuration property:
    1. Go to the HDFS service.
    2. Click the Configuration tab.
    3. In the search field, type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
    4. Enter the source cluster realm.
    5. Click Save Changes to commit the changes.
  3. If your Cloudera Manager release is 5.0.1 or lower, restart the JobTracker to enable it to pick up the new Trusted Kerberos Realm settings. Failure to restart the JobTracker prior to the first replication attempt may cause the JobTracker to fail.
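
The krb5.conf requirements in step 1 can be checked with a short script. The sketch below is illustrative only: parse_krb5_sections and missing_source_entries are hypothetical names, and the parser handles only [section] headers, key = value lines, and one level of { ... } blocks, so it is not a complete krb5.conf parser (for example, it keeps only the last value when a key such as kdc is repeated).

```python
import re


def parse_krb5_sections(text: str) -> dict:
    """Parse krb5.conf text into {section: {key: value or nested dict}}."""
    sections = {}
    current = None  # dict for the active [section]
    block = None    # dict for the active '{ ... }' block, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        header = re.fullmatch(r"\[(\w+)\]", line)
        if header:
            current = sections.setdefault(header.group(1), {})
            block = None
            continue
        if line == "}":
            block = None  # end of a realm block
            continue
        if current is None or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        target = block if block is not None else current
        if value == "{":
            block = current.setdefault(key, {})  # start of a realm block
        else:
            target[key] = value
    return sections


def missing_source_entries(text: str, realm: str, domain: str) -> list:
    """List what the krb5.conf text lacks for the given realm and domain."""
    conf = parse_krb5_sections(text)
    problems = []
    realm_block = conf.get("realms", {}).get(realm)
    if not isinstance(realm_block, dict) or "kdc" not in realm_block:
        problems.append(f"[realms] has no kdc entry for {realm}")
    mappings = conf.get("domain_realm", {})
    for name in ("." + domain, domain):
        if mappings.get(name) != realm:
            problems.append(f"[domain_realm] does not map {name} to {realm}")
    return problems
```

On a destination host, you might read /etc/krb5.conf into a string and call missing_source_entries(text, "SRC.MYCO.COM", "src.myco.com"). An empty list means that both the KDC entry and the domain mappings for the source realm are present.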

Hive/Impala Replication

  1. Perform the procedure described in HDFS Replication, including restarting the JobTracker if your Cloudera Manager release requires it.
  2. On the hosts in the source cluster, ensure that the krb5.conf file on each host has the following information:
    • The kdc information for the destination cluster's Kerberos realm.
    • Domain/host-to-realm mapping for the destination cluster NameNode hosts.
  3. On the source cluster, use Cloudera Manager to add the realm of the destination cluster to the Trusted Kerberos Realms configuration property:
    1. Go to the HDFS service.
    2. Click the Configuration tab.
    3. In the search field, type "Trusted Kerberos" to find the Trusted Kerberos Realms property.
    4. Enter the destination cluster realm.
    5. Click Save Changes to commit the changes.

    It is not necessary to restart any services on the source cluster.
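
The krb5.conf entries that step 2 asks for on the source hosts mirror the HDFS Replication example, with the destination realm in place of the source realm. As a rough sketch, the hypothetical helper below renders those lines for one realm. It assumes the KDC and admin server run on the same host and use ports 88 and 749, as in the example earlier in this topic; adjust the values for your environment.

```python
def krb5_fragment(realm: str, kdc_host: str, domain: str,
                  kdc_port: int = 88, admin_port: int = 749) -> str:
    """Build [realms] and [domain_realm] text for a single realm.

    Hypothetical helper for illustration. The output is meant to be merged
    by hand into the matching sections of an existing krb5.conf file.
    """
    return "\n".join([
        "[realms]",
        f" {realm} = {{",
        f"  kdc = {kdc_host}:{kdc_port}",
        f"  admin_server = {kdc_host}:{admin_port}",
        f"  default_domain = {domain}",
        " }",
        "[domain_realm]",
        f" .{domain} = {realm}",
        f" {domain} = {realm}",
    ])
```

For example, print(krb5_fragment("DEST.MYCO.COM", "dest-kdc-1.dest.myco.com", "dest.myco.com")) produces the destination realm block and domain mappings to add on each source cluster host.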