Security for Kudu replication

Learn about configuring Kerberos authentication, Ranger authorization, and classloader isolation for the Kudu replication job.

Kerberos authentication

The Kudu Java client negotiates Kerberos authentication automatically by using the ambient Kerberos ticket cache through GSSAPI or JAAS. The replication job configuration contains no explicit keytab or principal properties.

When you run the job on YARN, the following requirements and behaviors apply:

  1. Ensure a valid Kerberos ticket exists for the submitting user on the node where you run the flink run command.
  2. YARN propagates Hadoop delegation tokens to the JobManager and TaskManager containers automatically.
  3. For long-running jobs where the environment cannot guarantee ticket renewal, you can use the built-in Flink keytab support. To provide credentials explicitly, pass the following -D flags at submission time:
    flink run-application -t yarn-application \
      -Dsecurity.kerberos.login.keytab=/path/to/user.keytab \
      -Dsecurity.kerberos.login.principal=user@REALM \
      -Dclassloader.parent-first-patterns.additional=org.apache.kudu \
      -c org.apache.kudu.replication.ReplicationJob \
      kudu-replication-<version>.jar
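
The first requirement above can be verified before submission. The following is a minimal sketch, assuming the MIT Kerberos klist utility is available on the submitting node. The function names has_valid_ticket and preflight and the KLIST_CMD override are hypothetical helpers for illustration and are not part of Kudu or Flink.

```shell
#!/bin/sh
# Pre-flight check: confirm a valid Kerberos ticket before submitting the job.
# KLIST_CMD is a hypothetical override that makes the check testable;
# when it is unset, the standard klist utility is used.

has_valid_ticket() {
  # With MIT Kerberos, "klist -s" prints nothing and exits with status 0
  # only if the credential cache can be read and has not expired.
  "${KLIST_CMD:-klist}" -s
}

preflight() {
  if has_valid_ticket; then
    echo "Kerberos ticket found; the job can be submitted."
  else
    echo "No valid Kerberos ticket; run kinit or pass the keytab -D flags." >&2
    return 1
  fi
}
```

One way to use it is to chain the check with the submission, for example preflight && flink run-application ..., so that the job is not submitted when the ticket cache is missing or expired.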
    

Authorization using Ranger

If the Kudu cluster uses Ranger integration, the user submitting the replication job must have the following privileges:

Resource | Required privilege
Source table | select, metadata
Sink database | create (only if job.createTable=true)
Sink table | all

Sink cluster protection

Cloudera recommends restricting write access on the sink cluster so that only the replication service user can write to replicated tables. This restriction prevents accidental data manipulation on the disaster recovery (DR) target and avoids data divergence between the source and sink.

The following table shows the recommended Ranger policy setup for the sink cluster:

Principal | Resource | Privilege | Rationale
Replication service user | Sink tables | all | Enables the replication job to perform upsert and delete operations.
All other users | Sink tables | select, metadata | Enables read-only queries without the risk of accidental writes.
All other users | Sink tables | None | Explicitly denies or omits insert, update, and delete privileges.

Classloader isolation

To avoid classloader conflicts between the Kudu client and the Flink child-first classloader, you must pass the following argument during job submission:

-Dclassloader.parent-first-patterns.additional=org.apache.kudu

This property ensures that the same parent classloader loads both the Kudu classes and the Hadoop security classes, which prevents ClassCastException errors during Kerberos negotiation.
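
Because the classloader flag is required on every submission while the keytab flags are optional, a small wrapper can help avoid forgetting it. The following is a sketch only: build_security_flags is a hypothetical helper name, and the flag values are the ones shown in this topic.

```shell
#!/bin/sh
# Hypothetical helper: prints the -D flags for a submission, one per line.
# The classloader flag is always emitted; the keytab flags are emitted
# only when both a keytab path and a principal are passed as arguments.
build_security_flags() {
  keytab="$1"
  principal="$2"
  printf '%s\n' "-Dclassloader.parent-first-patterns.additional=org.apache.kudu"
  if [ -n "$keytab" ] && [ -n "$principal" ]; then
    printf '%s\n' "-Dsecurity.kerberos.login.keytab=$keytab"
    printf '%s\n' "-Dsecurity.kerberos.login.principal=$principal"
  fi
}
```

The output can be spliced into the submission command with command substitution, for example flink run-application -t yarn-application $(build_security_flags /path/to/user.keytab user@REALM) followed by the remaining arguments. This relies on shell word splitting, so it assumes that the keytab path contains no spaces.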