Before you create a Hive ACID table replication policy, you must prepare the clusters
for replication.
To perform Hive ACID table replication using Replication Manager, Cloudera Manager Server must manage the target cluster. You can
use the same server or a peer Cloudera Manager Server to manage
the source cluster. Hive ACID table replication policies use the Hive scheduler to
control how often replication policy jobs run.
- Replication Manager requires a valid license. To understand more about Cloudera
license requirements, see Managing Licenses.
- Minimum required role: Replication Administrator or Full
Administrator.
- Before you create replication policies, ensure that the source cluster and
target cluster are supported by Replication Manager. For information about
the clusters and replication scenarios that Replication Manager supports,
see Support matrix for Cloudera Base on premises
Replication Manager.
- Before you use a source cluster that is the target for another replication job,
reset the repl.target.for database property for the source database by running the
ALTER DATABASE [***DATABASE NAME***] SET DBPROPERTIES('repl.target.for'='');
statement.
- Set up a two-way trust between the Cloudera Private Cloud Base
clusters. For more information, see Configure two-way trust between clusters.
- Configure a peer relationship only if the source cluster is managed
by a different Cloudera Manager server than the target
cluster. For more information, see Configuring a peer
relationship.
- Ensure that the hive user and the hive group have
0755 permissions on the staging location if the target cluster uses Dell EMC
Isilon storage.
- Configure the hive.repl.cm.enabled=true key-value pair
on the source cluster for the following services to turn on the
ChangeManager:
| Service | Action |
| --- | --- |
| Hive-on-Tez (for example, Hive-on-Tez-1) | On the Configuration tab, search for the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml property and set the key-value pair. |
| Hive (for example, Hive-1) | On the Configuration tab, search for the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml property and set the key-value pair. |
| Hive (for example, Hive-1) | On the Configuration tab, search for the Enable ChangeManager for Hive replication parameter and select it. |
- Configure Hive configuration parameters for Hive ACID tables. For
more information, see Advanced Hive configuration parameters for Hive ACID table replication policies.
- Ensure that you enable HDFS trash before you create Hive ACID table replication
policies. For more information about HDFS trash, see Configuring HDFS trash.
- Enable the Hive ACID table replication feature flag on the source and
target cluster.
For more information, contact your Cloudera account team.
- Ensure that you disable compaction on the target
cluster.
- Complete the following steps if LDAP authorization is enabled:
  - Go to the tab.
  - Choose Enable LDAP Authentication for
  HiveServer2.
  - Enter the LDAP URL in the
  ldap[s]://[***HOST***]:[***PORT***]
  format.
  - Enter the base LDAP distinguished name (DN) for the LDAP server in
  LDAP BaseDN. For example,
  ou=dev, dc=xyz.
  - Restart the service.
  - Enter the LDAP username in the
  config.ldap.repl.user.display_name (or
  hiveserver2_ldap_replication_user)
  property.
  - Enter the LDAP password in the
  config-ldap-repl.password.display_name (or
  hiveserver2_ldap_replication_password)
  property.
  - Save the configuration.
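
Two of the prerequisites above involve values that are easy to mistype: the ALTER DATABASE statement that resets repl.target.for, and the LDAP URL in the ldap[s]://[***HOST***]:[***PORT***] format. The following Python sketch shows one way to build the statement and check the URL shape before you enter them. The function names, the example database name, and the example host are hypothetical and are not part of Replication Manager or Cloudera Manager. The rule that a database name contains only letters, digits, and underscores is an assumption made for this sketch.

```python
import re
from urllib.parse import urlsplit


def reset_repl_target_statement(database_name: str) -> str:
    """Build the statement that clears repl.target.for on a database.

    Hypothetical helper: it only formats the statement from this page,
    it does not connect to Hive or run anything.
    """
    # Assumption for this sketch: names are letters, digits, underscores.
    if not re.fullmatch(r"[A-Za-z0-9_]+", database_name):
        raise ValueError(f"unexpected database name: {database_name!r}")
    return (
        f"ALTER DATABASE {database_name} "
        "SET DBPROPERTIES('repl.target.for'='');"
    )


def is_valid_ldap_url(url: str) -> bool:
    """Return True if url has the shape ldap[s]://HOST:PORT."""
    parts = urlsplit(url)
    if parts.scheme not in ("ldap", "ldaps"):
        return False
    try:
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    # Both a host and an explicit port are required by the format.
    return bool(parts.hostname) and port is not None


if __name__ == "__main__":
    # Example values only; replace them with your own.
    print(reset_repl_target_statement("sales_db"))
    print(is_valid_ldap_url("ldaps://ldap.example.com:636"))
```

The check covers only the URL shape. It does not confirm that the LDAP server is reachable or that the base DN is correct.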