HBase replication policy definition JSON file

The HBase replication policy definition JSON file contains all the parameters required to create an HBase replication policy.

Parameters in HBase replication policy definition JSON file

The following table lists the parameters that you can specify in the policy definition JSON file to create an HBase replication policy, and whether each parameter is required:

Parameter Description Required?
clusterCrn Provide the target cluster CRN. Replication Manager saves the replication policy on the cluster that the specified CRN identifies. Required
policyName Provide a unique name for the replication policy. Required
policyDefinition Provide the policy definition parameters as required. Required
hbasePolicyArguments Provide the HBase replication parameters. Required
tables Provide the tables to be replicated. Each entry must be in the "namespace:tablename" format. Optional
credentialLocation* Provide SAFETY_VALVE if the credentials are available in the advanced configuration snippets (safety valves) on the source cluster, or provide EXTERNAL_ACCOUNT if the credentials are to be defined in the cloudCredential property.

Default is EXTERNAL_ACCOUNT.

Optional
cloudCredential Provide the cloud credentials to use to replicate the initial snapshot if the credentials are defined in an external account. Optional
validateReplicationSetup* Provide true to validate the replication setup after policy creation. Otherwise, provide false. Optional
forceSetup* Provide true to force the first-time setup when one of the clusters is already paired with another cluster.

Forced setup is only possible if there are no HBase replication policies between the pair of clusters or if the other cluster in the pair is currently unreachable.

During the forced setup, Replication Manager clears the existing pairing for the selected source or target cluster and initiates the first-time setup with the newly chosen source or destination cluster.

Optional
databaseArguments Provide one of the following values for the replicationStrategy parameter:
  • ALL_TABLES
  • TABLES_WITH_REPLICATION_SCOPE_SET

For more information about replication scope, see Creating HBase replication policy.

Optional
sourceCluster Provide the name of the source cluster in the "dataCenterName$cluster_name" format. For example, "DC-Europe$My Source 42". Required
targetCluster Provide the name of the target cluster in the "dataCenterName$cluster_name" format. For example, "DC-US$My Destination 17". Required
initialSnapshot Provide true to replicate the existing data in the table. When you provide false, Replication Manager replicates only the data generated after the policy creation. Required
exportSnapshotUser Provide a custom username. Replication Manager uses the specified username to export the initial snapshots to the target. If you use an IDBroker topology-based credential, you must map the Kerberos username to an AWS role. For more information about user mapping, see Creating HBase replication policy. Required
description Provide a brief description of the replication policy. Optional
machineUser The credentials of the machine user that Replication Manager uses to run HBase replication policies.

Provide the following parameters for the machine user:

  • user
  • password
  • createUser - Provide true to create a new machine user. If you provide false, ensure that the username you provide exists in the Cloudera User Management System (UMS); otherwise, an error message appears.

If you do not provide the machine user details, an HBase replication machine user is created automatically with an auto-generated username.

Optional
queueName Provide a YARN queue name to use for the initial snapshot operation.

Default is default.

Optional
distcpMaxMaps Provide the maximum number of map tasks to use for the initial snapshot operation. Optional
distcpMapBandwidth* Adjust the setting so that each map task is throttled to consume only the specified bandwidth.

Default is 100 MB.

Optional
sourceRestartType Provide RESTART or ROLLING_RESTART to restart the HBase service on the source cluster.

Default is ROLLING_RESTART.

Optional
targetRestartType Provide RESTART or ROLLING_RESTART to restart the HBase service on the destination cluster.

Default is ROLLING_RESTART.

Optional
*The option is a technical preview feature and is not ready for production deployment. The components are provided ‘as is’ without warranty or support. Further, Cloudera assumes no liability for the use of preview components, which should be used by customers at their own risk. For more information, contact your Cloudera account team.
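The Required column and the quoted value formats in the table can be checked mechanically before a file is submitted. The following Python sketch shows one possible pre-flight check. It is not part of Replication Manager: the function name find_problems and both regular expressions are assumptions inferred only from the formats quoted in the table, and the nesting of parameters follows the sample file in this topic.

```python
import re

# Parameters that the table marks as Required, grouped by where the
# sample file places them.
REQUIRED_TOP_LEVEL = ("clusterCrn", "policyName", "policyDefinition")
REQUIRED_IN_DEFINITION = (
    "hbasePolicyArguments",
    "sourceCluster",
    "targetCluster",
    "initialSnapshot",
    "exportSnapshotUser",
)

# Values that the table lists for the restart-type parameters.
ALLOWED_VALUES = {
    "sourceRestartType": ("RESTART", "ROLLING_RESTART"),
    "targetRestartType": ("RESTART", "ROLLING_RESTART"),
}

# Assumed patterns: the table only gives the formats as quoted text.
TABLE_PATTERN = re.compile(r"^[^:\s]+:[^:\s]+$")  # namespace:tablename
CLUSTER_PATTERN = re.compile(r"^[^$]+\$.+$")      # dataCenterName$cluster_name


def find_problems(policy):
    """Return a list of problems found in a policy dict (empty if none)."""
    problems = [
        f"missing required parameter: {key}"
        for key in REQUIRED_TOP_LEVEL
        if key not in policy
    ]
    definition = policy.get("policyDefinition")
    if not isinstance(definition, dict):
        return problems

    problems += [
        f"missing required parameter: policyDefinition.{key}"
        for key in REQUIRED_IN_DEFINITION
        if key not in definition
    ]

    hbase_args = definition.get("hbasePolicyArguments")
    tables = hbase_args.get("tables", []) if isinstance(hbase_args, dict) else []
    for table in tables:
        if not (isinstance(table, str) and TABLE_PATTERN.match(table)):
            problems.append(f"table not in namespace:tablename format: {table}")

    for key in ("sourceCluster", "targetCluster"):
        value = definition.get(key)
        if isinstance(value, str) and not CLUSTER_PATTERN.match(value):
            problems.append(f"{key} not in dataCenterName$cluster_name format: {value}")

    for key, allowed in ALLOWED_VALUES.items():
        if key in definition and definition[key] not in allowed:
            problems.append(f"{key} must be one of {allowed}")

    return problems
```

Calling find_problems on the dictionary loaded with json.load returns an empty list when none of these checks fail. The check is deliberately shallow: it does not verify that the CRN, the cluster names, or the credentials actually exist.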

Sample HBase replication policy definition JSON file

The following snippet shows the contents of an HBase replication policy definition JSON file:

{
    "clusterCrn": "string",
    "policyName": "string",
    "policyDefinition": {
        "hbasePolicyArguments": {
            "tables": [
                "string"
            ],
            "credentialLocation": "EXTERNAL_ACCOUNT"|"SAFETY_VALVE", (technical preview only)
            "cloudCredential": "string",
            "validateReplicationSetup": true|false, (technical preview only)
            "forceSetup": true|false, (technical preview only)
            "databaseArguments": {
                "replicationStrategy": "ALL_TABLES"|"TABLES_WITH_REPLICATION_SCOPE_SET"
            }
        },
        "sourceCluster": "string",
        "targetCluster": "string",
        "initialSnapshot": true|false,
        "exportSnapshotUser": "string",
        "description": "string",
        "machineUser": {
            "user": "string",
            "password": "string",
            "createUser": true|false
        },
        "queueName": "string",
        "distcpMaxMaps": 0,
        "distcpMapBandwidth": 0, (technical preview only)
        "sourceRestartType": "RESTART"|"ROLLING_RESTART",
        "targetRestartType": "RESTART"|"ROLLING_RESTART"
    }
}
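
The snippet above is a syntax skeleton rather than valid JSON: the "a"|"b" alternatives and the parenthesized preview notes must be replaced with a single concrete value before the file can be parsed. As an illustration, the following Python sketch builds a definition with only the required parameters plus a few optional ones and serializes it. Every value is a hypothetical placeholder (the cluster names reuse the examples from the parameter table), and the helper name save_policy and the output file name are assumptions, not something Replication Manager prescribes.

```python
import json

# Hypothetical placeholder values; replace them with values from your
# own environment before use.
policy = {
    "clusterCrn": "<target-cluster-crn>",
    "policyName": "hbase-policy-example",
    "policyDefinition": {
        "hbasePolicyArguments": {
            # Entries use the namespace:tablename format.
            "tables": ["default:orders"],
        },
        "sourceCluster": "DC-Europe$My Source 42",
        "targetCluster": "DC-US$My Destination 17",
        # Python True is written out as the JSON literal true.
        "initialSnapshot": True,
        "exportSnapshotUser": "<export-user>",
        "description": "Example HBase replication policy",
        "queueName": "default",
        "sourceRestartType": "ROLLING_RESTART",
        "targetRestartType": "ROLLING_RESTART",
    },
}

# Serialize with the same four-space indentation as the sample above.
policy_json = json.dumps(policy, indent=4)


def save_policy(path="hbase-policy.json"):
    """Write the policy definition to a file (the file name is an assumption)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(policy_json + "\n")
```

Generating the file from a dictionary rather than editing the skeleton by hand guarantees that the output is well-formed JSON, with booleans written as true or false and braces balanced.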