Removing PREFIX_TREE Data Block Encoding

Before migrating to COD CDP Public Cloud, ensure that you have transitioned all the data to a supported encoding type.

The PREFIX_TREE data block encoding code is removed in COD CDP Public Cloud, meaning that HBase clusters with PREFIX_TREE enabled will fail. Therefore, before migrating to COD CDP Public Cloud you must ensure that all data has been transitioned to a supported encoding type.

The following pre-upgrade command is used for validation: hbase pre-upgrade validate-dbe

If your cluster is Kerberized, ensure that you run the kinit command as a hbase user before running the pre-upgrade commands. Ensure you have valid Kerberos credentials. You can list the Kerberos credentials using the klist command, and you can obtain credentials using the kinit command.

  1. Download and distribute parcels for the target version of CDP.

    If the downloaded parcel version is higher than the current Cloudera Manager version, the following error message displayed:

    Error for parcel CDH-7.x.parcel : Parcel version 7.X is not supported by this version of Cloudera Manager. Upgrade Cloudera Manager to at least 7.X before using this version of the parcel.

    You can safely ignore this error message.

  2. Find the installed parcel at /opt/cloudera/parcels.

    For example: /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.3224867/bin/hbase

Use the CDP parcel to run the pre-upgrade commands. Cloudera recommends that you run them on an HMaster host.

  1. Run the hbase pre-upgrade validate-dbe command using the new installation.
    For example, if you have installed the CDH-7.1.1-1 parcel, you must run the following command:
    /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.3224867/bin/hbase pre-upgrade validate-dbe

    The commands check whether your table or snapshots use the PREFIX_TREE Data Block Encoding.

    This command does not take much time to run because it validates only the table-level descriptors.

    If PREFIX_TREE Data Block Encoding is not used, the following message is displayed:
    The used Data Block Encodings are compatible with HBase 2.0.

    If you see this message, your data block encodings are compatible, and you do not have to perform any more steps.

    If PREFIX_TREE Data Block Encoding is used, a similar error message is displayed:
    2018-07-13 09:58:32,028 WARN  [main] tool.DataBlockEncodingValidator: Incompatible DataBlockEncoding for table: t, cf: f, encoding: PREFIX_TREE

    If you got an error message, continue with Step 2 and fix all your PREFIX_TREE encoded tables.

  2. Fix your PREFIX_TREE encoded tables using the old installation.

    You can change the Data Block Encoding type to PREFIX, DIFF, or FAST_DIFF in your source cluster.

    Our example validation output reported column family f of table t is invalid. Its Data Block Encoding type is changed to FAST_DIFF in this example:

    hbase> alter 't', { NAME => 'f', DATA_BLOCK_ENCODING => 'FAST_DIFF' }