You can import your data into your Cloudera Operational Database (COD) database by
restoring your HBase table into COD.
- Enable HBase replication on your COD cluster. For more information, see
Cloudera Operational Database data replication.
- Have a location in cloud storage (for example, S3 or ABFS) with an exported
snapshot in it, and have the name of the snapshot.
If you do not already have an exported HBase snapshot, you can export your
data to cloud storage using the following command:
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot [***SNAPSHOT NAME***] -copy-to [***CLOUD STORAGE LOCATION***] -mappers 10
For example, the data-from-onprem snapshot can be exported into s3a://cod-external-bucket/hbase:
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot data-from-onprem -copy-to s3a://cod-external-bucket/hbase -mappers 10
- Have an edge node with a configured HBase client tarball and know how to launch
the hbase shell from it. For more information, see Launching HBase
shell.
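The export command from the prerequisites can be wrapped in a small helper so that the snapshot name and target location are checked before anything runs. The following is a minimal sketch, not an official tool: `build_export_cmd` is a hypothetical function name, and it only prints the command for review instead of running it.

```shell
#!/usr/bin/env bash
# Sketch: compose the ExportSnapshot command for the source cluster.
# build_export_cmd is a hypothetical helper; it prints the command only.
build_export_cmd() {
  local snapshot="$1"        # snapshot name on the source cluster
  local target="$2"          # cloud storage location, for example an s3a:// URI
  local mappers="${3:-10}"   # number of mappers; 10 as in the example above
  if [ -z "$snapshot" ] || [ -z "$target" ]; then
    echo "usage: build_export_cmd SNAPSHOT TARGET [MAPPERS]" >&2
    return 1
  fi
  printf 'hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s -copy-to %s -mappers %s\n' \
    "$snapshot" "$target" "$mappers"
}

# Print the command for the example values used in this section.
build_export_cmd data-from-onprem s3a://cod-external-bucket/hbase
```

Once the printed command looks right, it can be copied to the source cluster and run there.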
- Get your CLOUD STORAGE LOCATION for your COD database using the COD web
interface. For example:
s3a://my_cod_bucket/cod-12345/hbase
- Add your bucket to the IAM policy used by IDBroker.
For more information, see the following documentation:
- Launch the hbase shell from the edge node.
For more information, see Launching HBase shell.
- From the edge node, run the ExportSnapshot command. Use the external bucket
as the source location and the COD cloud storage as the target.
For example:
$ cd $HBASE_HOME
$ ./bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot "data-from-onprem" --copy-from s3a://cod-external-bucket/hbase --copy-to s3a://my_cod_bucket/cod-12345/hbase
- Use the list_snapshots command and verify that your snapshot is listed.
$ cd $HBASE_HOME
$ ./bin/hbase shell
hbase> list_snapshots
['snapshot_name']
- Use the restore_snapshot or the clone_snapshot command to reconstitute the table.
restore_snapshot: Overwrites the current table state with that of the
snapshot. Any data modification applied after the snapshot was taken is
lost. If the table does not exist in the given cluster, the command
automatically creates it.
clone_snapshot: Accepts a new table name and restores the table schema and
data into that table.
- Validate that all rows are present in the table using the hbase rowcounter
command, or the count command in the hbase shell.
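The steps above can be sketched as a dry run that prints each command in order, so the whole sequence can be reviewed before it is run on the edge node. This is an illustrative sketch only: `copy_cmd` and `shell_statements` are hypothetical helper names, `data_from_onprem_restored` is a made-up target table name, and the bucket paths are the example values from this section.

```shell
#!/usr/bin/env bash
# Sketch: print the import commands in order (dry run, nothing is executed).
# All values below are example or hypothetical names; replace them with your own.
SNAPSHOT="data-from-onprem"
SOURCE="s3a://cod-external-bucket/hbase"       # external bucket with the snapshot
TARGET="s3a://my_cod_bucket/cod-12345/hbase"   # COD cloud storage location
NEW_TABLE="data_from_onprem_restored"          # hypothetical table name for clone_snapshot

# Print the ExportSnapshot command that copies the snapshot into COD storage.
copy_cmd() {
  printf './bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s --copy-from %s --copy-to %s\n' \
    "$1" "$2" "$3"
}

# Print the hbase shell statements: verify the snapshot, clone it, count rows.
shell_statements() {
  printf "list_snapshots\n"
  printf "clone_snapshot '%s', '%s'\n" "$1" "$2"
  printf "count '%s'\n" "$2"
}

copy_cmd "$SNAPSHOT" "$SOURCE" "$TARGET"
shell_statements "$SNAPSHOT" "$NEW_TABLE"
```

The sketch uses clone_snapshot rather than restore_snapshot so that no existing table is overwritten. The printed shell statements are meant to be entered in the hbase shell after the copy command completes; on large tables, count can be slow, and the hbase rowcounter command may be the better validation choice.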