Using Hive to access an existing HBase table example
Use the following steps to access an existing HBase table through Hive.
-
You can access an existing HBase table through Hive by using the CREATE EXTERNAL
TABLE statement:
CREATE EXTERNAL TABLE hbase_table_2(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table", "hbase.mapred.output.outputtable" = "some_existing_table");
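Once the external table is defined, it can be used like other Hive tables. The following is a minimal sketch, assuming the DDL above has been run; staging_rows and its columns id and payload are hypothetical names that do not appear in this documentation.

```sql
-- Read rows from the HBase-backed table through Hive.
SELECT key, value
FROM hbase_table_2
WHERE key = 100;

-- Write rows through Hive into the same HBase table.
-- staging_rows is a hypothetical Hive table with (int, string) columns.
-- Writes rely on the hbase.mapred.output.outputtable property set in the DDL.
INSERT INTO TABLE hbase_table_2
SELECT id, payload
FROM staging_rows;
```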
-
You can use different types of column mappings to map the HBase columns to
Hive:
- Multiple Columns and Families
To define four columns, the first being the rowkey: “:key,cf:a,cf:b,cf:c”
- Hive MAP to HBase Column Family
When the Hive datatype is a Map, a column family with no qualifier can be used. The keys of the Map are then used as the column qualifiers in HBase: “cf:”
- Hive MAP to HBase Column Prefix
When the Hive datatype is a Map, a prefix for the column qualifier can be provided which will be prepended to the Map keys: “cf:prefix_.*”
Note: The prefix is stripped from the column qualifier to form the key in the Hive Map. For example, for the above column mapping, a column of “cf:prefix_a” would result in a key in the Map of “a”.
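The three mapping styles above can be sketched as DDL. These statements are illustrations only: the Hive table names (multi_col_example, map_family_example, map_prefix_example), the column names, and the assumption that some_existing_table has a column family named cf are all hypothetical.

```sql
-- Multiple columns: the row key plus three qualifiers in family cf.
CREATE EXTERNAL TABLE multi_col_example(key string, a string, b string, c string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:a,cf:b,cf:c")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table");

-- Hive MAP to a whole column family: each qualifier in cf becomes a Map key.
CREATE EXTERNAL TABLE map_family_example(key string, attrs map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table");

-- Hive MAP to a column prefix: only qualifiers starting with prefix_ are mapped.
CREATE EXTERNAL TABLE map_prefix_example(key string, attrs map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:prefix_.*")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table");
```

The mapping string contains no whitespace between entries, because whitespace would be read as part of a column name.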
-
You can also define composite row keys. Composite row keys use multiple Hive
columns to generate the HBase row key.
- Simple Composite Row Keys
When a Hive column with a datatype of Struct is mapped to the row key, all elements of the struct are automatically concatenated, separated by the termination character specified in the DDL.
- Complex Composite Row Keys and HBaseKeyFactory
Custom logic can be implemented by writing a Java class that implements HBaseKeyFactory and providing it to the DDL using the table property key “hbase.composite.key.factory”.
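A simple composite row key might be declared as in the following sketch. The table name composite_key_example, the struct fields f1 and f2, the column family f with qualifier c1, and the '~' separator are assumptions chosen for illustration. The custom key factory case is not shown, because it depends on a user-written Java class.

```sql
-- The struct fields f1 and f2 are concatenated with '~' to form the HBase row key,
-- so a struct of ('us', '42') corresponds to the row key 'us~42'.
CREATE EXTERNAL TABLE composite_key_example(
  key struct<f1:string, f2:string>,
  value string)
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '~'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f:c1")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table");

-- Individual parts of the composite key can then be read as struct fields.
SELECT key.f1, key.f2, value
FROM composite_key_example;
```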