Introduction to Apache HBase

Apache HBase is a scalable, distributed, column-oriented datastore. Apache HBase provides real-time read/write random access to very large datasets hosted on HDFS.

HBase is sparse, distributed, persistent, multidimensional sorted map, which is indexed by rowkey, column key,and timestamp. Fundamentally, it is a platform for storing and retrieving data with random access. HBase is designed to run on a cluster of computers, built using commodity hardware. HBase is also horizontally scalable, which means you can add any number of columns anytime.