HBase is a key-value store. Using keys as an index allows fast table scanning and data retrieval at petabyte scale. A group of contiguous table rows is called a region. A new table has only one region, and HBase dynamically adds regions as the table grows. The keys for a region are assigned to one or more region servers. In the illustration below, the keys and values for rows 10 through 40 are assigned to region server 1, which "owns" the data for these keys and can both read existing values and write new values for keys in its region. The keys and values served by region server 1 are called its primary region. If region server 1 crashes, keys 10 through 40 are unavailable. Region server 2 serves another primary region consisting of keys 40 through 60, and region server 3 serves the primary region of keys 60 through 100.
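The mapping from a row key to the region server that owns it can be illustrated with a small lookup over sorted region boundaries. This is a minimal sketch, not HBase code: the `REGIONS` table and the `locate_region` function are hypothetical names, the keys are plain integers rather than the byte arrays HBase actually uses, and it assumes each region's start key is inclusive and its end key exclusive, so that key 40 belongs to the second region.

```python
from bisect import bisect_right

# Hypothetical layout mirroring the example above: (start_key, end_key, server).
# Assumption: start keys are inclusive and end keys are exclusive.
REGIONS = [
    (10, 40, "region server 1"),
    (40, 60, "region server 2"),
    (60, 100, "region server 3"),
]
START_KEYS = [start for start, _, _ in REGIONS]


def locate_region(row_key):
    """Return the (start, end, server) tuple whose key range contains row_key."""
    # Find the last region whose start key is <= row_key.
    index = bisect_right(START_KEYS, row_key) - 1
    if index < 0:
        raise KeyError(f"row key {row_key} is below the first region")
    start, end, server = REGIONS[index]
    if row_key >= end:
        raise KeyError(f"row key {row_key} is beyond the last region")
    return start, end, server


if __name__ == "__main__":
    for key in (10, 39, 40, 75):
        print(key, "->", locate_region(key)[2])
```

Because the regions are contiguous and sorted, a binary search over the start keys is enough to find the owning server; no per-key index is needed.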
This release of HBase allows HBase administrators and developers to replicate a primary region to one or more additional region servers. The replicated data is referred to as a secondary region. In the illustration below, the region defined by keys 10 through 20 is replicated to a secondary region on region server 2. If region server 1 crashes or becomes unavailable, region server 2 can provide read-only access to the data. Primary regions provide both write and read access, but secondary regions provide only read access.
Note: The data returned by a secondary region may be slightly stale, as described in the following section.
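The behavior of primary and secondary regions can be sketched with a toy model. It is only an illustration under stated assumptions, not the HBase client API: the `ReplicatedRegion` class, its `put`, `sync`, and `get` methods, and the `stale` flag are all hypothetical, and replication is simulated by an explicit `sync()` call rather than by whatever mechanism HBase uses internally.

```python
class RegionReplica:
    """One copy of a region's data hosted on a single region server."""

    def __init__(self, server):
        self.server = server
        self.available = True
        self.data = {}


class ReplicatedRegion:
    """Hypothetical model: one writable primary and one read-only secondary."""

    def __init__(self, primary_server, secondary_server):
        self.primary = RegionReplica(primary_server)
        self.secondary = RegionReplica(secondary_server)

    def put(self, key, value):
        # Only the primary region accepts writes.
        if not self.primary.available:
            raise RuntimeError("primary region unavailable: write rejected")
        self.primary.data[key] = value

    def sync(self):
        # Simulated replication: copy the primary's current contents.
        self.secondary.data = dict(self.primary.data)

    def get(self, key):
        """Return (value, stale); stale is True when the secondary answered."""
        if self.primary.available:
            return self.primary.data.get(key), False
        if self.secondary.available:
            return self.secondary.data.get(key), True
        raise RuntimeError("no replica available for this region")


region = ReplicatedRegion("region server 1", "region server 2")
region.put(15, "first value")
region.sync()                      # the secondary now holds "first value"
region.put(15, "second value")     # not yet replicated to the secondary
fresh = region.get(15)             # served by the primary
region.primary.available = False   # simulate a crash of region server 1
fallback = region.get(15)          # served by the secondary, possibly stale
```

In this model the read after the simulated crash still succeeds, but it returns the last replicated value and is flagged as stale, while a write attempted at that point is rejected.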