HBase entities created in Atlas

Each HBase data set entity in Atlas includes detailed metadata collected from HBase.

The following diagrams show a summary of the entities created in Atlas for Hive operations and assets. The supertypes that contribute attributes to the entity types are shaded.

Figure 1. Atlas Entity Types for HBase Data Sets


The metadata collected for each entity type is as follows:

HBase Namespace

Identifier Example content
typeName hbase_namespace
guid System generated ID. This value is used to identify the entity in the Atlas Dashboard URL.
qualifiedName <name>@<clustername>
name Namespace name as reported from HBase.
description String description metadata from HBase.
modifiedTime Time from HBase indicating a change to the namespace. Formatted as in this example: “Wed Apr 17 2019 18:32:14 GMT-0700 (Pacific Daylight Time)”
owner Owner as reported from HBase.
parameters Reserved for future use.
replicatedFrom Reserved for future use.
replicatedTo Reserved for future use.
Relationship: tables One namespace to many tables. hbase_table_namespace

HBase Table

Identifier Example content
typeName hbase_table
guid System generated ID. This value is used to identify the entity in the Atlas Dashboard URL.
qualifiedName <namespace>:<tablename>@<clustername>
name Table name as reported from HBase.
description String description metadata from HBase.
modifiedTime Time from HBase indicating a change to the table. Formatted as in this example: “Wed Apr 17 2019 18:32:14 GMT-0700 (Pacific Daylight Time)”
owner Owner as reported from HBase.
parameters Reserved for future use.
replicatedFrom Reserved for future use.
replicatedTo Reserved for future use.
durability Storage property as reported from HBase. Values include true or false.
isCompactionEnabled Storage property as reported from HBase. Values include true or false.
isNormalizationEnabled Storage property as reported from HBase. Values include true or false.
isReadOnly
maxFileSize Storage property as reported from HBase. -1 indicates that no maximum was set.
uri Table name.
Relationship: namespace One namespace to many tables. hbase_table_namespace
Relationship: column families Column families associated with this table. hbase_table_column_families

HBase Column Family

Identifier Example content
typeName hbase_column_family
guid System generated ID. This value is used to identify the entity in the Atlas Dashboard URL.
qualifiedName

<namespace>:<tablename>.<columnfamily>@<clustername>

name Column family name as reported from HBase.
description String description metadata from HBase.
modifiedTime Time from HBase indicating a change to the column family. Formatted as in this example: “Wed Apr 17 2019 18:32:14 GMT-0700 (Pacific Daylight Time)”
owner Owner as reported from HBase.
StoragePolicy Value for the storagePolicy property for the column family. Values include N/A, ALL_SSD, ONE_SSD, HOT, WARM, COLD.
blockCacheEnabled Storage property as reported from HBase. Values include true or false.
bloomFilterType Value for the BLOOM_FILTER_TYPE property for the column family. Values include NONE, ROW, or ROWCOL.
cacheBloomsOnWrite Boolean value for the CACHE_BLOOMS_ON_WRITE property for the column family.
cacheDataOnWrite Boolean value for the CACHE_DATA_ON_WRITE property for the column family.
cacheIndexesOnWrite Boolean value for the CACHE_INDEX_ON_WRITE property for the column family.
columns List of columns included in the column family.
compactionCompressionType Storage property as reported from HBase.
compressionType Value for the COMPRESSION property for the column family. Values include NONE, SNAPPY, LZO, LZ4, GZ.

The default value is SNAPPY.

createTime Time from HBase indicating when the column family was created. Formatted as in this example: “Wed Apr 17 2019 18:32:14 GMT-0700 (Pacific Daylight Time)”
dataBlockEncoding The DATA_BLOCK_ENCODING property for the column family. Values include NONE, PREFIX, DIFF, FAST_DIFF, ROW_INDEX_V1.
encryptionType Column family encryption property. Values include “N/A”, and AES.
evictBlocksOnClose Boolean value for the EVICT_BLOCKS_ON_CLOSE property for the column family.
inMemoryCompactionPolicy In memory compaction behavior (IN_MEMORY_COMPACTION) set for the column family. Values include NONE, BASIC, EAGER, ADAPTIVE, or “N/A”.
isMobEnabled Boolean value for Medium OBject (MOB) properties for the column family (IS_MOB).
keepDeletedCells Boolean value for the KEEP_DELETED_CELLS property of the column family.
maxVersions The maximum number of row versions this column family is configured to store.
minVersions The minimum number of row versions this column family is configured to store.
mobCompactPartitionPolicy The MOB_COMPACT_PARTITION_POLICY for this column family. Values include DAILY, WEEKLY, MONTHLY.
modifiedTime Time from HBase indicating a change to the column family. Formatted as in this example: “Wed Apr 17 2019 18:32:14 GMT-0700 (Pacific Daylight Time)”
newVersionBehavior Boolean value for the NEW_VERSION_BEHAVIOR property for this column family.
prefetchBlocksOnOpen Boolean value for PREFETCH_BLOCKS_ON_OPEN property of the column family.
replicatedFrom Not used.
replicatedTo Not used.
table The table that the column family corresponds to. Also modeled as a relationship.
ttl Time to live (TTL) length in seconds. The TTL time encoded in the HBase for the row is specified in UTC.
Relationship: columns Not used.
Relationship: table One table to many column families. hbase_table_column_families