Apache Atlas metadata attributes
Attributes are the key-value pairs that hold metadata details for entities and classifications.
- Technical Attributes
- These attributes are the entity fields that contain technical metadata defined in entity models. For the built-in entity types, Atlas collects this information from services on the cluster. These attributes are read-only in the UI but can be updated using the Atlas API. All entity types share basic metadata such as names and qualified names; however, the rest of the technical metadata is specific to the entity type.
- System Attributes
- These attributes are populated by Atlas when it creates an entity instance. They include:
| System Attribute | Description | Identifier in Advanced Search |
|---|---|---|
| Type | The Atlas entity type. | __typeName |
| Status | The entity status in Atlas: this field indicates if a data asset has been deleted; Atlas maintains the entity information after the asset no longer exists on the cluster. | __state |
| Created By User | The Atlas user who created this entity. Typically this is the Atlas system user. If an entity was created by an API call or created manually by users, the active user account would be included in this attribute. | __createdBy |
| Last Modified User | The Atlas user who last updated the entity, whether through Atlas metadata collection from a cluster service, an Atlas API, or a change through the Atlas UI. | __modifiedBy |
| Created timestamp | The date Atlas created the entity. Note that this field is different from the technical attribute for the creation date of the original data asset or operation. | __timestamp |
| Last Modified timestamp | The date when an entity was last updated in Atlas. Note that this field is different from the technical attribute for the last modification date of the actual data asset or operation on the cluster. | __modificationTimestamp |
| GUID | A unique identifier generated by Atlas. This is the 32-digit code found in the browser URL for an entity. | __guid |
| Labels | Label metadata added to an entity. | __labels |
| User-Defined Properties | Key-value pair metadata added to an entity. | __customAttributes |
| Classifications | Classifications added to an entity. | __classificationNames |
| Propagated Classifications | Classifications added to entities downstream from an entity where the classification was added by a user. | __propagatedClassificationNames |
| (none) | A concatenated string of classification names and attributes for an entity. This attribute is not available through the Atlas UI. | __classificationsText |
| IsIncomplete | A system indicator that entities were created because they were referenced in the metadata collected by a service other than the source type associated with the entity type. An entity is typically marked "isIncomplete" when Atlas receives metadata out of order from when the events occurred. If IsIncomplete entities remain "incomplete" for a long time, it may indicate that the original messages for entity metadata have not arrived. | __isIncomplete |
Classifications, labels, and user-defined properties are included as system attributes in the context of search. They are modeled as entity attributes so that when you access an entity (through the UI or API), you get all of this entity-specific metadata.
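The Advanced Search identifiers listed above can also be used as filter attribute names in a search request. The following is a minimal sketch that only builds a request body; it assumes the Atlas v2 basic search endpoint (`/api/atlas/v2/search/basic`) accepts a body of this shape, which you should confirm against your Atlas version. The helper name `build_basic_search`, the entity type `hive_table`, and the user `etl_user` are illustrative and do not come from this document.

```python
import json


def build_basic_search(type_name, filters, condition="AND"):
    """Build a basic-search request body filtered on system attributes.

    filters is a list of (attribute, operator, value) tuples, for example
    ("__state", "eq", "DELETED"). The body layout is an assumption about
    the Atlas v2 basic search API, not a verified contract.
    """
    criteria = [
        {"attributeName": attr, "operator": op, "attributeValue": value}
        for attr, op, value in filters
    ]
    return {
        "typeName": type_name,
        # Keep deleted entities in scope so a __state filter can match them.
        "excludeDeletedEntities": False,
        "entityFilters": {"condition": condition, "criterion": criteria},
    }


# Hypothetical query: deleted Hive tables that were created by "etl_user".
body = build_basic_search(
    "hive_table",
    [("__state", "eq", "DELETED"), ("__createdBy", "eq", "etl_user")],
)
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of a POST request; authentication and the host name depend on the cluster and are left out here.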
- Business Metadata Attributes
- These attributes are populated in the Atlas UI or through API calls. They provide a way to extend the metadata stored for entity instances. You can define Business Metadata attributes to apply to a specific entity type or to many entity types. Administrators can control the users or groups who can set values for these attributes by creating a Ranger policy against the Business Metadata collection that contains the attribute.
- Classification Attributes
- These attributes are populated in the Atlas UI or through API calls. They provide a way to make a classification more useful for searching, for access policies in Ranger, and for organizing cluster data assets.
Classifications can also be assigned to entities through lineage: if the classification is defined to allow lineage propagation, a classification assigned to an entity is also assigned to all entities that have output relationships to the classified entity. The propagation applies to all further generations of the lineage. Note that Atlas distinguishes between classifications that were specifically assigned to an entity and classifications that were assigned through lineage propagation.
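The propagation rule described above can be modeled as a graph traversal. The sketch below is a simplified illustration, not Atlas's implementation: it assumes every classification allows lineage propagation, it treats lineage as direct entity-to-entity edges (omitting the process entities that sit between them in real lineage), and all names in it are hypothetical.

```python
from collections import deque


def propagate(downstream, assigned):
    """Return the classifications each entity receives through lineage.

    downstream maps an entity to the entities that consume its output.
    assigned maps an entity to the classifications a user added directly.
    The result holds only propagated classifications, kept separate from
    direct assignments in the same way Atlas distinguishes the two.
    """
    propagated = {}
    for source, tags in assigned.items():
        queue = deque(downstream.get(source, []))
        seen = {source}  # guards against cycles and repeat visits
        while queue:
            entity = queue.popleft()
            if entity in seen:
                continue
            seen.add(entity)
            propagated.setdefault(entity, set()).update(tags)
            # Continue to all further generations of the lineage.
            queue.extend(downstream.get(entity, []))
    return propagated


# Hypothetical lineage: raw_orders -> clean_orders -> orders_report
lineage = {"raw_orders": ["clean_orders"], "clean_orders": ["orders_report"]}
direct = {"raw_orders": {"PII"}}
inherited = propagate(lineage, direct)
print(inherited)
```

In this model, `clean_orders` and `orders_report` both receive `PII` as a propagated classification, while `raw_orders` keeps it only as a direct assignment.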
- User-defined Properties
- These attributes are populated in the Atlas UI or through API calls. They allow users to add metadata in the form of key-value pairs to any entity instance. Values are limited to strings. Both key and value are included in searches. They are not centrally managed like classifications or Business Metadata attributes. They are not accessible through Ranger for specifying access policies.
Defining attributes
Attributes can be defined with the following data types:
- string
- Boolean
- byte
- short
- int
- float
- double
- long
- date
- enumeration
When you define an attribute, you can indicate that the value can include more than one entry. Atlas records multiple values in a comma-separated list. Thus, when searching on attributes with multiple values, users should use the logical operator "Contains" rather than "=" so the search matches on a single value rather than the whole list.
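The difference between the two operators can be illustrated with plain string comparison. This is a simplified model of the behavior described above, not Atlas's search code; the function names and sample values are hypothetical.

```python
def equals(stored, term):
    """Model of "=": the term must match the entire stored string."""
    return stored == term


def contains(stored, term):
    """Model of "Contains": the term may appear anywhere in the stored string."""
    return term in stored


# A multi-valued attribute recorded as one comma-separated string.
stored_value = ",".join(["finance", "marketing", "sales"])

print(equals(stored_value, "marketing"))    # compares against the whole list
print(contains(stored_value, "marketing"))  # matches one entry in the list
```

In this model, `equals` returns `False` because the stored string is the whole list, while `contains` returns `True`. Because the model is a substring match, a partial term such as `market` would also match.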