Apache Atlas Reference

Apache Atlas metadata attributes

Attributes are the key-value pairs that hold metadata details for entities and classifications.

Different types of attributes are populated with values differently.
Technical Attributes
These attributes are the entity fields that contain technical metadata defined in entity models. For the built-in entity types, Atlas collects this information from services on the cluster. These attributes are read-only in the UI but can be updated using the Atlas API. All entity types share basic metadata such as names and qualified names; however, the rest of the technical metadata is specific to the entity type.
System Attributes
These attributes are populated by Atlas when it creates an entity instance. They include:
| System Attribute | Description | Identifier in Advanced Search |
|---|---|---|
| Type | The Atlas entity type. | __typeName |
| Status | The entity status in Atlas. This field indicates whether a data asset has been deleted; Atlas maintains the entity information after the asset no longer exists on the cluster. | __state |
| Created By User | The Atlas user who created this entity. Typically this is the Atlas system user. If an entity was created by an API call or created manually by users, the active user account is recorded in this attribute. | __createdBy |
| Last Modified User | The Atlas user who last updated the entity, whether through Atlas metadata collection from a cluster service, an Atlas API, or a change through the Atlas UI. | __modifiedBy |
| Created timestamp | The date Atlas created the entity. Note that this field is different from the technical attribute for the creation date of the original data asset or operation. | __timestamp |
| Last Modified timestamp | The date when an entity was last updated in Atlas. Note that this field is different from the technical attribute for the last modification date of the actual data asset or operation on the cluster. | __modificationTimestamp |
| GUID | A unique identifier generated by Atlas. This is the 32-digit code found in the browser URL for an entity. | __guid |
| Labels | Label metadata added to an entity. | __labels |
| User-Defined Properties | Key-value pair metadata added to an entity. | __customAttributes |
| Classifications | Classifications added to an entity. | __classificationNames |
| Propagated Classifications | Classifications added to entities downstream from an entity where the classification was added by a user. | __propagatedClassificationNames |
| | A concatenated string of classification names and attributes for an entity. This attribute is not available through the Atlas UI. | __classificationsText |
| IsIncomplete | A system indicator that entities were created because they were referenced in the metadata collected by a service other than the source type associated with the entity type. An entity is typically marked "isIncomplete" when Atlas receives metadata out of order from when the events occurred. If IsIncomplete entities remain "incomplete" for a long time, it may indicate that the original messages for entity metadata have not arrived. | __isIncomplete |
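The identifiers in the last column can be used as attribute names in search filters. The following is a minimal sketch of how a filter on a system attribute might be assembled. It assumes the request body shape of the Atlas v2 basic search endpoint (POST /api/atlas/v2/search/basic) with an entityFilters criterion; the helper name build_basic_search, the hive_table type, and the limit value are illustrative choices, not taken from this page. The sketch only builds and prints the payload and does not contact a server.

```python
import json


def build_basic_search(type_name, attribute, operator, value):
    """Build a basic-search request body that filters on one attribute.

    Hypothetical helper: the payload layout is an assumption based on the
    Atlas v2 basic search API and should be checked against your version.
    """
    return {
        "typeName": type_name,
        # Keep deleted entities in scope so that a __state filter can see them.
        "excludeDeletedEntities": False,
        "entityFilters": {
            "condition": "AND",
            "criterion": [
                {
                    "attributeName": attribute,   # system attribute identifier
                    "operator": operator,         # for example "eq" or "contains"
                    "attributeValue": value,
                }
            ],
        },
        "limit": 25,
    }


# Find entities of an example type whose data asset has been deleted.
payload = build_basic_search("hive_table", "__state", "eq", "DELETED")
print(json.dumps(payload, indent=2))
```

The same helper could be called with __createdBy or __modifiedBy to narrow results to entities touched by a particular user account.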
When defining new models, you can take advantage of the isAppendOnPartialUpdate option in attribute definitions. This option allows array or map type attribute values to be updated by appending to the existing values rather than replacing the entire set. For example, to represent a list of key-value pairs that can be augmented over time, you might define an attribute named metadata as a map with the option isAppendOnPartialUpdate set to true:
{
   "name": "metadata",
   "typeName": "map<string,string>",
   "isOptional": true,
   "cardinality": "SINGLE",
   "valuesMinCount": 0,
   "valuesMaxCount": 1,
   "isUnique": false,
   "isIndexable": false,
   "includeInNotification": false,
   "description": "Contains key-value pairs that provide metadata",
   "searchWeight": -1,
   "options": {
      "isAppendOnPartialUpdate": "true"
   }
}
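
The difference between appending and replacing can be illustrated with a small simulation. This is a sketch of the semantics described above, not Atlas code: the function apply_partial_update and the sample keys are hypothetical, and the choice to let an incoming value overwrite an existing value for the same key is an assumption.

```python
def apply_partial_update(current, incoming, append_on_partial_update):
    """Return the map value that results from a partial update.

    Illustration only. When appending, existing entries are kept and the
    incoming entries are added (an incoming key overwrites the same key,
    which is an assumption). Otherwise the incoming map replaces the old one.
    """
    if append_on_partial_update:
        merged = dict(current)
        merged.update(incoming)
        return merged
    return dict(incoming)


existing = {"owner": "etl", "tier": "gold"}
update = {"retention": "30d"}

appended = apply_partial_update(existing, update, True)
replaced = apply_partial_update(existing, update, False)

print(appended)  # all three keys are present
print(replaced)  # only the key from the update is present
```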

Classifications, labels, and user-defined properties are included as system attributes in the context of search. They are modeled as entity attributes so that when you access an entity (through the UI or API), you get all of this entity-specific metadata.

Business Metadata Attributes
These attributes are populated in the Atlas UI or through API calls. They provide a way to extend the metadata stored for entity instances. You can define Business Metadata attributes to apply to a specific entity type or to many entity types. Administrators can control the users or groups who can set values for these attributes by creating a Ranger policy against the Business Metadata collection that contains the attribute.
Classification Attributes
These attributes are populated in the Atlas UI or through API calls. They make a classification more useful for searching, for access policies in Ranger, and for organizing cluster data assets.

Classifications can also be assigned to entities through lineage: if the classification is defined to allow lineage propagation, a classification assigned to an entity is also assigned to all entities that have output relationships to the classified entity. The propagation applies to all further generations of the lineage. Note that Atlas distinguishes between classifications that were specifically assigned to an entity and classifications that were assigned through lineage propagation.
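
The propagation behavior described above can be sketched as a traversal of the lineage graph. This is an illustration under simplifying assumptions, not how Atlas implements propagation: lineage is modeled as a plain dictionary from an entity to its downstream entities, and the entity and classification names are hypothetical. Directly assigned and propagated classifications are kept in separate structures, mirroring the distinction Atlas makes.

```python
from collections import deque


def propagate(lineage, direct):
    """Compute classifications that reach entities through lineage.

    lineage: maps an entity name to the list of its downstream entities.
    direct:  maps an entity name to the set of classifications assigned to it.
    Returns a map of entity name to the set of propagated classifications.
    """
    propagated = {}
    for source, tags in direct.items():
        seen = {source}
        queue = deque(lineage.get(source, []))
        while queue:
            node = queue.popleft()
            if node in seen:
                continue  # guards against cycles and repeated paths
            seen.add(node)
            propagated.setdefault(node, set()).update(tags)
            queue.extend(lineage.get(node, []))
    return propagated


lineage = {
    "raw_orders": ["load_orders"],
    "load_orders": ["clean_orders"],
    "clean_orders": ["report_orders"],
}
direct = {"raw_orders": {"PII"}}

propagated = propagate(lineage, direct)
for entity in sorted(propagated):
    print(entity, sorted(propagated[entity]))
```

Here the classification reaches every later generation of the lineage, while raw_orders itself appears only in the directly assigned set.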

User-defined Properties
These attributes are populated in the Atlas UI or through API calls. They allow users to add metadata in the form of key-value pairs to any entity instance. Values are limited to strings. Both key and value are included in searches. They are not centrally managed like classifications or Business Metadata attributes. They are not accessible through Ranger for specifying access policies.
Attribute names can include letters, numbers, underscores, and hyphens; they must start with a letter or number. All attributes can have values of one of the following Java data types:
  • string
  • Boolean
  • byte
  • short
  • int
  • float
  • double
  • long
  • date
  • enumeration
Enumeration values are strings drawn from a predefined enumeration that is defined using the Atlas API.
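
The naming rule above can be checked before submitting a name to Atlas. The following is a minimal sketch; the function name is hypothetical, and treating "letters" as ASCII letters only is an assumption that may be stricter than what Atlas accepts.

```python
import re

# First character: a letter or number. Remaining characters: letters,
# numbers, underscores, or hyphens. ASCII-only letters are an assumption.
NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")


def is_valid_attribute_name(name):
    """Return True if the name satisfies the naming rule described above."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["data_owner", "2024-review", "_hidden", "cost center"]:
    print(candidate, is_valid_attribute_name(candidate))
```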

When you define an attribute, you can indicate that the value can include more than one entry. Atlas records multiple values in a comma-separated list. Thus, when searching on attributes with multiple values, users should use the logical operator "Contains" rather than "=" so the search matches on a single value rather than the whole list.
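
The reason for preferring "Contains" can be seen with a small simulation. This sketch models the stored value as a plain comma-separated string and the two operators as Python equality and substring checks; the sample values are hypothetical, and the exact matching rules Atlas applies may differ.

```python
# A multi-valued attribute as it would be recorded: one comma-separated string.
stored = "finance,marketing,sales"


def matches_equals(stored_value, term):
    """Model of "=": the term must equal the whole stored string."""
    return stored_value == term


def matches_contains(stored_value, term):
    """Model of "Contains": the term may appear anywhere in the stored string."""
    return term in stored_value


print(matches_equals(stored, "marketing"))    # False: compared to the whole list
print(matches_contains(stored, "marketing"))  # True: matches a single entry
```

Because the check in this sketch is a substring match, a short search term can also match part of a longer entry, so choosing distinctive values helps keep results precise.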