Purging deleted entities

You can use Atlas REST API calls to remove entities from Atlas. Only entities that have been deleted in the source system and marked as deleted in Atlas can be purged.

When a data asset is deleted, such as after a DROP TABLE command in Hive, Atlas continues to retain the asset's entity, including its metadata, lineage, and audit records. The status of the entity is set to "deleted". Deleted entities appear in search results when the Show historical entities checkbox is selected, and they appear dimmed in lineage graphs.

In some cases, it may be appropriate to remove entities for deleted assets from Atlas. For example, in a development or test environment, you may choose to clean out specific entities rather than clearing the entire Atlas database. Be careful not to purge entities in a production environment without understanding the impact of removing entities on compliance processes in your organization.

Deleted entities can be removed completely from Atlas by using the REST API call PUT /admin/purge.
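
As a minimal sketch of what such a call could look like, the Python below builds (but does not send) a PUT /admin/purge request. The base URL, the /api/atlas prefix, the JSON-array-of-GUIDs request body, and the helper name build_purge_request are assumptions made for illustration; verify them against the REST API reference for your Atlas version before use.

```python
import json
import urllib.request

# Assumption: host, port, and API prefix of a local Atlas instance.
ATLAS_BASE_URL = "http://localhost:21000/api/atlas"


def build_purge_request(guids, base_url=ATLAS_BASE_URL):
    """Build, without sending, a PUT /admin/purge request (hypothetical helper)."""
    if not guids:
        raise ValueError("at least one entity GUID is required")
    # Assumption: the request body is a JSON array of entity GUIDs.
    body = json.dumps(sorted(set(guids))).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/admin/purge",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )


if __name__ == "__main__":
    # The GUID below is a placeholder, not a real entity.
    request = build_purge_request(["<guid-of-deleted-entity>"])
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

The request is only constructed here so that it can be inspected safely. Sending it would require adding the authentication your cluster uses and passing the request to urllib.request.urlopen.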

When you purge a deleted entity:
  • The entity is removed from Atlas.
  • Related, dependent entities are also removed. For example, when purging a deleted Hive table, the deleted entities for the table columns, DDL, and storage description are also purged.
  • The entity is no longer available in search results, even with Show historical entities enabled.
  • Lineage relationships that include the purged entities are removed, which breaks any lineage that depends on a purged entity to show connections between ancestors and descendants.
  • Classifications propagated across the purged entities are removed from all descendant entities.
  • Classifications assigned to the purged entities and set to propagate are removed from all descendant entities.

Note that classifications can propagate to an entity from more than one source; if one source is purged, the classification will remain on the entity as propagated from the other source.
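
The multiple-source behavior described above can be illustrated with a toy model. This is not how Atlas implements propagation internally; the class PropagationModel and the entity names are hypothetical and only show why a classification survives while at least one propagating source remains.

```python
from collections import defaultdict


class PropagationModel:
    """Toy model of one target entity: which sources propagate each classification."""

    def __init__(self):
        # classification name -> set of source entity ids that propagate it
        self._sources = defaultdict(set)

    def propagate(self, classification, source_id):
        self._sources[classification].add(source_id)

    def purge_source(self, source_id):
        # Remove the purged source; drop a classification only when no source is left.
        for classification in list(self._sources):
            self._sources[classification].discard(source_id)
            if not self._sources[classification]:
                del self._sources[classification]

    def classifications(self):
        return sorted(self._sources)


if __name__ == "__main__":
    target = PropagationModel()
    target.propagate("PII", "table_a")
    target.propagate("PII", "table_b")

    target.purge_source("table_a")
    print(target.classifications())  # ['PII'] (still propagated from table_b)

    target.purge_source("table_b")
    print(target.classifications())  # [] (no remaining source)
```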

Purged entities cannot be restored.

Atlas retains an audit record of purge operations, which is available through the REST API call POST /admin/audit. This call allows you to retrieve a list of the entities purged in a given time interval. In addition, the Administration > Audit tab in the Atlas UI lists entities that were successfully purged.
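
As a hedged sketch of an audit query, the Python below builds (but does not send) a POST /admin/audit request for purge operations in a time interval. The base URL, the auditFilters body layout, the attribute names operation, startTime, and endTime, and the helper names are assumptions made for illustration; check them against the REST API reference for your Atlas version.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Assumption: host, port, and API prefix of a local Atlas instance.
ATLAS_BASE_URL = "http://localhost:21000/api/atlas"


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


def build_purge_audit_request(start, end, base_url=ATLAS_BASE_URL):
    """Build, without sending, a POST /admin/audit request (hypothetical helper)."""
    if end <= start:
        raise ValueError("end must be later than start")
    # Assumption: filter layout and attribute names; adjust to your Atlas version.
    body = {
        "auditFilters": {
            "condition": "AND",
            "criterion": [
                {"attributeName": "operation", "operator": "eq",
                 "attributeValue": "PURGE"},
                {"attributeName": "startTime", "operator": "gte",
                 "attributeValue": to_epoch_millis(start)},
                {"attributeName": "endTime", "operator": "lte",
                 "attributeValue": to_epoch_millis(end)},
            ],
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/admin/audit",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )


if __name__ == "__main__":
    request = build_purge_audit_request(
        datetime(2020, 1, 1, tzinfo=timezone.utc),
        datetime(2020, 2, 1, tzinfo=timezone.utc),
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

As with any Atlas REST call, sending the request would require adding the authentication your cluster uses and passing it to urllib.request.urlopen.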