What's New in Apache Atlas

This topic lists new features for Apache Atlas in this release of Cloudera Runtime.

Process execution entities

Atlas collects metadata for operations that occur in Hive, Impala, and other query engines. In this release, each instance of a query is represented by a process execution entity, and each process execution is related to its parent process entity. As a result, you see one lineage diagram for an operation no matter how many times that operation runs.

Process executions are listed in the relationship tab of the parent process. Each one is identified by the query text with a system-generated identifier appended to the end.
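The relationship between a process and its process executions can be pictured with a small in-memory model. This is only an illustrative sketch, not Atlas code: the class names, the `record_run` helper, the sample query, and the `_<run id>` suffix format are all hypothetical. The only details taken from this topic are that repeated runs share one parent process and that each execution is named by the query text plus an appended identifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessExecution:
    # One run of a query. The name is the query text plus an identifier;
    # the "_<run id>" format used below is an assumption for illustration.
    name: str


@dataclass
class Process:
    # One entity per distinct operation; lineage is drawn from this entity.
    query_text: str
    executions: List[ProcessExecution] = field(default_factory=list)


class LineageModel:
    """Hypothetical stand-in for the metadata store, keyed by query text."""

    def __init__(self) -> None:
        self.processes: Dict[str, Process] = {}

    def record_run(self, query_text: str, run_id: str) -> Process:
        # Reuse the parent process if this operation was seen before.
        process = self.processes.setdefault(query_text, Process(query_text))
        # Every run adds a new execution under the same parent.
        process.executions.append(ProcessExecution(f"{query_text}_{run_id}"))
        return process


model = LineageModel()
query = "INSERT INTO sales_summary SELECT region, SUM(amount) FROM sales GROUP BY region"
for run_id in ("0001", "0002", "0003"):
    model.record_run(query, run_id)
```

After three runs of the same query, the model holds a single process with three executions, which mirrors why the lineage diagram stays the same while the list of executions grows.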

Search facet counts

The entries in the basic search drop-down lists now have counts to indicate how many of each classification or entity type exist in Atlas. These counts can help you narrow your searches quickly to reach the best results.

Atlas server and message statistics

Atlas collects statistics on the metadata it processes, such as the rate of messages received, which message was most recently processed, and the count and distribution of entities created. Use this information to gauge metadata collection performance and volume, and to help troubleshoot problems.
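If you want to retrieve these statistics from a script rather than the Atlas UI, one possible approach is to call the Atlas admin REST API. The sketch below only builds the request. It assumes that your Atlas server exposes the `/api/atlas/admin/metrics` endpoint and accepts HTTP basic authentication; the host name, port, and credentials are placeholders, and `build_metrics_request` is a hypothetical helper, not part of Atlas.

```python
import base64
import urllib.request


def build_metrics_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for Atlas server statistics."""
    # Assumption: the metrics endpoint path on the Atlas server.
    url = base_url.rstrip("/") + "/api/atlas/admin/metrics"
    # Assumption: the server accepts HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Placeholder host, port, and credentials; replace with values for your cluster.
request = build_metrics_request("https://atlas.example.com:31443", "admin", "changeme")
```

To send the request, pass it to `urllib.request.urlopen` and parse the body with `json.load`. The layout of the returned JSON is not described in this topic, so inspect the response before relying on specific field names.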

Atlas Reference: Statistics

Impala support

Atlas collects metadata from Impala. Impala operations are represented by Atlas processes; when the same query is run more than once, Atlas creates process execution entities to capture the volume of activity. Atlas collects metadata through Hive Metastore (HMS) for the Hive data assets that Impala operates against.

Atlas Reference: Impala Metadata Collection