What's New in Apache Atlas
This topic lists new features for Apache Atlas in this release of Cloudera Runtime.
Spark support
Atlas collects metadata from Spark 1 and Spark 2. Spark operations are represented by Atlas processes; when the same query is run more than once, Atlas creates process execution entities to capture the volume of activity. Atlas collects metadata through Hive Metastore (HMS) for the Hive data assets that Spark operates against.
Free Text Search across all entity types
Atlas provides a free-text search box that matches search criteria across all entity types and all string attributes for all entities. The top 5 matching results appear as menu selections; all results are returned ranked by which attributes matched the search terms. The text search also offers suggestions, which are the entities with the most matches across the most important attributes, such as names and descriptions.
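As a rough illustration, the same kind of free-text query can be issued against the Atlas REST API. The sketch below only builds the request URL and sends nothing. It assumes the quick search endpoint path `/api/atlas/v2/search/quick` and its `query` and `limit` parameters as found in Apache Atlas 2.x; the host name, port, and helper name are placeholders, so check them against your deployment.

```python
from urllib.parse import urlencode

# Assumed endpoint path (Apache Atlas 2.x quick search); confirm for your release.
QUICK_SEARCH_PATH = "/api/atlas/v2/search/quick"


def build_quick_search_url(base_url: str, query: str, limit: int = 5) -> str:
    """Return a GET URL for a free-text search capped at `limit` results."""
    params = urlencode({"query": query, "limit": limit})
    return f"{base_url.rstrip('/')}{QUICK_SEARCH_PATH}?{params}"


# Placeholder host and port, for illustration only.
print(build_quick_search_url("http://atlas.example.com:21000", "customer orders"))
```

The default `limit` of 5 mirrors the number of top matches shown as menu selections in the UI; it is a choice made for this example, not a documented server default.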
Process execution entities
Atlas collects metadata for operations that occur in Hive, Impala, and other query engines. In this release, each instance of a query is represented by a process execution entity that is related to its parent process entity. As a result, you see one lineage diagram for an operation, no matter how many times that operation runs.
Process executions are listed in the relationship tab of the parent process; each one is identified by the query text with a system-generated identifier appended to the end.
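The relationship between a process and its executions can be pictured with a small toy model. This is not how Atlas is implemented: the class names, the `start_time` attribute, and the `:<timestamp>` suffix are hypothetical stand-ins chosen for illustration. It only shows how repeated runs of the same query attach to a single parent rather than creating new lineage nodes.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessExecution:
    name: str        # query text plus an appended identifier
    start_time: int  # hypothetical attribute, epoch milliseconds


@dataclass
class Process:
    query_text: str
    executions: list = field(default_factory=list)


def record_run(processes: dict, query_text: str, start_time: int) -> ProcessExecution:
    """Attach one run to its parent process, creating the parent on first use."""
    parent = processes.setdefault(query_text, Process(query_text))
    # The ":<timestamp>" suffix is an assumed stand-in for the system-generated identifier.
    execution = ProcessExecution(f"{query_text}:{start_time}", start_time)
    parent.executions.append(execution)
    return execution


processes = {}
for ts in (1000, 2000, 3000):
    record_run(processes, "INSERT INTO sales_summary SELECT ...", ts)

# Number of parent processes, then number of executions under the single parent.
print(len(processes), len(next(iter(processes.values())).executions))
```

In this model the lineage diagram would be drawn from the keys of `processes`, while the volume of activity is read from each parent's `executions` list.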
Search facet counts
The entries in the basic search drop-down lists now have counts to indicate how many of each classification or entity type exist in Atlas. These counts can help you narrow your searches quickly to reach the best results.
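Conceptually, a facet count is a tally of entities grouped by type or by classification. The sketch below computes such tallies over a small, made-up list of entity records. The `typeName` and `classificationNames` keys follow the field names used in Atlas entity headers, but the data and the `facet_counts` helper are hypothetical, and Atlas computes its own counts on the server.

```python
from collections import Counter


def facet_counts(entities):
    """Count entities per type and per classification."""
    type_counts = Counter(e["typeName"] for e in entities)
    classification_counts = Counter(
        name for e in entities for name in e.get("classificationNames", [])
    )
    return type_counts, classification_counts


# Made-up sample data, not real Atlas output.
sample = [
    {"typeName": "hive_table", "classificationNames": ["PII"]},
    {"typeName": "hive_table", "classificationNames": []},
    {"typeName": "hive_column", "classificationNames": ["PII", "Sensitive"]},
]

types, tags = facet_counts(sample)
print(types.most_common())
print(tags.most_common())
```

An entity with several classifications contributes to each of them, so classification counts can add up to more than the number of entities.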
Atlas server and message statistics
Atlas collects statistics on the metadata it processes, such as the rate of messages received, which message was most recently processed, and the count and distribution of entities created. Use this information to gauge metadata collection performance and volume and to help troubleshoot problems.
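These statistics can also be retrieved programmatically. The sketch below prepares, but does not send, an HTTP request for them. It assumes the admin metrics endpoint path `/api/atlas/admin/metrics` from Apache Atlas 2.x and HTTP basic authentication; the host, port, credentials, and helper name are placeholders, and a secured cluster may require a different authentication method.

```python
import base64
from urllib.request import Request

# Assumed endpoint path (Apache Atlas 2.x admin metrics); confirm for your release.
METRICS_PATH = "/api/atlas/admin/metrics"


def build_metrics_request(base_url: str, user: str, password: str) -> Request:
    """Prepare a GET request for Atlas metrics using basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        f"{base_url.rstrip('/')}{METRICS_PATH}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Placeholder host, port, and credentials, for illustration only.
request = build_metrics_request("http://atlas.example.com:21000", "admin", "changeme")
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` and parse the response body with `json.load`. The layout of the returned JSON varies by Atlas version, so inspect it before relying on specific keys.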
Impala support
Atlas collects metadata from Impala. Impala operations are represented by Atlas processes; when the same query is run more than once, Atlas creates process execution entities to capture the volume of activity. Atlas collects metadata through Hive Metastore (HMS) for the Hive data assets that Impala operates against.