Fixed Issues in Atlas

Review the list of Apache Atlas issues that are resolved in Cloudera Runtime 7.2.2.

CDPD-1138: Spark Atlas Connector tracks column-level lineage
This issue is now resolved.
CDPD-11790: Shell entity is not resolved to the Complete entity under certain conditions.
Shell entities with a duplicate qualifiedName are no longer created when processing messages from the Spark Atlas Connector. This issue is now resolved.
CDPD-14031: In the Spark Atlas Connector, a few S3 entities are created using the V1 S3 model instead of the updated V2 S3 model.
The Spark Atlas Connector now uses the Atlas S3 V2 models. This issue is now resolved.
OPSAPS-57947: Kafka Broker SSL configuration is not correct in High Availability mode.
When deploying the DataHub in High Availability mode, some of the Ranger and Atlas configurations are not computed correctly. The affected settings include atlas.kafka.security.protocol in Atlas, the SSL properties, and the REST URL services depending on Ranger.
CDPD-13645: The request contains sortBy=name, but the 'name' attribute is not in the hive_storagedesc definition. Also, if sortBy is not passed, the default sort attribute is name.
Atlas now validates that the sortBy attribute passed in the request is present in the relationship end definition. If it is not present, sorting is ignored.
If sortBy is not passed, Atlas validates that the default attribute, name, is present in the relationship end definition. If it is not present, sorting is ignored.
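The validation described for CDPD-13645 can be sketched as follows. This is only an illustration of the rule, not the Atlas implementation: the function name resolve_sort_attribute and the attribute sets are hypothetical.

```python
from typing import Optional, Set

# The default sort attribute named in the issue description.
DEFAULT_SORT_ATTRIBUTE = "name"


def resolve_sort_attribute(sort_by: Optional[str],
                           end_def_attributes: Set[str]) -> Optional[str]:
    """Return the attribute to sort on, or None when sorting is ignored.

    Hypothetical helper: falls back to the default attribute when sortBy
    is not passed, then checks the candidate against the attributes of
    the relationship end definition.
    """
    candidate = sort_by if sort_by else DEFAULT_SORT_ATTRIBUTE
    return candidate if candidate in end_def_attributes else None


# Illustrative attribute sets (assumptions, not the full type definitions).
storagedesc_attrs = {"location", "inputFormat", "outputFormat"}
column_attrs = {"name", "type", "comment"}

print(resolve_sort_attribute("name", storagedesc_attrs))  # sorting ignored
print(resolve_sort_attribute(None, column_attrs))         # default attribute used
```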
CDPD-10873:
1. Fixed quick search aggregation metrics when filtering with System Attributes.
2. Fixed quick search aggregation metrics when filtering with more than one filter.
3. Fixed quick search aggregation metrics when filtering with the negation operator.
CDPD-13805: The relationship API request can now specify the attributes to include in the search result.
Example Request: /v2/search/relationship?guid=ac9e04cc-f927-4334-af08-c83bc3733f5b&relation=columns&sortBy=name&sortOrder=ASCENDING&attributes=dcProfiledData
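As a minimal sketch, the example request path can be assembled with the Python standard library. The helper name relationship_search_path is hypothetical, and repeating the attributes parameter for more than one attribute is an assumption that the source does not confirm.

```python
from urllib.parse import urlencode


def relationship_search_path(guid, relation, attributes=(),
                             sort_by=None, sort_order=None):
    """Build the relationship search path and query string (hypothetical helper)."""
    params = {"guid": guid, "relation": relation}
    if sort_by:
        params["sortBy"] = sort_by
    if sort_order:
        params["sortOrder"] = sort_order
    if attributes:
        # Assumption: multiple attributes are sent as repeated parameters.
        params["attributes"] = list(attributes)
    return "/v2/search/relationship?" + urlencode(params, doseq=True)


path = relationship_search_path(
    guid="ac9e04cc-f927-4334-af08-c83bc3733f5b",
    relation="columns",
    attributes=["dcProfiledData"],
    sort_by="name",
    sort_order="ASCENDING",
)
print(path)
```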
CDPD-11681:
1. Filter search results by multiple entity types by passing a comma-separated string of type names in the request. For example: "typeName": "hive_table,hive_db".
2. Filter search results by multiple tags by passing a comma-separated string of tags in the request. For example: "classification": "tag1,tag2".
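The two filters above might be combined in a request body as sketched below. The helper name search_filter_body is hypothetical, and the sketch builds only the typeName and classification fields quoted above. The endpoint and any other request fields are not covered here.

```python
import json


def search_filter_body(type_names=(), classifications=()):
    """Build the filter fields of a search request body (hypothetical helper)."""
    body = {}
    if type_names:
        # Multiple entity types are joined into one comma-separated string.
        body["typeName"] = ",".join(type_names)
    if classifications:
        # Multiple tags are joined the same way.
        body["classification"] = ",".join(classifications)
    return body


body = search_filter_body(["hive_table", "hive_db"], ["tag1", "tag2"])
print(json.dumps(body))
```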
CDPD-13199: Incorrect attribute values in bulk import
When importing Business Metadata attribute assignments, Atlas used only the last assigned attribute value instead of individual values for each entity in the import list.
CDPD-372: All Spark Queries from the Same Spark Session were included in a Single Atlas Process
A Spark session can include multiple queries. Previously, when Atlas reported the Spark metadata, it created a single process entity to correspond to the Spark session. As a result, an Atlas lineage graph showed multiple input entities or multiple output entities for a process, but the inputs and outputs were related only by the fact that they were included in operations in the same Spark session. In this release, the Spark Atlas Connector produces a spark_application entity for each Spark job. Each data flow produced by the job creates a spark_process entity in Atlas, which tracks the actual input and output data sets for that process. For more information, see Spark metadata collection.
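The difference can be illustrated with a toy model. The classes, fields, and table names below are hypothetical and do not reflect the real Atlas type definitions. The sketch shows only that, with one process per data flow, lineage edges connect just the inputs and outputs that belong together.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SparkProcess:
    """Stands in for a spark_process entity: one data flow."""
    inputs: List[str]
    outputs: List[str]


@dataclass
class SparkApplication:
    """Stands in for a spark_application entity: one Spark job."""
    name: str
    processes: List[SparkProcess] = field(default_factory=list)


def lineage_edges(app: SparkApplication) -> List[Tuple[str, str]]:
    """Pair inputs with outputs within each process only."""
    return [(i, o) for p in app.processes for i in p.inputs for o in p.outputs]


# Hypothetical job with two unrelated data flows.
job = SparkApplication("nightly_etl")
job.processes.append(SparkProcess(["db.orders"], ["db.orders_clean"]))
job.processes.append(SparkProcess(["db.customers"], ["db.customer_summary"]))

print(lineage_edges(job))
```

In this model, db.orders is never linked to db.customer_summary, which is the kind of unrelated pairing that the single process entity per session used to produce.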
CDPD-12620: Migration progress bar not refreshed
During the import stage of Cloudera Navigator to Apache Atlas migration, the migration progress bar does not correctly refresh the migration status. The Statistics page in the Atlas UI displays the correct details of the migration.
This issue is resolved.
CDPD-10151: Short Spark job processes may be lost
On rare occasions, events captured for Atlas by the Spark Atlas Connector can be dropped before the metadata reaches Atlas. An event is more likely to be lost in very short-running jobs.
This issue is resolved.
CDPD-6042: Hive Default Database Location Incorrect in Atlas Metadata
The location of the default Hive database as reported through the HMS-Atlas plugin does not match the actual location of the database. This problem does not affect non-default databases.
This issue is resolved.
CDPD-4662: Lineage graph links not working
Atlas lineage graphs do not include hyperlinks from assets to the assets' detail pages, and clicking an asset does not produce an error in the log. Clicking an edge in a graph still provides access to edge behavior options, such as controlling how classifications propagate.
This issue is resolved.
CDPD-3700: Missing Impala and Spark lineage between tables and their data files
Atlas does not create lineage between Hive tables and their backing HDFS files for CTAS processes run in Impala or Spark.
This issue is resolved.
Additional Cloudera JIRAs: CDP-5027, CDPD-3700, and IMPALA-9070