Known Issues in Apache Atlas
This topic describes known issues and workarounds for using Atlas in this release of Cloudera Runtime.
- Atlas notifications to Ranger are missing propagated classifications
- When an entity is created or updated, Atlas correctly propagates classifications from the parent table or tables to the new entity. However, when Atlas notifies Ranger of the new table, the notification does not include the propagated classification. If Ranger includes a tag-based access policy that corresponds to the Atlas classification, the policy is not applied to the new table. For example, if you marked a table with a classification to indicate that it had sensitive data (such as "PII"), then used fields from that table to create another table in a CTAS operation, Atlas propagates the PII classification from the parent table to the new table. The data Atlas sends to Ranger does not have the propagated "PII" classification, and therefore Ranger does not apply the tag-based access policy to the new table.
- Incorrect attribute values in bulk import
- When importing Business Metadata attribute assignments, Atlas uses only the last assigned attribute value instead of the individual values for each entity in the import list. For example, setting Business Metadata attributes on entities as shown in the following import list results in all entities having the last values for the attributes: Processing.owner="FIN-admin" and Processing.track="standard".
  EntityType,EntityUniqueAttributeValue,BusinessAttributeName,BusinessAttributeValue,EntityUniqueAttributeName[optional]
  Table,customer_dim@cl1,Processing.owner,"IT-admin"
  Table,customer_dim@cl1,Processing.track,"PII"
  Table,log_fact_daily_mv@cl1,Processing.owner,"IT-admin"
  Table,log_fact_daily_mv@cl1,Processing.track,"daily"
  Table,time_dim@cl1,Processing.owner,"FIN-admin"
  Table,time_dim@cl1,Processing.track,"standard"
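One way to spot entities affected by this issue is to compare the values written in the import file with the values read back from Atlas after the import. The following is a minimal sketch of that comparison. The helper names `expected_assignments` and `find_mismatches` are hypothetical, and the `observed` dictionary here only simulates the behavior described above; in practice it would be filled from whatever mechanism you use to read entity attributes back from Atlas.

```python
import csv
import io

# The import rows from the example above (header row plus six assignments).
IMPORT_CSV = '''EntityType,EntityUniqueAttributeValue,BusinessAttributeName,BusinessAttributeValue,EntityUniqueAttributeName[optional]
Table,customer_dim@cl1,Processing.owner,"IT-admin"
Table,customer_dim@cl1,Processing.track,"PII"
Table,log_fact_daily_mv@cl1,Processing.owner,"IT-admin"
Table,log_fact_daily_mv@cl1,Processing.track,"daily"
Table,time_dim@cl1,Processing.owner,"FIN-admin"
Table,time_dim@cl1,Processing.track,"standard"
'''


def expected_assignments(text):
    """Return {entity: {attribute: value}} exactly as written in the import file."""
    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        entity = row["EntityUniqueAttributeValue"]
        result.setdefault(entity, {})[row["BusinessAttributeName"]] = row["BusinessAttributeValue"]
    return result


def find_mismatches(expected, actual):
    """List (entity, attribute, expected, actual) tuples where the values differ."""
    mismatches = []
    for entity, attrs in expected.items():
        for name, value in attrs.items():
            got = actual.get(entity, {}).get(name)
            if got != value:
                mismatches.append((entity, name, value, got))
    return mismatches


expected = expected_assignments(IMPORT_CSV)

# Simulated read-back (assumption): every entity carries the last assigned values.
last_values = {"Processing.owner": "FIN-admin", "Processing.track": "standard"}
observed = {entity: dict(last_values) for entity in expected}

for entity, name, want, got in find_mismatches(expected, observed):
    print(f"{entity}: {name} expected {want!r}, found {got!r}")
```

In this simulated case, the two attributes on customer_dim@cl1 and log_fact_daily_mv@cl1 are reported as mismatches, while time_dim@cl1 matches because its rows are the last ones in the file.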
- Migration progress bar not refreshed
- During the import stage of Cloudera Navigator to Apache Atlas migration, the migration progress bar does not correctly refresh the migration status. The Statistics page in the Atlas UI displays the correct details of the migration.
- Simultaneous events on the Kafka topic queue can produce duplicate Atlas entities
- In normal operation, Atlas receives metadata to create entities from multiple services on the same or separate Kafka topics. In some instances, such as for Spark jobs, metadata to create a table entity in Atlas is triggered from two separate messages: one for the Spark operation and a second for the table metadata from HMS. If the process metadata arrives before the table metadata, Atlas creates a temporary entity for any tables that are not already in Atlas and reconciles the temporary entity with the HMS metadata when the table metadata arrives.
- Deleted Business Metadata attributes appear in Search Suggestions
- Atlas search suggestions continue to show Business Metadata attributes even if the attributes have been deleted.
- Suggestion order doesn't match search weights
- At this time, the order of search suggestions does not honor the search weight for attributes.
- Hive Default Database Location Incorrect in Atlas Metadata
- The location of the default Hive database as reported through the HMS-Atlas plugin does not match the actual location of the database. This problem does not affect non-default databases.
- Unexpected Search Results When Using Regular Expressions in Basic Searches on Classifications
- When you include a regular expression or wildcard in the search criteria for a classification in the Basic Search, the results may differ unexpectedly from when full classification names are included. For example, the Exclude sub-classifications option is respected when using a full classification name as the search criteria; when using part of the classification name and the wildcard (*) with Exclude sub-classifications turned off, entities marked with sub-classifications are not included in the results. Other instances of unexpected results include case-sensitivity.
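To check whether a particular search is affected, you can run the same Basic Search twice, once with the full classification name and once with the wildcard form, and compare the two result sets. The sketch below only builds the request bodies and compares lists of entity GUIDs; it makes no network calls. The endpoint path and the field names (`classification`, `includeSubClassifications`, `excludeDeletedEntities`, `limit`) reflect the Atlas v2 Basic Search API as generally documented and should be verified against your release. The helper names and the "PII" and "PI*" values are illustrative assumptions.

```python
import json

# Endpoint path as commonly documented for Atlas v2; treat it as an assumption.
BASIC_SEARCH_PATH = "/api/atlas/v2/search/basic"


def basic_search_body(classification, include_sub_classifications, limit=100):
    """Build the JSON body for a Basic Search on a classification."""
    return {
        "classification": classification,
        "includeSubClassifications": include_sub_classifications,
        "excludeDeletedEntities": True,
        "limit": limit,
    }


def only_in_first(first_guids, second_guids):
    """Return GUIDs present in the first result set but not in the second."""
    return sorted(set(first_guids) - set(second_guids))


# Same sub-classification setting, full name versus wildcard.
full_name = basic_search_body("PII", include_sub_classifications=True)
wildcard = basic_search_body("PI*", include_sub_classifications=True)

print(BASIC_SEARCH_PATH)
print(json.dumps(full_name, indent=2))
print(json.dumps(wildcard, indent=2))
```

After submitting both bodies with the HTTP client of your choice, passing the returned GUID lists to `only_in_first` shows which entities appear for the full name but are missing from the wildcard search.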
- Spark metadata order may affect lineage
- Atlas may record unexpected lineage relationships when metadata collection from the Spark Atlas Connector occurs out of sequence from metadata collection from HMS. For example, if an ALTER TABLE operation in Spark changes a table name and is reported to Atlas before HMS has processed the change, Atlas may not show the correct lineage relationships to the altered table.
- Searches for Qualified Names with "@" don't fetch the correct results
- When searching Atlas qualifiedName values that include an "at" character (@), Atlas does not return the expected results or generate appropriate search suggestions.
- Missing Impala and Spark lineage between tables and their data files
- Atlas does not create lineage between Hive tables and their backing HDFS files for CTAS processes run in Impala or Spark.
- Table alias values are not found in search
- When table names are changed, Atlas keeps the old name of the table in a list of aliases. These values are not included in the search index in this release, so after a table name is changed, searching on the old table name will not return the entity for the table.
- Hive lineage missing for INSERT OVERWRITE queries
- Lineage is not generated for Hive INSERT OVERWRITE queries on partitioned tables. Lineage is generated as expected for CTAS queries from partitioned tables.
- Logging out of Atlas does not manage the external authentication
- At this time, Atlas does not communicate a log-out event to the external authentication manager, Apache Knox. When you log out of Atlas, you can still open the instance of Atlas from the same web browser without re-authenticating.
- Ranking of top results in free-text search not intuitive
- The Free-text search feature ranks results based on which attributes match the search criteria. The attribute ranking is evolving and therefore the choice of top results may not be intuitive in this release.
- Free text search in Atlas is case sensitive
- The free text search bar at the top of the screen allows you to search across entity types and through all text attributes for all entities. The search shows the top 5 results that match the search terms at any place in the text (*term* logic). It also shows suggestions that begin with the search terms (term* logic). However, in this release, the search results are case-sensitive.
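The difference between the two matching modes, and the effect of case sensitivity on each, can be illustrated with a small sketch. This is a plain-Python model of the *term* and term* logic described above, not Atlas's search implementation; the function names and the "Customer_Orders" value are hypothetical.

```python
def contains_match(term, text, case_sensitive=True):
    """*term* logic: the term may appear anywhere in the text."""
    if not case_sensitive:
        term, text = term.casefold(), text.casefold()
    return term in text


def prefix_match(term, text, case_sensitive=True):
    """term* logic: the text begins with the term."""
    if not case_sensitive:
        term, text = term.casefold(), text.casefold()
    return text.startswith(term)


values = ["customer_dim", "Customer_Orders", "log_fact_daily_mv"]

# Case-sensitive matching, as in this release: differently cased values are missed.
print([v for v in values if contains_match("customer", v)])
# Case-insensitive matching, shown only for comparison.
print([v for v in values if contains_match("customer", v, case_sensitive=False)])
```

With case-sensitive matching, a search for "customer" finds customer_dim but not Customer_Orders, so a search term may need to be entered with the same capitalization as the stored value.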
- Queries with ? wildcard return unexpected results
- DSL queries in Advanced Search return incorrect results when the query text includes a question mark (?) wildcard character. This problem occurs in environments where trusted proxy for Knox is enabled, which is always the case for CDP.
- Guest users are redirected incorrectly
- Authenticated users logging in to Atlas are redirected to the CDP Knox-based login page. However, if a guest user (without Atlas privileges) attempts to log in to Atlas, the user is redirected instead to the Atlas login page.
- IsUnique relationship attribute not honored
- The Atlas model includes the ability to ensure that an attribute can be set to a specific value in only one relationship entity across the cluster metadata. For example, if you wanted to add metadata tags to relationships and make sure that they were unique in the system, you could design the relationship attribute with the property "IsUnique" set to true. However, in this release, the IsUnique attribute is not enforced.
- All Spark Queries from the Same Spark Session are Included in a Single Atlas Process
- A Spark session can include multiple queries. When Atlas reports the Spark metadata, it creates a single process entity to correspond to the Spark session. The result is that an Atlas lineage picture may show multiple input entities or multiple output entities from a process that are only related by the fact that they were included in operations in the same Spark session. The consequence of this behavior is that classifications will be propagated from any input entity to all output entities, even if the output entities aren't derived from the input entity.
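The effect on classification propagation can be seen in a small model. The sketch below is an illustration only, not Atlas code: it treats a process as a set of inputs and a set of outputs and propagates every input classification to every output. The `propagate` helper and the output table names (customer_summary, calendar_report) are hypothetical.

```python
def propagate(process_inputs, process_outputs, classifications):
    """Propagate every input's classifications to every output of one process."""
    propagated = {out: set() for out in process_outputs}
    for src in process_inputs:
        for out in process_outputs:
            propagated[out] |= classifications.get(src, set())
    return propagated


# Only customer_dim carries a classification.
classifications = {"customer_dim": {"PII"}}

# Two unrelated queries run in the same Spark session.
query_1 = (["customer_dim"], ["customer_summary"])
query_2 = (["time_dim"], ["calendar_report"])

# One process per query: calendar_report receives nothing.
per_query = {}
for inputs, outputs in (query_1, query_2):
    per_query.update(propagate(inputs, outputs, classifications))

# One process per session, as described above: calendar_report receives PII.
session_inputs = query_1[0] + query_2[0]
session_outputs = query_1[1] + query_2[1]
per_session = propagate(session_inputs, session_outputs, classifications)

print(per_query)
print(per_session)
```

In the per-session case, calendar_report is marked PII even though it is derived only from time_dim, which is the over-propagation this issue describes.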