Known issues

Learn about the known issues in Data Catalog, the impact or changes to the functionality, and the workaround.


CDPDSS-1953: Location field does not load absolute path for entities in contents tab
Problem: After you navigate to an Azure blob that was created within an Azure directory and click the "Content" tab, the path for the Azure blob is displayed as "/" instead of the full path: containers/directory/Blob.
CDPDSS-1956: Issue with count on search page as AWS S3 V2 Object
Problem: The count against the Data Lake displays only the count of AWS S3 V2 Object entity type.
CDPDSS-31173: Atlas API on saving the classifications in Data Catalog is failing while using Cloudera Runtime version 7.2.12
Problem: When a classification suggested by the Custom Sensitivity Profiler is saved, the Atlas API returns an error response and the classification is not saved in Data Catalog. This behavior is observed specifically in Cloudera Runtime version 7.2.12.
Workaround: If you are using version 7.2.12, go to Cloudera Manager > Atlas > Configuration, search for 'application.prop', and enter the value "atlas.tasks.enabled=false". Restart Atlas, then log in to Data Catalog to use the classification feature.
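Before restarting Atlas, it can help to confirm that the override text actually sets the property as intended. The following is a minimal sketch, not a Cloudera tool: the helper names `parse_properties` and `tasks_disabled` and the sample snippet are hypothetical, and it assumes the override is entered as plain key=value lines.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' separator
            props[key.strip()] = value.strip()
    return props


def tasks_disabled(snippet):
    """Return True if the snippet sets atlas.tasks.enabled to false."""
    value = parse_properties(snippet).get("atlas.tasks.enabled", "")
    return value.lower() == "false"


# Hypothetical override text, as it might be pasted into Cloudera Manager.
example_snippet = """
# Workaround for CDPDSS-31173 on Cloudera Runtime 7.2.12
atlas.tasks.enabled=false
"""

print(tasks_disabled(example_snippet))
```

The check only inspects the text you pass in; it does not query Cloudera Manager or Atlas.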
CDPDSS-2047: Issue with Database connectivity while using Azure to launch Profilers in HA mode
Problem: Launching Profilers in HA mode, either as part of SDX or DataHub, causes a database connectivity issue.
CDPDSS-2127: On profiler cluster creation step 'external database in creation' the message on search page is not displayed
Problem: The message "profilers cluster is getting created" is not displayed in the blue section of the Data Catalog search page.
CDPDSS-2138: ODP fails on an asset imported from Glue to CDP on a medium duty cluster
Problem: The profiler run fails on the table.
CDPDSS-2142: Livy job in Profiler run has failed but profiling status of assets is shown as started
Problem: The Livy job fails, but each asset shows its profiling status as started.
CDPD-32000: Atlas intermittently does not send the classification information to Data Catalog
Problem: When the search page filters load with the 'entity type' filter set to 'Hive', the tags/classifications intermittently fail to load in the Data Catalog app. As a result, Data Catalog does not load the filters for 'entity tag' and 'column tag'.
CDPDSS-2145: Applying column tag filter and typing the same search result in search field gives ‘no results’ found
Problem: Searching for an asset that appears in the results of a column tag filter returns 'No results found'.
CDPDSS-2236: Issues noticed on data-catalog while performing scale testing
Problem:
  • The search page consistently throws a 504 gateway time-out error; however, when you use the global search field or apply search filters, results are loaded.
  • The Hive Column profiler goes to the “undetermined” state for both ODP and scheduled jobs (intermittent). This happens because the Profiler does not finish reading events from the eventstore in AWS S3 within the time-out limit. To fix this, stop and restart the Profiler.
  • On the Asset Details page, when ODP is triggered and the jobs complete successfully, the ODP widget is not displayed after a page refresh, and the user needs to refresh a couple of times to load it in the UI (reproducible). In this scenario, the ODP job runs in parallel with scheduled jobs.
  • Assets cannot be added to a dataset because the API fails with a gateway time-out (intermittent).
Workaround: For the first and fourth issues listed, apply the following configuration changes for Atlas in Cloudera Manager:
atlas.graph.index.search.types.max-query-str-length=1536
Heap size = 4 GB
For the third issue listed, refresh the Asset Details page about three to four times.