Apache Atlas provides data governance capabilities for Hadoop. It serves as a common metadata store designed to exchange metadata both within and outside of the Hadoop stack. Close integration with Apache Ranger lets you define, administer, and manage security and compliance policies consistently across all components of the Hadoop stack. Atlas also provides metadata and lineage to Data Steward Studio to support curating data across the enterprise.
Use the metadata that Atlas collects, together with the metadata you add, to find entities effectively.
Lineage shows where data came from and helps you determine the impact of changes to data assets.
Use Apache Atlas to search for, annotate, classify, and manage data.
Business Metadata allows you to extend the model that represents a given asset type in Atlas. Sets of business metadata can be authorized independently through Ranger, so you can manage who is able to update which business metadata attributes.
Collecting your organization's terms in Atlas helps you build a search index to easily find the data assets you are looking for.
Configure Atlas High Availability (HA) for your clusters.
Configure Atlas' extractors, monitor status, and access logs using Cloudera Manager.
Configure Atlas' authentication and authorization through Cloudera Manager and with access policies in Apache Ranger. With CDP Cloud, authentication is configured for you using FreeIPA; you'll still want to review and customize Atlas policies in Ranger to meet your organization's requirements.
When upgrading a cluster from CDH to CDP, you can choose to move your Navigator Data Management metadata into Atlas.
Describes the Atlas metadata extractor for S3, which you can run on an Atlas host to provide comprehensive metadata for data assets stored in S3.
Describes the Atlas metadata extractor for ADLS, which you can run on an Atlas host to provide comprehensive metadata for data assets stored in ADLS.
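
The search and lineage topics above can also be explored through the Atlas REST API. The following is a minimal sketch, not a definitive client: it only builds request URLs for the v2 basic search and lineage endpoints and makes no network calls. The host name, port, GUID, and the helper names `basic_search_url` and `lineage_url` are illustrative assumptions; check the endpoint parameters against the REST documentation for your Atlas version, and supply whatever authentication your cluster requires.

```python
from urllib.parse import quote, urlencode

# Placeholder base URL (assumption); replace with your Atlas server address.
ATLAS_BASE = "http://atlas.example.com:21000"


def basic_search_url(base, query, type_name=None, classification=None, limit=25):
    """Build a URL for the Atlas v2 basic search endpoint."""
    params = {"query": query, "limit": limit}
    if type_name:
        params["typeName"] = type_name  # e.g. "hive_table"
    if classification:
        params["classification"] = classification  # e.g. "PII"
    return f"{base}/api/atlas/v2/search/basic?{urlencode(params)}"


def lineage_url(base, guid, direction="BOTH", depth=3):
    """Build a URL for the Atlas v2 lineage endpoint for one entity GUID."""
    if direction not in {"INPUT", "OUTPUT", "BOTH"}:
        raise ValueError("direction must be INPUT, OUTPUT, or BOTH")
    params = {"direction": direction, "depth": depth}
    return f"{base}/api/atlas/v2/lineage/{quote(guid, safe='')}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical search term and GUID, for illustration only.
    print(basic_search_url(ATLAS_BASE, "sales", type_name="hive_table"))
    print(lineage_url(ATLAS_BASE, "example-guid", direction="INPUT"))
```

The resulting URLs can be passed to any HTTP client. Keeping URL construction separate from the request itself makes it easier to plug in the authentication method configured for your cluster.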