Understanding Datasets
A Dataset is a group of assets that match a set of search criteria, grouped so that you can manage and administer them collectively.
Datasets enable you to perform the following tasks when working with your data:
- Organize: Group data assets into Datasets based on business classification, purpose, protection, relevance, and so on.
- Search: Find tags or assets in your data lake using Hive assets, attribute facets, or free text. Advanced asset search uses facets of technical and business metadata about the assets, such as those captured in Apache Atlas, to help users define and build collections of interest. The conditions available in advanced search are a subset of the attributes of the Apache Atlas type hive_table.
- Understand: Audit data asset security and usage to support anomaly detection, forensic audit and compliance, and proper control mechanisms.
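To make the search behavior concrete, the following is a minimal sketch of filtering assets by attribute facets, tags, and free text. It is an in-memory illustration only, not the product's search implementation: the `Asset` record, its fields (`name`, `owner`, `db`, `tags`), and the `search_assets` function are hypothetical names chosen for this example, loosely modeled on the kind of attributes a hive_table entity carries.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    # Hypothetical, simplified stand-in for a Hive table asset.
    name: str
    owner: str
    db: str
    tags: frozenset = frozenset()


def search_assets(assets, facets=None, tags=None, text=None):
    """Return assets that satisfy every supplied condition.

    facets: mapping of attribute name -> required value (exact match)
    tags:   tags that must all be present on the asset
    text:   case-insensitive substring to look for in the asset name
    """
    facets = facets or {}
    required_tags = set(tags or [])
    needle = (text or "").lower()

    results = []
    for asset in assets:
        if any(getattr(asset, attr) != value for attr, value in facets.items()):
            continue  # an attribute facet did not match
        if not required_tags <= asset.tags:
            continue  # a required tag is missing
        if needle and needle not in asset.name.lower():
            continue  # free text not found in the name
        results.append(asset)
    return results


catalog = [
    Asset("customers", "etl", "sales", frozenset({"PII"})),
    Asset("orders", "etl", "sales"),
    Asset("web_logs", "ops", "raw", frozenset({"PII"})),
]

matches = search_assets(catalog, facets={"db": "sales"}, tags={"PII"})
print([a.name for a in matches])  # ['customers']
```

Combining conditions narrows the result, which is how a set of search criteria ends up defining a collection of interest.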
You can edit a Dataset after you create it, and the assets contained in the Dataset are updated to reflect the change. Datasets support the full set of CRUD (Create, Read, Update, Delete) operations.
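The Dataset lifecycle can be sketched as a small in-memory registry. This is an illustration under stated assumptions, not the product's API: `DatasetRegistry` and its methods are hypothetical, and the sketch assumes that a Dataset stores its search criteria and resolves its member assets from them when it is read, so that editing the criteria changes the membership.

```python
class DatasetRegistry:
    """Hypothetical in-memory store illustrating CRUD on Datasets."""

    def __init__(self, catalog):
        self._catalog = catalog  # list of dicts, one per asset
        self._criteria = {}      # Dataset name -> attribute filters

    def create(self, name, criteria):
        if name in self._criteria:
            raise ValueError(f"Dataset {name!r} already exists")
        self._criteria[name] = dict(criteria)

    def read(self, name):
        # Assumption: membership is resolved from the stored criteria.
        criteria = self._criteria[name]
        return [
            asset["name"]
            for asset in self._catalog
            if all(asset.get(attr) == value for attr, value in criteria.items())
        ]

    def update(self, name, criteria):
        if name not in self._criteria:
            raise KeyError(name)
        self._criteria[name] = dict(criteria)

    def delete(self, name):
        del self._criteria[name]


assets = [
    {"name": "customers", "db": "sales", "owner": "etl"},
    {"name": "orders", "db": "sales", "owner": "etl"},
    {"name": "web_logs", "db": "raw", "owner": "ops"},
]

registry = DatasetRegistry(assets)
registry.create("sales_data", {"db": "sales"})
before = registry.read("sales_data")   # ['customers', 'orders']

registry.update("sales_data", {"owner": "ops"})
after = registry.read("sales_data")    # ['web_logs']

registry.delete("sales_data")
```

Storing criteria rather than a fixed list of assets is one possible design; a real implementation might instead persist explicit membership and refresh it on edit.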