Apache Atlas Advanced Search language reference
Atlas lets you search for metadata using a domain-specific language with a SQL-like format.
If you find that the Basic Search or Free-text Search doesn't allow you to search as precisely as you would like, you can create a query in the Advanced Search interface to return exactly the results you are looking for. Advanced Search queries use a domain-specific language that is intentionally SQL-like.
Each Advanced Search query takes the form of three clauses: FROM, WHERE, and SELECT.
Additional keywords such as GROUPBY, ORDERBY, and LIMIT can be used to affect the output.
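The way the three clauses fit together can be sketched with a small Python helper. This is only an illustration of the query shape described above: the function name build_query and its parameters are hypothetical and are not part of Atlas.

```python
from typing import Optional, Sequence


def build_query(entity_type: str,
                where: Optional[str] = None,
                select: Optional[Sequence[str]] = None) -> str:
    """Assemble a query string from its clauses (hypothetical helper, not an Atlas API)."""
    entity_type = entity_type.strip()
    if not entity_type:
        # The FROM clause is the only required clause.
        raise ValueError("the FROM clause needs an entity type")
    parts = [f"from {entity_type}"]
    if where:
        parts.append(f"where {where}")
    if select:
        parts.append("select " + ", ".join(select))
    return " ".join(parts)


print(build_query("hive_db"))
print(build_query("hive_table", where="name = 'time_dim'"))
print(build_query("hive_table",
                  where="name = 'customer_dim'",
                  select=["owner", "name", "qualifiedName"]))
```

The resulting strings can be pasted into the Advanced Search query box.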
The value specified in the FROM clause acts as the scope of the query. You can specify any entity type in the FROM clause. The available entity types are the same as those listed in the Type search; type names are case-sensitive.
The FROM clause is required, but the keyword itself is optional: if the query does not begin with the word "from", the first item in the query is treated as the object of the FROM clause.
With or without FROM: To retrieve all entities of type "hive_db" use one of the following queries:
hive_db
from hive_db
If you only specify a FROM clause, Atlas returns all entities of that type.
The WHERE clause filters the result set identified by the FROM clause using a condition of the form:
identifier operator 'literal'
The identifier is the name of a property of the entity type specified in the FROM clause. The properties for a given entity type are those shown in the Properties tab of an entity detail page. The names are case-sensitive.
Operators vary by the data type of the literal and include the following:
String: = LIKE
Numeric, Date: = < >
The LIKE operator allows you to use wildcards in the literal. An asterisk (*) matches zero or more characters; a question mark (?) matches exactly one character.
String literals must be enclosed in single or double quotes. Matches are case-sensitive. A literal can also be a list of values: comma-separated values in square brackets act as an OR operation.
Dates used in literals must be specified in ISO 8601 format and enclosed in single or double quotes.
Boolean values used in literals are lower case "true" and "false" without quotation marks.
You can specify multiple conditions using AND or OR operators. Note that making a list of values is more efficient than using the same identifier in multiple conditions.
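The quoting rules above can be sketched as a Python function that renders a value as a literal. The names format_literal and condition are hypothetical. Numeric values are deliberately left out because the rules above do not say how they are written, and strings containing a double quote are rejected because escaping is not covered here.

```python
from datetime import date
from typing import Any


def format_literal(value: Any) -> str:
    """Render a Python value as a query literal (hypothetical helper)."""
    if isinstance(value, bool):
        # Booleans are lower case and unquoted.
        return "true" if value else "false"
    if isinstance(value, date):
        # date and datetime both provide ISO 8601 output via isoformat().
        return f"'{value.isoformat()}'"
    if isinstance(value, str):
        if '"' in value:
            raise ValueError("escaping of embedded quotes is not covered by this sketch")
        return f'"{value}"'
    if isinstance(value, (list, tuple)):
        # A bracketed, comma-separated list acts as an OR over its values.
        return "[" + ", ".join(format_literal(v) for v in value) + "]"
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


def condition(identifier: str, operator: str, value: Any) -> str:
    """Build one 'identifier operator literal' condition."""
    return f"{identifier} {operator} {format_literal(value)}"


print(condition("name", "=", ["customer_dim", "time_dim"]))
print(condition("isFile", "=", True))
print(condition("createTime", ">", date(2019, 1, 1)))
```

Passing a list produces a single condition, which is the form recommended above over repeating the same identifier in several OR conditions.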
Exact string: To retrieve all entities of type hive_table with a specific name "time_dim", use:
from hive_table where name = 'time_dim'
Multiple conditions: To retrieve entities of type hive_table whose name is either "time_dim" or "customer_dim":
from hive_table where name = 'time_dim' or name = 'customer_dim'
List of values: The query in the example above can be written using a value array:
from hive_table where name = ["customer_dim", "time_dim"]
Wildcard filtering: To retrieve entities of type hive_table whose name ends with '_dim':
from hive_table where name LIKE '*_dim'
To retrieve entities of type hive_db whose name starts with R, followed by any three characters, followed by rt, followed by at least one more character:
from hive_db where name like "R???rt?*"
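To check which names a wildcard pattern would select before running a search, the * and ? semantics described above can be approximated locally with Python's standard fnmatch module. This is an illustration only, not the matcher Atlas uses: fnmatch also treats square brackets specially, so the comparison holds only for patterns built from * and ?. The sample names are made up.

```python
from fnmatch import fnmatchcase


def like(value: str, pattern: str) -> bool:
    """Case-sensitive wildcard match: * is zero or more characters, ? is exactly one."""
    return fnmatchcase(value, pattern)


# Hypothetical names, used only to show which ones each pattern selects.
for name in ["time_dim", "customer_dim", "dim_time"]:
    print(name, like(name, "*_dim"))

for name in ["Reporting", "Report", "reporting"]:
    print(name, like(name, "R???rt?*"))
```

"Report" does not match the second pattern because the final ? requires at least one character after "rt", and "reporting" does not match because the comparison is case-sensitive.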
Date Literal: To retrieve entities of type hive_table created after 2019-01-01 and before 2019-01-03, use the date portion of the time value and specify a range using two conditions connected by AND:
from hive_table where createTime > '2019-01-01' and createTime < '2019-01-03'
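A date range like the one above can be generated from Python date objects, which produce ISO 8601 strings through isoformat(). The helper name date_range is hypothetical. Because the operators listed for dates are =, <, and >, the sketch treats both bounds as exclusive.

```python
from datetime import date


def date_range(identifier: str, after: date, before: date) -> str:
    """Build 'identifier > start and identifier < end' with ISO 8601 date literals."""
    if after >= before:
        raise ValueError("the lower bound must be earlier than the upper bound")
    return (f"{identifier} > '{after.isoformat()}' "
            f"and {identifier} < '{before.isoformat()}'")


print("from hive_table where " + date_range("createTime", date(2019, 1, 1), date(2019, 1, 3)))
```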
Boolean Literal: To retrieve entities of type hdfs_path whose isFile attribute is set to true and whose name is Invoice:
from hdfs_path where isFile = true and name = "Invoice"
The SELECT clause allows you to specify the properties you want returned in the search results. Properties with simple values can be returned; properties that contain other entities are not available. The property names are case-sensitive.
To display column headers that are more meaningful than the system property names, you can assign aliases using the "as" keyword.
Select clause only: To retrieve entities of type "hive_table" with some of its properties:
from hive_table select owner, name, qualifiedName
WHERE and SELECT clauses: To retrieve selected properties of a specific entity of type hive_table:
from hive_table where name = 'customer_dim' select owner, name, qualifiedName
Change output names using AS: To display column headers as 'Owner', 'Name' and 'FullName':
from hive_table select owner as Owner, name as Name, qualifiedName as FullName
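When the same set of columns and headers is reused across queries, the SELECT clause can be generated from a mapping of property names to aliases. A minimal sketch follows; the helper name select_clause is hypothetical, and a value of None means the property is returned under its own name.

```python
from typing import Mapping, Optional


def select_clause(columns: Mapping[str, Optional[str]]) -> str:
    """Build a SELECT clause; each key is a property, each value an optional alias."""
    if not columns:
        raise ValueError("at least one property is required")
    items = [f"{prop} as {alias}" if alias else prop
             for prop, alias in columns.items()]
    return "select " + ", ".join(items)


headers = {"owner": "Owner", "name": "Name", "qualifiedName": "FullName"}
print("from hive_table " + select_clause(headers))
print("from hive_table " + select_clause({"owner": None, "name": None}))
```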
Advanced Searches using Classifications
You can search for entities that are tagged with a specific classification by using the "is" or "isa" keyword in either the FROM or WHERE clause. The two keywords are interchangeable.
FROM or WHERE clause: To retrieve all entities of type "hive_table" that are tagged with the "Dimension" classification, you could use any of the following queries:
hive_table is Dimension
from hive_table where hive_table isa Dimension
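The two placements of a classification filter can be sketched as a Python helper that produces either form. The function name classified and its parameters are hypothetical; only the query strings themselves follow the forms shown above.

```python
def classified(entity_type: str, classification: str,
               in_where: bool = False, keyword: str = "is") -> str:
    """Build a classification query in the FROM form or the WHERE form."""
    if keyword not in ("is", "isa"):
        raise ValueError("keyword must be 'is' or 'isa'")
    if in_where:
        # WHERE form: the entity type is repeated as the subject of the condition.
        return f"from {entity_type} where {entity_type} {keyword} {classification}"
    # FROM form: the classification directly follows the entity type.
    return f"{entity_type} {keyword} {classification}"


print(classified("hive_table", "Dimension"))
print(classified("hive_table", "Dimension", in_where=True, keyword="isa"))
```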