About Metadata Search
Search in the Navigator Metadata component is implemented by an embedded Solr engine that supports the syntax described in LuceneQParserPlugin.
Search Syntax
You construct search strings by specifying the value of a default property, property name-value pairs, or user-defined name-value pairs using the syntax:
- Property name-value pairs - propertyName:value, where
  - propertyName is one of the properties listed in Search Properties.
  - value is a single value or range of values specified as [value1 TO value2]. In a value, * is a wildcard. In property name-value pairs, you must escape the special characters :, /, and * with the backslash character \. For example, fileSystemPath:\/user\/admin.
- User-defined name-value pairs - up_propertyName:value.
To construct complex strings, join multiple property-value pairs using the or and and operators.
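The escaping and joining rules above can be sketched in a few lines of Python. This is only an illustration of how the strings are assembled, not part of Navigator: the helper names (escape_value, pair, join_all, join_any) are hypothetical. The escape flag is an assumption that a * meant to act as a wildcard is left unescaped, as in the description:Banking* example.

```python
import re

# The three characters the syntax rules say must be escaped in a value.
_SPECIAL = re.compile(r"([:/*])")


def escape_value(value: str) -> str:
    """Prefix each ':', '/', and '*' in value with a backslash."""
    return _SPECIAL.sub(r"\\\1", value)


def pair(name: str, value: str, user_defined: bool = False, escape: bool = True) -> str:
    """Build propertyName:value, or up_propertyName:value for user-defined pairs.

    Pass escape=False when the value already contains an intentional
    wildcard or a [value1 TO value2] range (hypothetical convention).
    """
    prefix = "up_" if user_defined else ""
    rendered = escape_value(value) if escape else value
    return f"{prefix}{name}:{rendered}"


def join_all(*clauses: str) -> str:
    """Join clauses with the 'and' operator."""
    return " and ".join(clauses)


def join_any(*clauses: str) -> str:
    """Join clauses with the 'or' operator."""
    return " or ".join(clauses)


if __name__ == "__main__":
    print(pair("fileSystemPath", "/user/admin"))
    print(join_any(pair("sourceType", "MAPREDUCE"), pair("sourceType", "HIVE")))
    print(pair("project", "customer1", user_defined=True))
```

Keeping the escaping in one function means that paths coming from user input are always escaped the same way before they are placed in a search string.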
Example Search Strings
- Filesystem path /user/admin - fileSystemPath:\/user\/admin
- Descriptions that start with the string "Banking" - description:Banking*
- Sources of type MapReduce or Hive - sourceType:MAPREDUCE or sourceType:HIVE
- Directories owned by hdfs in the path /user/hdfs/input - owner:HDFS and type:directory and fileSystemPath:\/user\/hdfs\/input
- Jobs started between 20:00 and 21:00 UTC - started:[2013-10-21T20:00:00.000Z TO 2013-10-21T21:00:00.000Z]
- User-defined key-value project-customer1 - up_project:customer1
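Timestamp ranges such as the started example above are easy to get wrong by hand. The following Python sketch formats two datetime values as a range in the timestamp format used in that example. The function names (solr_timestamp, time_range) are hypothetical. The sketch assumes timezone-aware datetimes and rejects naive ones rather than guessing a time zone.

```python
from datetime import datetime, timezone


def solr_timestamp(moment: datetime) -> str:
    """Format an aware datetime as YYYY-MM-DDThh:mm:ss.SSSZ in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = moment.astimezone(timezone.utc)
    millis = utc.microsecond // 1000  # truncate microseconds to milliseconds
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


def time_range(name: str, start: datetime, end: datetime) -> str:
    """Build a range search string of the form name:[start TO end]."""
    return f"{name}:[{solr_timestamp(start)} TO {solr_timestamp(end)}]"


if __name__ == "__main__":
    start = datetime(2013, 10, 21, 20, 0, tzinfo=timezone.utc)
    end = datetime(2013, 10, 21, 21, 0, tzinfo=timezone.utc)
    print(time_range("started", start, end))
```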
Search Properties
Default Properties
The following properties can be searched by simply specifying a property value: type, fileSystemPath, inputs, jobId, mapper, mimeType, name, originalName, outputs, owner, principal, reducer, tags.
Common Properties
Name | Type | Description |
---|---|---|
description | text | Description of the entity. |
group | caseInsensitiveText | The group to which the owner of the entity belongs. |
name | ngramedText | The overridden name of the entity. If the name has not been overridden, this value is empty. Names cannot contain spaces. |
operationType | ngramedText | The type of an operation. |
originalName | ngramedText | The name of the entity when it was extracted. |
originalDescription | text | The description of the entity when it was extracted. |
owner | caseInsensitiveText | The owner of the entity. |
principal | caseInsensitiveText | For entities with type OPERATION_EXECUTION, the initiator of the entity. |
tags | ngramedText | A set of tags that describe the entity. |
type | ngramedText | The type of the entity. The available types depend on the entity's source type. |
Query | ||
queryText | string | The text of a Hive or Sqoop query. |
Source | ||
clusterName | string | The name of the cluster in which the entity is stored. |
sourceId | string | The ID of the source type. |
sourceType | caseInsensitiveText | The source type of the entity: HDFS, HIVE, MAPREDUCE, OOZIE, PIG, SQOOP, YARN. |
sourceUrl | string | The URL of the source type. |
Timestamps | ||
(The available timestamp fields vary by the source type.) | date | Timestamps in the Solr Date Format. For example: started:[2013-10-21T20:00:00.000Z TO 2013-10-21T21:00:00.000Z] |
HDFS Properties
Name | Type | Description |
---|---|---|
fileSystemPath | path | The path to the entity. |
compressed | Boolean | Indicates whether the entity is compressed. |
deleted | Boolean | Indicates whether the entity has been moved to the Trash folder. |
deleteTime | date | The time the entity was moved to the Trash folder. |
mimeType | ngramedText | The MIME type of the entity. |
parentPath | string | The path to the parent entity for a child entity. For example, the parent path /default/sample_07 for the table sample_07 from the Hive database default. |
permissions | string | The UNIX access permissions of the entity. |
size | long | The exact size of the entity in bytes or a range of sizes. Range examples: size:[1000 TO *], size:[* TO 2000], and size:[* TO *] to find all entities with a size value. |
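The open-ended size ranges in the table can be generated from optional bounds. A minimal Python sketch, assuming a hypothetical helper named size_range in which a missing bound becomes the * wildcard:

```python
from typing import Optional


def size_range(min_bytes: Optional[int] = None, max_bytes: Optional[int] = None) -> str:
    """Build size:[low TO high]; a bound left as None is rendered as '*'."""
    if min_bytes is not None and max_bytes is not None and min_bytes > max_bytes:
        raise ValueError("min_bytes must not exceed max_bytes")
    low = "*" if min_bytes is None else str(min_bytes)
    high = "*" if max_bytes is None else str(max_bytes)
    return f"size:[{low} TO {high}]"


if __name__ == "__main__":
    print(size_range(min_bytes=1000))
    print(size_range(max_bytes=2000))
    print(size_range())
```

Calling size_range() with no arguments yields the size:[* TO *] form, which matches any entity that has a size value.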
MAPREDUCE and YARN Properties
Name | Type | Description |
---|---|---|
inputRecursive | Boolean | Indicates whether files are searched recursively under the input directories or only files directly under the input directories are considered. |
jobId | ngramedText | The ID of the job. For a job spawned by Oozie, the workflow ID. |
mapper | string | The fully-qualified name of the mapper class. |
outputKey | string | The fully-qualified name of the class of the output key. |
outputValue | string | The fully-qualified name of the class of the output value. |
reducer | string | The fully-qualified name of the reducer class. |
OPERATION Properties
Name | Type | Description |
---|---|---|
Operation | ||
inputFormat | string | The fully-qualified name of the class of the input format. |
outputFormat | string | The fully-qualified name of the class of the output format. |
Operation Execution | ||
inputs | string | The name of the entity input to an operation execution. For entities of resource type MR, it is usually a directory. For entities of resource type Hive, it is usually a table. |
outputs | string | The name of the entity output from an operation execution. For entities of resource type MR, it is usually a directory. For entities of resource type Hive, it is usually a table. |
HIVE Properties
Name | Type | Description |
---|---|---|
Field | ||
dataType | ngramedText | The type of data stored in a field (column). |
Table | ||
compressed | Boolean | Indicates whether a Hive table is compressed. |
serDeLibName | string | The name of the library containing the SerDe class. |
serDeName | string | The fully-qualified name of the SerDe class. |
Partition | ||
partitionColNames | string | The table columns that define the partition. |
partitionColValues | string | The table column values that define the partition. |
OOZIE Properties
Name | Type | Description |
---|---|---|
status | string | The status of the Oozie workflow: RUNNING, SUCCEEDED, or FAILED. |
SQOOP Properties
Name | Type | Description |
---|---|---|
dbURL | string | The URL of the database from or to which the data was imported or exported. |
dbTable | string | The table from or to which the data was imported or exported. |
dbUser | string | The database user. |
dbWhere | string | The where clause that identifies which rows were imported. |
dbColumnExpression | string | An expression that identifies which columns were imported. |