Metadata Search Syntax and Properties

Search in the Navigator Metadata component is implemented by an embedded Solr engine that supports the syntax described in LuceneQParserPlugin.

Search Syntax

You construct search strings by specifying the value of a default property, property name-value pairs, or user-defined name-value pairs using the syntax:

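  • Default property values - value, where value is matched against the properties listed in Default Properties.
  • Property name-value pairs - propertyName:value, where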
    • propertyName is one of the properties listed in Search Properties.
    • value is a single value or a range of values specified as [value1 TO value2]. In a value, * is a wildcard. To match the special characters :, -, /, and * literally in a property value, escape each one with a backslash (\) or enclose the property value in quotes. For example, fileSystemPath:\/tmp\/hbase\-staging.
  • User-defined name-value pairs - up_propertyName:value.

To construct complex strings, join multiple property-value pairs with the operators "or" and "and".

Example Search Strings

  • Filesystem path /user/admin - fileSystemPath:\/user\/admin
  • Descriptions that start with the string "Banking" - description:Banking*
  • Sources of type MapReduce or Hive - sourceType:MAPREDUCE or sourceType:HIVE
  • Directories owned by hdfs in the path /user/hdfs/input - owner:HDFS and type:directory and fileSystemPath:\/user\/hdfs\/input
  • Jobs started between 20:00 and 21:00 UTC - started:[2013-10-21T20:00:00.000Z TO 2013-10-21T21:00:00.000Z]
  • User-defined property project with the value customer1 - up_project:customer1
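
The example strings above can be assembled programmatically. The following is a minimal Python sketch, not part of Navigator: the helper names escape_value, pair, user_defined, and join are hypothetical, and the sketch escapes only the four characters named in the syntax description (:, -, /, and *).

```python
import re

# The four characters the syntax description says must be escaped.
_SPECIAL = re.compile(r"([:\-/*])")
# Same set without *, for values where * is intended as a wildcard.
_SPECIAL_KEEP_STAR = re.compile(r"([:\-/])")


def escape_value(value, keep_wildcards=False):
    """Backslash-escape :, -, / and (unless kept as a wildcard) * in a value."""
    pattern = _SPECIAL_KEEP_STAR if keep_wildcards else _SPECIAL
    return pattern.sub(r"\\\1", value)


def pair(name, value, keep_wildcards=False):
    """Build a propertyName:value clause with the value escaped."""
    return f"{name}:{escape_value(value, keep_wildcards)}"


def user_defined(name, value):
    """Build an up_propertyName:value clause for a user-defined property."""
    return pair("up_" + name, value)


def join(operator, *clauses):
    """Join clauses with the given operator ("and" or "or")."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    return f" {operator} ".join(clauses)


if __name__ == "__main__":
    print(pair("fileSystemPath", "/user/admin"))
    print(pair("description", "Banking*", keep_wildcards=True))
    print(join("or", pair("sourceType", "MAPREDUCE"), pair("sourceType", "HIVE")))
    print(user_defined("project", "customer1"))
```

Passing keep_wildcards=True leaves * unescaped so that it still acts as a wildcard; the default treats * as a literal character.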

Search Properties

Default Properties

The following properties can be searched by simply specifying a property value: type, fileSystemPath, inputs, jobId, mapper, mimeType, name, originalName, outputs, owner, principal, reducer, tags.

Common Properties

Name Type Description
description text Description of the entity.
group caseInsensitiveText The group to which the owner of the entity belongs.
name ngramedText The overridden name of the entity. If the name has not been overridden, this value is empty. Names cannot contain spaces.
operationType ngramedText The type of an operation:
  • Pig - SCRIPT
  • Sqoop - Table Export, Query Import
originalName ngramedText The name of the entity when it was extracted.
originalDescription text The description of the entity when it was extracted.
owner caseInsensitiveText The owner of the entity.
principal caseInsensitiveText For entities with type OPERATION_EXECUTION, the initiator of the entity.
tags ngramedText A set of tags that describe the entity.
type ngramedText The type of the entity. The available types depend on the entity's source type:
  • HDFS - DIRECTORY, FILE
  • HIVE - DATABASE, TABLE, FIELD, OPERATION, OPERATION_EXECUTION, SUB_OPERATION, PARTITION, RESOURCE, UNKNOWN, VIEW
  • MAPREDUCE - OPERATION, OPERATION_EXECUTION
  • OOZIE - OPERATION, OPERATION_EXECUTION
  • PIG - OPERATION, OPERATION_EXECUTION
  • SQOOP - OPERATION, OPERATION_EXECUTION, SUB_OPERATION
  • YARN - OPERATION, OPERATION_EXECUTION
Query
queryText string The text of a Hive or Sqoop query.
Source
clusterName string The name of the cluster in which the entity is stored.
sourceId string The ID of the source type.
sourceType caseInsensitiveText The source type of the entity: HDFS, HIVE, MAPREDUCE, OOZIE, PIG, SQOOP, YARN.
sourceUrl string The URL of the source type.
Timestamps
The available timestamp fields vary by the source type:
  • HDFS - lastModified, lastAccessed
  • HIVE - created, lastAccessed
  • MAPREDUCE, PIG, SQOOP, and YARN - started, ended
All of these fields have the type date, and their values are timestamps in the Solr Date Format. For example:
  • lastAccessed:[* TO NOW]
  • created:[1976-03-06T23:59:59.999Z TO *]
  • started:[1995-12-31T23:59:59.999Z TO 2007-03-06T00:00:00Z]
  • ended:[NOW-1YEAR/DAY TO NOW/DAY+1DAY]
  • created:[1976-03-06T23:59:59.999Z TO 1976-03-06T23:59:59.999Z+1YEAR]
  • lastAccessed:[1976-03-06T23:59:59.999Z/YEAR TO 1976-03-06T23:59:59.999Z]
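
Absolute timestamp ranges like the ones above can be generated from date values in code. The following Python sketch is illustrative only: the function names solr_date and date_range are hypothetical, and it assumes the millisecond-precision UTC form shown in the examples (for instance 2013-10-21T20:00:00.000Z). It does not cover date math expressions such as NOW-1YEAR/DAY.

```python
from datetime import datetime, timezone


def solr_date(dt):
    """Format a timezone-aware datetime as a UTC timestamp with milliseconds."""
    if dt.tzinfo is None:
        # A naive datetime would be silently interpreted as local time.
        raise ValueError("datetime must be timezone-aware")
    utc = dt.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"


def date_range(field, start=None, end=None):
    """Build field:[start TO end]; a bound left as None becomes the * wildcard."""
    lo = "*" if start is None else solr_date(start)
    hi = "*" if end is None else solr_date(end)
    return f"{field}:[{lo} TO {hi}]"


if __name__ == "__main__":
    begin = datetime(2013, 10, 21, 20, 0, tzinfo=timezone.utc)
    finish = datetime(2013, 10, 21, 21, 0, tzinfo=timezone.utc)
    print(date_range("started", begin, finish))
    print(date_range("created", start=begin))
```

Rejecting naive datetimes is a design choice of this sketch: it avoids ranges that shift depending on the time zone of the machine that builds the query.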

HDFS Properties

Name Type Description
fileSystemPath path The path to the entity.
compressed Boolean Indicates whether the entity is compressed.
deleted Boolean Indicates whether the entity has been moved to the Trash folder.
deleteTime date The time the entity was moved to the Trash folder.
mimeType ngramedText The MIME type of the entity.
parentPath string The path to the parent entity of a child entity. For example: parentPath:/default/sample_07 for the table sample_07 from the Hive database default.
permissions string The UNIX access permissions of the entity.
size long The exact size of the entity in bytes or a range of sizes. Range examples: size:[1000 TO *], size:[* TO 2000], and size:[* TO *] to find all entities with a size value.
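
The size ranges described above follow a single pattern in which either bound can be left open. A minimal Python sketch, assuming a hypothetical helper named size_range that is not part of Navigator:

```python
def size_range(min_bytes=None, max_bytes=None):
    """Build a size:[min TO max] clause; a bound left as None becomes *."""
    if min_bytes is not None and max_bytes is not None and min_bytes > max_bytes:
        raise ValueError("min_bytes must not exceed max_bytes")
    lo = "*" if min_bytes is None else str(int(min_bytes))
    hi = "*" if max_bytes is None else str(int(max_bytes))
    return f"size:[{lo} TO {hi}]"


if __name__ == "__main__":
    print(size_range(min_bytes=1000))
    print(size_range(max_bytes=2000))
    print(size_range())
```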

MAPREDUCE and YARN Properties

Name Type Description
inputRecursive Boolean Indicates whether files are searched recursively under the input directories or only files directly under the input directories are considered.
jobId ngramedText The ID of the job. For a job spawned by Oozie, the workflow ID.
mapper string The fully-qualified name of the mapper class.
outputKey string The fully-qualified name of the class of the output key.
outputValue string The fully-qualified name of the class of the output value.
reducer string The fully-qualified name of the reducer class.

OPERATION Properties

Name Type Description
Operation
inputFormat string The fully-qualified name of the class of the input format.
outputFormat string The fully-qualified name of the class of the output format.
Operation Execution
inputs string The name of the entity input to an operation execution. For entities of source type MAPREDUCE, it is usually a directory. For entities of source type HIVE, it is usually a table.
outputs string The name of the entity output from an operation execution. For entities of source type MAPREDUCE, it is usually a directory. For entities of source type HIVE, it is usually a table.

HIVE Properties

Name Type Description
Field
dataType ngramedText The type of data stored in a field (column).
Table
compressed Boolean Indicates whether a Hive table is compressed.
serDeLibName string The name of the library containing the SerDe class.
serDeName string The fully-qualified name of the SerDe class.
Partition
partitionColNames string The table columns that define the partition.
partitionColValues string The table column values that define the partition.

OOZIE Properties

Name Type Description
status string The status of the Oozie workflow: RUNNING, SUCCEEDED, or FAILED.

PIG Properties

Name Type Description
scriptId string The ID of the Pig script.

SQOOP Properties

Name Type Description
dbURL string The URL of the database from or to which the data was imported or exported.
dbTable string The table from or to which the data was imported or exported.
dbUser string The database user.
dbWhere string A where clause that identifies which rows were imported.
dbColumnExpression string An expression that identifies which columns were imported.