HDFS entity metadata migration

HDFS metadata entities are migrated from Navigator to Atlas when they appear in a lineage relationship from Hive, Impala, or Spark processes.

The following sections describe how metadata is mapped from Navigator to Atlas; if Atlas requires metadata that wasn't available in Navigator, the migration notes describe how the Atlas metadata values are generated.

Migrated entities include: For entity metadata that is common to all entities, see System metadata migration.

HDFS Directory

Navigator fselement entities of type=DIRECTORY are migrated to Atlas hdfs_path entities with the isFile attribute set to false.

Navigator Metadata Atlas Metadata Migration Notes
blockSize Null for directories.
created attributes.createTime Converted to date type.
ezKeyName Null for directories.
fileSystemPath attributes.path
group attributes.group
lastAccessed attributes.modifiedTime Converted to date type.
mimeType Null for directories.
owner attributes.owner
permissions attributes.posixPermissions Converted to Atlas values.
replication attributes.numberOfReplicas Null for directories.
size attributes.fileSize Null for directories.
type attributes.isFile The Navigator type=DIRECTORY property is converted to isFile=FALSE.
attributes.isSymLink Defaults to FALSE.
attributes.nameServiceId Optional in Atlas.
attributes.qualifiedName Generated as a string in the format <path>@clustername.

HDFS File

Navigator fselement entities of type=FILE are migrated to Atlas hdfs_path entities with the isFile attribute set to true.

Navigator Metadata Atlas Metadata Migration Notes
blockSize Added to the Atlas entity custom attributes as a key value pair with the Navigator name as the key.
created attributes.createTime Converted to date type.
ezKeyName Added to the Atlas entity custom attributes as a key value pair with the Navigator name as the key.
fileSystemPath attributes.path
group attributes.group
lastAccessed attributes.modifiedTime Converted to date type.
mimeType Added to the Atlas entity custom attributes as a key value pair with the Navigator name as the key.
owner attributes.owner
parentPath attributes.extendedAttributes
permissions attributes.posixPermissions Converted to Atlas values.
replication attributes.numberOfReplicas
size attributes.fileSize
type attributes.isFile The Navigator type=FILE property is converted to isFile=TRUE.
attributes.isSymLink Defaults to FALSE.
attributes.nameServiceId Optional in Atlas.
attributes.qualifiedName Generated as a string in the format <path>@clustername.