HDFS Administration Guide

Configuring Memory as Storage

Use the following steps to configure DataNode memory as storage:

1. Shut Down the DataNode

Shut down the DataNode using the applicable commands in the Controlling HDP Services Manually section of the HDP Reference Guide.

2. Mount a Portion of DataNode Memory for HDFS

To use DataNode memory as storage, you must first mount a portion of the DataNode memory for use by HDFS.

For example, use the following commands to allocate 2 GB of memory for HDFS storage:

sudo mkdir -p /mnt/hdfsramdisk
sudo mount -t tmpfs -o size=2048m tmpfs /mnt/hdfsramdisk
sudo mkdir -p /usr/lib/hadoop-hdfs
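
The tmpfs size in the example is given in megabytes, so 2 GB becomes size=2048m. As a minimal sketch, the same two RAM disk commands can be generated for another size. The helper name ramdisk_commands and its arguments are illustrative assumptions, not part of HDP, and the function only prints the commands so they can be reviewed before being run as root.

```shell
# Print (do not run) the commands that mount a tmpfs RAM disk of a given size.
# ramdisk_commands is a hypothetical helper, not an HDP tool.
ramdisk_commands() {
  size_gb="$1"                           # RAM disk size in GB
  mount_point="${2:-/mnt/hdfsramdisk}"   # defaults to the mount point used in this guide
  size_mb=$((size_gb * 1024))            # tmpfs size expressed in MB
  echo "sudo mkdir -p ${mount_point}"
  echo "sudo mount -t tmpfs -o size=${size_mb}m tmpfs ${mount_point}"
}

ramdisk_commands 2
```

After the mount command has been run, df -h /mnt/hdfsramdisk should report a tmpfs file system of the requested size.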

3. Assign the RAM_DISK Storage Type and Enable Short-Circuit Reads

Edit the following properties in the /etc/hadoop/conf/hdfs-site.xml file to assign the RAM_DISK storage type to DataNodes and enable short-circuit reads.

  • The dfs.data.dir property determines where on the local file system a DataNode stores its blocks. To mark a DataNode directory as RAM_DISK storage, insert [RAM_DISK] at the beginning of the local file system mount path and add the path to the dfs.data.dir property.

  • To enable short-circuit reads, set the value of dfs.client.read.shortcircuit to true.

For example:

 <property>
 <name>dfs.data.dir</name>
 <value>file:///grid/3/aa/hdfs/data/,[RAM_DISK]file:///mnt/hdfsramdisk/</value>
 </property>
 
 <property>
 <name>dfs.client.read.shortcircuit</name>
 <value>true</value>
 </property>
 
 <property>
 <name>dfs.domain.socket.path</name>
 <value>/var/lib/hadoop-hdfs/dn_socket</value>
 </property>
 
 <property>
 <name>dfs.checksum.type</name>
 <value>NULL</value>
 </property>
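
The data directory value is a comma-separated list in which only the RAM disk entry carries the [RAM_DISK] prefix. As a rough sketch, the value could be assembled as follows. The helper name data_dirs_value is a hypothetical illustration, not part of Hadoop, and the paths are the ones from the example above.

```shell
# Build the comma-separated data directory value: regular disk directories
# first, then the RAM disk mount tagged with [RAM_DISK].
# data_dirs_value is a hypothetical helper, not part of Hadoop.
data_dirs_value() {
  ramdisk="$1"; shift                  # first argument: RAM disk mount path
  value=""
  for dir in "$@"; do                  # remaining arguments: disk directories
    value="${value}file://${dir},"
  done
  echo "${value}[RAM_DISK]file://${ramdisk}"
}

data_dirs_value /mnt/hdfsramdisk/ /grid/3/aa/hdfs/data/
```

After editing hdfs-site.xml, a command such as hdfs getconf -confKey dfs.client.read.shortcircuit can be used on the node to check the value that the local configuration resolves to.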

4. Set the LAZY_PERSIST Storage Policy on Files or Directories

Set a storage policy on a file or a directory.

Command:

hdfs dfsadmin -setStoragePolicy <path> <policyName>

Arguments:

  • <path> - The path to a directory or file.

  • <policyName> - The name of the storage policy.

Example:

hdfs dfsadmin -setStoragePolicy /memory1 LAZY_PERSIST 

Get the storage policy of a file or a directory.

Command:

hdfs dfsadmin -getStoragePolicy <path>

Arguments:

  • <path> - The path to a directory or file.

Example:

hdfs dfsadmin -getStoragePolicy /memory1
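
Putting the two commands together, a sketch for applying the policy to several directories and then reading it back might look as follows. The directory /memory2, the function name apply_lazy_persist, and the HDFS_RUN variable are assumptions made for illustration. With HDFS_RUN left unset, the commands are only printed.

```shell
# Apply LAZY_PERSIST to each directory, then read the policy back.
# HDFS_RUN defaults to "echo" so the commands are printed, not executed;
# set HDFS_RUN="" on a cluster node to run them for real.
HDFS_RUN="${HDFS_RUN-echo}"

apply_lazy_persist() {
  for dir in "$@"; do
    $HDFS_RUN hdfs dfsadmin -setStoragePolicy "$dir" LAZY_PERSIST
    $HDFS_RUN hdfs dfsadmin -getStoragePolicy "$dir"
  done
}

# /memory1 comes from the example above; /memory2 is hypothetical.
apply_lazy_persist /memory1 /memory2
```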

5. Start the DataNode

Start the DataNode using the applicable commands in the Controlling HDP Services Manually section of the HDP Reference Guide.

Using Mover to Apply Storage Policies

When you update a storage policy setting on a file or directory, the new policy is not automatically enforced. You must use the HDFS mover data migration tool to actually move blocks as specified by the new storage policy.

The mover data migration tool scans the specified files in HDFS and checks whether the block placement satisfies the storage policy. For blocks that violate the storage policy, it moves the replicas to the applicable storage type to fulfill the storage policy requirements.

Command:

hdfs mover [-p <files/dirs> | -f <local file name>] 

Arguments:

  • -p <files/dirs> - Specify a space-separated list of HDFS files/directories to migrate.

  • -f <local file name> - Specify a local file containing a list of HDFS files/directories to migrate.

Note

When both -p and -f options are omitted, the default path is the root directory.
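
As an illustration of the two options, the sketch below writes a list of HDFS paths to a local file and prints the corresponding mover invocations. The paths /memory1 and /memory2 and the temporary list file are assumptions for illustration, as is the one-path-per-line layout of the list file. The mover commands are echoed rather than executed.

```shell
# Write the HDFS paths to migrate into a local list file, one per line
# (assumed layout), then print the mover invocations that would use them.
# /memory1, /memory2, and the temporary file are illustrative.
list_file="$(mktemp)"
printf '%s\n' /memory1 /memory2 > "$list_file"

echo "hdfs mover -f $list_file"          # paths read from the local list file
echo "hdfs mover -p /memory1 /memory2"   # same paths passed directly
```

Passing the paths explicitly keeps the scan limited to the directories whose policy changed, rather than defaulting to the root directory.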