Configure DataNode memory as storage

To configure memory on a DataNode as storage, you shut down the DataNode, mount a portion of its memory as a RAM disk, assign the RAM_DISK storage type to that mount, set the LAZY_PERSIST storage policy on the data to be stored in memory, and then start the DataNode.

  1. Shut down the DataNode.
  2. Use the mount command to allocate a portion of the DataNode memory as storage.
    The following example shows how you can allocate 2 GB of memory as a tmpfs mount for use by HDFS:
    
    sudo mkdir -p /mnt/hdfsramdisk
    sudo mount -t tmpfs -o size=2048m tmpfs /mnt/hdfsramdisk
    sudo mkdir -p /usr/lib/hadoop-hdfs
  3. Assign the RAM_DISK storage type so that HDFS can place data on the DataNode memory configured as storage.
    To mark the mount as RAM_DISK storage, insert [RAM_DISK] at the beginning of the local file system mount path and add the path to the dfs.datanode.data.dir property in hdfs-site.xml.
    The following example shows the updated value of dfs.datanode.data.dir:
    
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:///grid/3/aa/hdfs/data/,[RAM_DISK]file:///mnt/hdfsramdisk/</value>
    </property>
    
  4. Set the LAZY_PERSIST storage policy on the HDFS path whose data should be stored in the configured DataNode memory.
    The following example shows how you can use the hdfs storagepolicies -setStoragePolicy command to set the LAZY_PERSIST storage policy on the /memory1 directory:
    hdfs storagepolicies -setStoragePolicy -path /memory1 -policy LAZY_PERSIST
  5. Start the DataNode.
  6. Use the HDFS mover tool to move data blocks according to the specified storage policy.
    The HDFS mover data migration tool scans the specified files in HDFS and checks whether the block placement satisfies the storage policy. For blocks that violate the storage policy, the tool moves the replicas to a different storage type to satisfy the policy.
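
The checks and the mover run described in the steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper functions ram_disk_entry and is_tmpfs_mount are hypothetical names introduced here, the paths /mnt/hdfsramdisk and /memory1 are taken from the earlier examples, the mount check reads /proc/mounts and therefore assumes Linux, and the hdfs commands are run only if the hdfs client is found on the PATH.

```shell
#!/bin/sh
# Sketch only: paths come from the examples above; helper names are hypothetical.
RAMDISK_DIR=/mnt/hdfsramdisk   # tmpfs mount point from step 2
HDFS_PATH=/memory1             # HDFS directory that carries the LAZY_PERSIST policy

# Print the dfs.datanode.data.dir entry that tags a local directory as RAM_DISK.
ram_disk_entry() {
    printf '[RAM_DISK]file://%s/\n' "${1%/}"
}

# Succeed only if the directory is listed as a tmpfs mount in /proc/mounts (Linux).
is_tmpfs_mount() {
    awk -v dir="${1%/}" '$2 == dir && $3 == "tmpfs" { found = 1 } END { exit !found }' /proc/mounts
}

if is_tmpfs_mount "$RAMDISK_DIR"; then
    echo "$RAMDISK_DIR is mounted as tmpfs"
else
    echo "$RAMDISK_DIR is not a tmpfs mount; repeat step 2" >&2
fi

echo "Expected dfs.datanode.data.dir entry: $(ram_disk_entry "$RAMDISK_DIR")"

# Run the HDFS commands only where the hdfs client is installed.
if command -v hdfs >/dev/null 2>&1; then
    # Show the storage policy currently set on the path.
    hdfs storagepolicies -getStoragePolicy -path "$HDFS_PATH"
    # Move existing blocks under the path so that placement matches the policy.
    hdfs mover -p "$HDFS_PATH"
    # List files, blocks, and block locations so that placement can be inspected.
    hdfs fsck "$HDFS_PATH" -files -blocks -locations
fi
```

Run the script on the DataNode host after step 5, as a user with permission to run HDFS administrative commands. The mount check is included because a tmpfs mount created with the mount command alone does not persist across a reboot.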