Scaling Namespaces and Optimizing Data Storage
Also available as:
PDF
loading table of contents...

Configure DataNode memory as storage

Configuring memory on a DataNode as storage requires you to shut down the particular DataNode, set RAM_DISK as the storage type, set the LAZY_PERSIST storage policy to store data, and then start the DataNode.

  1. Shut down the DataNode.
  2. Use required mount commands to allocate a certain portion of the DataNode memory as storage.
    The following example shows how you can allocate 2GB memory for use by HDFS.
    
    sudo mkdir -p /mnt/hdfsramdisk
    sudo mount -t tmpfs -o size=2048m tmpfs /mnt/hdfsramdisk
    sudo mkdir -p /usr/lib/hadoop-hdfs
  3. Assign the RAM_DISK storage type to ensure that HDFS can assign data to the DataNode memory configured as storage.
    To specify the DataNode as RAM_DISK storage, insert [RAM_DISK] at the beginning of the local file system mount path and add it to the dfs.name.dir property in hdfs-default.xml.
    The following example shows the updated mount path values for dfs.datanode.data.dir
    
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:///grid/3/aa/hdfs/data/,[RAM_DISK]file:///mnt/hdfsramdisk/</value>
    </property>
    
  4. Set the LAZY_PERSIST storage policy to store data on the configured DataNode memory.
    The following example shows how you can use the hdfs dfsadmin -getStoragepolicy command to configure the LAZY_PERSIST storage policy:
    hdfs dfsadmin -getStoragePolicy /memory1 LAZY_PERSIST 
    Note
    Note
    When you update a storage policy setting on a file or directory, the new policy is not automatically enforced. You must use the HDFS mover data migration tool to actually move blocks as specified by the new storage policy.
  5. Start the DataNode.
  6. Use the HDFS mover tool to move data blocks according to the specified storage policy.
    The HDFS mover data migration tool scans the specified files in HDFS and verifies if the block placement satisfies the storage policy. For the blocks that violate the storage policy, the tool moves the replicas to a different storage type in order to fulfill the storage policy requirements.