Configuration Properties
Use the NameNode and data node properties to configure the NameNode and data nodes.
- dfs.namenode.name.dir
-
Specifies where on the local filesystem the DFS name node stores the name table (fsimage). If this is a comma-delimited list of directories, the name table is replicated in all of the directories for redundancy.
- dfs.namenode.edits.dir
-
Specifies where on the local filesystem the DFS name node stores the transaction (edits) file. If this is a comma-delimited list of directories, the transaction file is replicated in all of the directories for redundancy. The default value is the same as the value of dfs.namenode.name.dir.
- dfs.namenode.checkpoint.period
-
Specifies the number of seconds between two periodic checkpoints.
- dfs.namenode.checkpoint.txns
-
The standby creates a checkpoint of the namespace every dfs.namenode.checkpoint.txns transactions, regardless of whether dfs.namenode.checkpoint.period has expired.
- dfs.namenode.checkpoint.check.period
-
Specifies how frequently to query for the number of un-checkpointed transactions.
- dfs.namenode.num.checkpoints.retained
-
Specifies the number of image checkpoint files to be retained in storage directories. All edit logs necessary to recover an up-to-date namespace from the oldest retained checkpoint are also retained.
- dfs.namenode.num.extra.edits.retained
-
Specifies the number of extra transactions that are retained beyond what is minimally necessary for a NameNode restart. This can be useful for audit purposes, or for an HA setup where a remote standby node might have been offline and needs a longer backlog of retained edits to start again.
- dfs.namenode.edit.log.autoroll.multiplier.threshold
-
Specifies when an active namenode rolls its own edit log. The actual threshold (in number of edits) is determined by multiplying this value by dfs.namenode.checkpoint.txns. This prevents extremely large edit files from accumulating on the active namenode, which can cause timeouts during namenode start-up and pose an administrative hassle. This behavior is intended as a fail-safe for when the standby fails to roll the edit log by the normal checkpoint threshold.
- dfs.namenode.edit.log.autoroll.check.interval.ms
-
Specifies the interval, in milliseconds, at which an active namenode checks whether it needs to roll its edit log.
- dfs.datanode.data.dir
-
Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data is stored in all named directories, typically on different devices. Directories that do not exist are ignored. Heterogeneous storage allows specifying that each directory resides on a different type of storage: DISK, SSD, ARCHIVE, or RAM_DISK.
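To make the shape of a dfs.datanode.data.dir value more concrete, the following is a minimal Python sketch that splits a comma-delimited value into storage type and location pairs. It assumes the storage type is written as a bracketed prefix such as [SSD] in front of each directory and that an untagged directory is treated as DISK; neither detail is stated above, so check both against your Hadoop release. The function name parse_data_dirs and the paths are hypothetical.

```python
import re

# Storage types named in the dfs.datanode.data.dir description.
STORAGE_TYPES = {"DISK", "SSD", "ARCHIVE", "RAM_DISK"}


def parse_data_dirs(value, default_type="DISK"):
    """Split a dfs.datanode.data.dir value into (storage_type, location) pairs.

    Assumption: a storage type is given as a bracketed prefix, e.g. "[SSD]/path",
    and entries without a prefix fall back to default_type.
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty fragments such as a trailing comma
        match = re.match(r"^\[(\w+)\](.+)$", item)
        if match:
            storage_type = match.group(1).upper()
            location = match.group(2).strip()
        else:
            storage_type, location = default_type, item
        if storage_type not in STORAGE_TYPES:
            raise ValueError(f"unknown storage type: {storage_type}")
        entries.append((storage_type, location))
    return entries


# Hypothetical paths, for illustration only.
example = "[DISK]file:///data/1/dn,[SSD]file:///data/2/dn,/data/3/dn"
for storage_type, location in parse_data_dirs(example):
    print(storage_type, location)
```

A check like this can be run against a proposed value before it is deployed, so that a misspelled storage type is caught early. It does not verify that the directories exist.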
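The interaction between the checkpoint and edit log roll properties can be sketched in a few lines of Python. This is an illustration of the rules as described above, not the NameNode's actual code: a checkpoint is due when either dfs.namenode.checkpoint.txns or dfs.namenode.checkpoint.period is reached, and the active NameNode's roll threshold is dfs.namenode.edit.log.autoroll.multiplier.threshold multiplied by dfs.namenode.checkpoint.txns. The function names and the numeric values are assumptions chosen for the example and are not necessarily the defaults of any release.

```python
def should_checkpoint(uncheckpointed_txns, seconds_since_checkpoint,
                      checkpoint_txns, checkpoint_period):
    """Return True if either the transaction count or the period is reached."""
    return (uncheckpointed_txns >= checkpoint_txns
            or seconds_since_checkpoint >= checkpoint_period)


def autoroll_threshold(multiplier, checkpoint_txns):
    """Number of edits after which the active NameNode rolls its own edit log."""
    return int(multiplier * checkpoint_txns)


# Illustrative values only.
checkpoint_txns = 1_000_000   # dfs.namenode.checkpoint.txns
checkpoint_period = 3600      # dfs.namenode.checkpoint.period, in seconds
multiplier = 2.0              # dfs.namenode.edit.log.autoroll.multiplier.threshold

# Transaction threshold reached although the period has not expired.
print(should_checkpoint(1_200_000, 600, checkpoint_txns, checkpoint_period))
# Neither threshold reached.
print(should_checkpoint(10_000, 600, checkpoint_txns, checkpoint_period))
# Edit count at which the active NameNode would roll its own log.
print(autoroll_threshold(multiplier, checkpoint_txns))
```

With these values the active NameNode would roll its own edit log only after twice the normal checkpoint transaction count, which matches the fail-safe role described for the autoroll properties.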