Recover from full disks
By default, Kudu reserves a small amount of space, 1% by capacity, in its directories. Kudu considers a disk full if there is less free space available than the reservation. Kudu nodes can only tolerate running out of space on disks on which certain Kudu directories are mounted.
The following table describes this behavior for each type of directory. The behavior is uniform across masters and tablet servers.
Kudu Directory Type | Crash on Full Disk? |
---|---|
Directory containing WALs | Yes |
Directory containing tablet metadata | Yes |
Directory containing data blocks only | No (see below) |
Prior to Kudu 1.7.0, Kudu stripes tablet data across all directories, and will avoid writing data to full directories. Kudu will crash if all data directories are full.
In 1.7.0 and later, new tablets are assigned a disk group consisting of data
directories. The number of data directories are as specified by the
-fs_target_data_dirs_per_tablet
flag with the default being
3
. If Kudu is not configured with enough data directories for a full
disk group, all data directories are used. When a data directory is full, Kudu will
stop writing new data to it and each tablet that uses that data directory will write
new data to other data directories within its group. If all data directories for a
tablet are full, Kudu will crash. Periodically, Kudu will check if full data
directories are still full, and will resume writing to those data directories if space
has become available.
If Kudu does crash because its data directories are full, freeing space on the full directories will allow the affected daemon to restart and resume writing. Note that it may be possible for Kudu to free some space by running:
$ sudo -u kudu kudu fs check --repair
However, the above command may also fail if there is too little space left.
It is also possible to allocate additional data directories to Kudu in order to increase the overall amount of storage available. Note that existing tablets will not use new data directories, so adding a new data directory does not resolve issues with full disks.