JBOD Operational Procedures
Monitoring
- Replication Status
- Monitor replication status using Cloudera Manager Health Tests. Cloudera Manager automatically and continuously monitors both the OfflineLogDirectoryCount and OfflineReplicaCount metrics. Alerts are raised when failures are detected. For more information, see Cloudera Manager Health Tests.
- Disk Capacity
- Monitor free space on mounted disks and the number of open file descriptors. For more information, see Useful Shell Command Reference. If a disk is running low on space, reassign partitions or move log files to another disk. For more information, see kafka-reassign-partitions.
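The free-space check described above can be scripted. The following is a minimal sketch, not a Cloudera-provided tool: the function name `low_space_dirs` and the 15% threshold are illustrative assumptions, and the sketch covers free space only, not open file descriptors.

```python
import shutil


def low_space_dirs(log_dirs, min_free_fraction=0.15):
    """Return (path, free_fraction) pairs for directories below the threshold.

    log_dirs: iterable of mount points or Kafka log directories (assumed to exist).
    min_free_fraction: hypothetical alert threshold, expressed as a fraction of total space.
    """
    flagged = []
    for path in log_dirs:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
        if usage.total == 0:
            continue  # skip pseudo file systems that report no capacity
        free_fraction = usage.free / usage.total
        if free_fraction < min_free_fraction:
            flagged.append((path, round(free_fraction, 3)))
    return flagged
```

Calling `low_space_dirs` with the broker's configured log directories returns the disks that are candidates for partition reassignment. An empty list means every directory is above the threshold.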
Handling Disk Failures
- In Cloudera Manager, go to the Kafka service, select and select the broker.
- Go to .
- Resolve the failure in one of the following ways:
- Replace the faulty disk with a new one. See Disk Replacement.
- Remove the disk and redistribute data across remaining disks to restore the desired replication factor. See Disk Removal.
Disk Replacement
- Stop the broker that has a faulty disk.
- In Cloudera Manager, go to the Kafka service, select and select the broker.
- Go to .
- Replace the disk.
- Mount the disk.
- Set up the directory structure on the new disk the same way as it was set up on the previous disk.
- Start the broker.
- In Cloudera Manager, go to the Kafka service, select and select the broker.
- Go to .
The Kafka broker re-creates topic partitions in the same directory by replicating data from other brokers.
Disk Removal
- Stop the broker that has a faulty disk.
- In Cloudera Manager, go to the Kafka service, select and select the broker.
- Go to .
- Remove the log directories on the faulty disk from the broker.
- Go to and find the property.
- Remove the affected log directories with the Remove button.
- Enter a Reason for change, and then click Save Changes to commit the changes.
- Start the broker.
- In Cloudera Manager, go to the Kafka service, select and select the broker.
- Go to .
The Kafka broker redistributes data across the cluster.
Reassigning Replicas Between Log Directories
Reassigning replicas between log directories is useful when a broker has multiple disks and one or more of them is nearing capacity. Moving a replica from a nearly full disk to one with free space helps prevent the service from going down because a disk has reached capacity. To balance storage load, the Kafka administrator must continuously monitor the system and reassign replicas between log directories on the same broker or across different brokers. These actions can be carried out with the kafka-reassign-partitions tool.
For more information on tool usage, see the documentation for the kafka-reassign-partitions tool.
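As an illustration, the reassignment plan that kafka-reassign-partitions reads is a JSON file in which each partition entry can carry a `log_dirs` list alongside `replicas`. The sketch below builds such a plan. The helper name `build_log_dir_move`, the topic name, the broker IDs, and the directory path are all hypothetical. The value `any` is understood to let the broker choose the directory for that replica.

```python
import json


def build_log_dir_move(topic, partition, replicas, log_dirs):
    """Build a reassignment plan that pins replicas to specific log directories.

    replicas and log_dirs are matched by position, so they must be the same length.
    """
    if len(replicas) != len(log_dirs):
        raise ValueError("replicas and log_dirs must have the same length")
    return {
        "version": 1,
        "partitions": [
            {
                "topic": topic,
                "partition": partition,
                "replicas": list(replicas),
                "log_dirs": list(log_dirs),
            }
        ],
    }


# Hypothetical example: place the replica on broker 1 in /data/2/kafka,
# and let broker 2 pick its own directory.
plan = build_log_dir_move("example-topic", 0, [1, 2], ["/data/2/kafka", "any"])
print(json.dumps(plan, indent=2))
```

The printed JSON can be saved to a file and passed to the tool with its `--reassignment-json-file` option. The remaining options depend on the Kafka version in use, so check the tool documentation before running it.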
Retrieving Log Directory Replica Assignment Information
To optimize replica assignment across log directories, you need the list of partitions in each log directory and the size of each partition. The kafka-log-dirs tool provides this information.
For more information on tool usage, see the documentation for the kafka-log-dirs tool.
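The tool reports its results as JSON, which can be summarized with a short script. The sketch below assumes the report has a `brokers` list whose entries contain `logDirs`, each with a `partitions` list of `partition` and `size` fields. Verify that layout against the output of your Kafka version. The function name and the sample data are illustrative only.

```python
import json


def partition_sizes_by_log_dir(describe_json):
    """Map (broker_id, log_dir) to a dict of partition name -> size in bytes."""
    report = json.loads(describe_json)
    sizes = {}
    for broker in report.get("brokers", []):
        for log_dir in broker.get("logDirs", []):
            key = (broker["broker"], log_dir["logDir"])
            sizes[key] = {
                p["partition"]: p["size"] for p in log_dir.get("partitions", [])
            }
    return sizes


# Hypothetical sample in the assumed output layout.
sample = """
{"version": 1, "brokers": [{"broker": 1, "logDirs": [
  {"logDir": "/data/1/kafka", "error": null, "partitions": [
    {"partition": "example-topic-0", "size": 1024, "offsetLag": 0, "isFuture": false},
    {"partition": "example-topic-1", "size": 2048, "offsetLag": 0, "isFuture": false}
  ]},
  {"logDir": "/data/2/kafka", "error": null, "partitions": []}
]}]}
"""

sizes = partition_sizes_by_log_dir(sample)
# Total bytes used per log directory, useful for spotting unbalanced disks.
totals = {key: sum(parts.values()) for key, parts in sizes.items()}
```

In this sample, `/data/1/kafka` holds both partitions and `/data/2/kafka` is empty, so a partition on the first directory would be a natural candidate to move.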