Limitation for Spark History Server with high availability
You must be aware of a limitation related to how Spark History Server with high availability operates in Cloudera Manager.
- The second Spark History Server in the cluster does not clean the Spark event logs. The Custom Service
Descriptor automatically disables event log cleaning on that server, so the second server can
only read logs. This limitation ensures that two Spark History Servers do not try to delete the same files.
If the first Spark History Server is down, the second one does not take over the cleaner task. This is not
a critical issue because when the first Spark History Server starts again, it deletes those old Spark
event logs. The default event log cleaner interval (spark.history.fs.cleaner.interval) is 1 day in
Cloudera Manager, which means that the first Spark History Server deletes old logs only once per day by default.
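
To check which server cleans event logs and how often, you can compare the cleaner properties of the two servers. The following Python sketch is illustrative only. It assumes that the second server's effective configuration carries spark.history.fs.cleaner.enabled=false (this page does not state how the Custom Service Descriptor disables cleaning), and the helper names parse_duration and cleaner_summary and the two property dictionaries are hypothetical.

```python
import re

# Seconds per unit for the time suffixes handled by this sketch.
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "min": 60, "h": 3600, "d": 86400}


def parse_duration(value):
    """Convert a Spark-style time string such as '1d' or '12h' to seconds.

    Assumption: a bare number is interpreted as seconds.
    """
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]*)\s*", value.lower())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    number, unit = match.groups()
    if unit == "":
        return float(number)
    if unit not in _UNITS:
        raise ValueError(f"unrecognized time unit: {unit!r}")
    return int(number) * _UNITS[unit]


def cleaner_summary(props):
    """Describe the event log cleaner behavior implied by a property map."""
    enabled = props.get("spark.history.fs.cleaner.enabled", "false").lower() == "true"
    if not enabled:
        return "cleaner disabled (read-only)"
    # The interval is 1 day by default, as noted above.
    seconds = parse_duration(props.get("spark.history.fs.cleaner.interval", "1d"))
    return f"cleaner runs every {seconds / 3600:g} hours"


# Hypothetical effective properties for the two servers.
first_server = {
    "spark.history.fs.cleaner.enabled": "true",
    "spark.history.fs.cleaner.interval": "1d",
}
second_server = {
    # Assumed representation of "cleaning disabled" on the second server.
    "spark.history.fs.cleaner.enabled": "false",
}

print("first: ", cleaner_summary(first_server))
print("second:", cleaner_summary(second_server))
```

If the first server is down for longer than the cleaner interval, expect old event logs to accumulate until it restarts and runs the cleaner again.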