Using Spark History Servers with high availability
You can configure the load balancer for Spark History Server (SHS) to ensure high availability, so that users can access and use the Spark History Server UI without any disruption. Learn how you can configure the load balancer for SHS and the limitations associated with it.
You can access the Spark History Server for your Spark cluster from the Cloudera Data
Platform (CDP) Management Console interface. The Spark History Server (SHS) has two main
functions:
- Reads the Spark event logs from storage and displays them in the Spark History Server user interface.
- Cleans up old Spark event log files.
There are three supported ways to configure the load balancer for Spark History Server:
- Using an external load balancer, for example, HAProxy.
- Using an internal load balancer, which requires Apache Knox Gateway.
- Using multiple Apache Knox Gateways and external load balancers, for example, HAProxy.
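As an illustration of the first option, the following is a minimal sketch of an HAProxy configuration that balances requests across two Spark History Server instances. It is not a Cloudera-provided configuration. The host names `shs-host-1.example.com` and `shs-host-2.example.com` are hypothetical, the port `18088` is an assumption that you should replace with the port your SHS instances actually listen on, and the health check assumes that the Spark REST API endpoint `/api/v1/version` is reachable without authentication. A TLS-enabled or Kerberos-enabled cluster needs additional settings that are not shown here.

```
# Sketch only: host names and ports below are placeholders.
frontend shs_frontend
    mode http
    bind *:18088
    default_backend shs_backend

backend shs_backend
    mode http
    balance roundrobin
    # Mark an instance as down when the Spark REST API stops responding.
    option httpchk GET /api/v1/version
    server shs1 shs-host-1.example.com:18088 check
    server shs2 shs-host-2.example.com:18088 check
```

With a configuration along these lines, users open the Spark History Server UI through the load balancer address, and HAProxy stops routing to an instance whose health check fails.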