Configuring high availability for SHS with an internal load balancer
Learn how to configure high availability for Spark History Server (SHS) using an internal load balancer. The internal load balancer authenticates users with a username and password through Apache Knox Gateway. The Cloudera Distributed Hadoop (CDH) stack includes Apache Knox Gateway, which has a built-in load balancer and failover mechanism.
The following Knox topology configuration is automatically generated if there are two Spark History Server instances in a Cloudera Manager cluster:
knox.example.com:/var/lib/knox/gateway/conf/topologies/cdp-proxy.xml
<service>
  <role>SPARK3HISTORYUI</role>
  <url>https://shs1.example.com:18489</url>
  <url>https://shs2.example.com:18489</url>
</service>
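To confirm that the generated topology lists both Spark History Servers, the file can be inspected with a short script. The following is a minimal sketch, not a Cloudera-provided tool: the function name `shs_backend_urls` is hypothetical, and the sample string wraps the fragment above in a `<topology>` root element as an assumption about the surrounding file.

```python
import xml.etree.ElementTree as ET

# Sample input: the service fragment shown above, wrapped in a <topology>
# root element (an assumption made so the sample parses on its own).
SAMPLE_TOPOLOGY = """\
<topology>
  <service>
    <role>SPARK3HISTORYUI</role>
    <url>https://shs1.example.com:18489</url>
    <url>https://shs2.example.com:18489</url>
  </service>
</topology>
"""


def shs_backend_urls(topology_xml, role="SPARK3HISTORYUI"):
    """Return the backend URLs of the first <service> whose <role> matches."""
    root = ET.fromstring(topology_xml)
    for service in root.iter("service"):
        if (service.findtext("role") or "").strip() == role:
            return [u.text.strip() for u in service.findall("url") if u.text]
    return []  # no service with the requested role


urls = shs_backend_urls(SAMPLE_TOPOLOGY)
print(urls)
if len(urls) < 2:
    print("Warning: fewer than two backends listed, so failover is not possible")
```

To check a real deployment, read the contents of the topology file on the Knox host and pass them to `shs_backend_urls` in place of the sample string.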
To use the Knox load balancing feature, you must use the Knox Gateway URL. If one of the Spark History Servers is down, the connection is automatically redirected to the other server. See the example Knox Gateway URL below:
Spark3: https://knox.example.com:8443/gateway/cdp-proxy/spark3history/
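A client script can build the gateway URL and attach the username and password as follows. This is a rough sketch under stated assumptions: the helper names `knox_shs_url` and `build_request` are hypothetical, the credentials are placeholders, and it assumes the topology accepts HTTP Basic credentials. A topology configured for form-based single sign-on may instead redirect the request to a login page.

```python
import base64
import urllib.request


def knox_shs_url(knox_host, topology="cdp-proxy", port=8443):
    """Build the Knox Gateway URL for the Spark 3 History Server UI."""
    return f"https://{knox_host}:{port}/gateway/{topology}/spark3history/"


def build_request(url, username, password):
    """Create a request carrying an HTTP Basic Authorization header.

    Assumption: the Knox topology accepts Basic credentials.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


url = knox_shs_url("knox.example.com")
request = build_request(url, "alice", "secret")  # placeholder credentials
print(request.full_url)

# Sending the request is left commented out because it needs a reachable
# gateway and a trusted TLS certificate:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```

Because the request targets the gateway rather than an individual Spark History Server, the client does not need to know which backend is currently serving traffic.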