Load balancing for Apache Knox

Knox offers load balancing using a simple round robin algorithm, which prevents load from concentrating on a single node. The behavior depends on whether a service is stateless or stateful:

  • For stateless services, Knox distributes requests across hosts using a simple round robin algorithm, so that no single node carries all of the load.
  • For stateful services (that is, services that require sessions, such as Ranger and Hive), Knox load balances sessions using a round robin algorithm: each new session uses a different host, and all requests in the same session are routed to the same host. This continues until the session terminates or a failover occurs.
  • In case of failover, stateful services return an HTTP 502 error response.
This behavior is configurable and can be changed by tuning the flags in the Knox HA provider for the respective services.
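
The routing rules described above can be illustrated with a small simulation. This is a minimal sketch and not Knox code: the class name RoundRobinBalancer, the BadGateway exception (standing in for an HTTP 502 response), and the choice to drop a session once its pinned host fails are all assumptions made for illustration.

```python
class BadGateway(Exception):
    """Stands in for an HTTP 502 response (hypothetical name)."""


class RoundRobinBalancer:
    """Toy model of round robin routing with optional sticky sessions."""

    def __init__(self, hosts, sticky=False):
        self.hosts = list(hosts)
        self.sticky = sticky
        self.sessions = {}   # session id -> pinned host
        self.down = set()    # hosts currently marked as failed
        self._next = 0       # round robin pointer

    def _pick(self):
        # Try each host at most once, starting at the round robin pointer.
        for _ in range(len(self.hosts)):
            host = self.hosts[self._next]
            self._next = (self._next + 1) % len(self.hosts)
            if host not in self.down:
                return host
        raise BadGateway("no healthy host available")

    def route(self, session_id=None):
        # Stateless case: every request simply goes to the next host.
        if not self.sticky or session_id is None:
            return self._pick()
        host = self.sessions.get(session_id)
        if host is None:
            # New session: pin it to the next host in the rotation.
            host = self._pick()
            self.sessions[session_id] = host
        elif host in self.down:
            # Pinned host failed: report an error instead of silently
            # moving the session (assumption: the session is then dropped).
            del self.sessions[session_id]
            raise BadGateway("host for session %s is down" % session_id)
        return host
```

With three hosts and no sticky sessions, four consecutive calls to route() cycle through the first, second, and third host and then wrap around to the first. With sticky sessions enabled, repeated calls with the same session ID keep returning the same host until that host is added to the down set, at which point the call raises BadGateway.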

Load balancing vs high availability (HA)

Currently, Knox offers load balancing using a simple round robin algorithm, which prevents load from concentrating on a single node.

Because session persistence is not supported, this is not true HA: there can be cases where a stateful service does not fail over to another node.

Supported services

The following services support Knox load balancing in the Public cloud:
  • Hive
  • Phoenix
  • Ranger
  • Solr

Default enabled values

The following values are enabled by default in the Knox topologies. API settings are located in cdp-proxy-api.xml; UI settings are located in cdp-proxy.xml.
  • Hive
    • API: enableStickySession=true;noFallback=true;enableLoadBalancing=true
  • Phoenix
    • API: enableStickySession=true;noFallback=true;enableLoadBalancing=true
  • Ranger
    • API: enableStickySession=false;noFallback=false;enableLoadBalancing=true
    • UI: enableStickySession=true;noFallback=true;enableLoadBalancing=true
  • Solr
    • API: enableStickySession=false;noFallback=false;enableLoadBalancing=true
    • UI: enableStickySession=true;noFallback=true;enableLoadBalancing=true
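
Each default above is a semicolon-separated list of name=value flags. When auditing or comparing topologies, it can help to turn such a string into a dictionary. The helper below is a minimal sketch for that purpose: parse_ha_params is a hypothetical name and not part of Knox, and it assumes every flag is a boolean, which holds for the values listed here but may not hold for other HA provider parameters.

```python
def parse_ha_params(value):
    """Parse a 'name=value;name=value' flag string into a dict of booleans.

    Hypothetical helper for illustration; assumes every flag is true/false.
    """
    params = {}
    for pair in value.split(";"):
        if not pair.strip():
            continue  # tolerate a trailing or doubled semicolon
        name, _, raw = pair.partition("=")
        params[name.strip()] = raw.strip().lower() == "true"
    return params


# Defaults copied from the list above.
hive_api = parse_ha_params(
    "enableStickySession=true;noFallback=true;enableLoadBalancing=true"
)
ranger_api = parse_ha_params(
    "enableStickySession=false;noFallback=false;enableLoadBalancing=true"
)
```

Here hive_api has all three flags set to True, while ranger_api has only enableLoadBalancing set to True, which makes the difference between the two API defaults easy to check programmatically.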