Enabling Ring Fencing in Cloudera AI Workbench

The Ring Fencing feature is available in Cloudera AI 1.5.5 SP1 and higher releases. Ring fencing ensures that Cloudera AI infrastructure pods are scheduled exclusively on designated Cloudera AI nodes within the Kubernetes cluster.

In on-premises environments, Cloudera AI infrastructure services share cluster resources with Kubernetes system components, other platform services, and user workloads. When CPU bursting is enabled, resource-intensive user or system workloads can consume additional resources, potentially preempting Cloudera AI services and making the application temporarily inaccessible.

Ring fencing prevents this contention by isolating Cloudera AI infrastructure pods on dedicated nodes. The isolation is implemented through the following mechanisms:

  • Kubernetes taints and tolerations – used to prevent non-Cloudera AI workloads from being scheduled on dedicated nodes.

  • Node affinity – used to ensure Cloudera AI pods are scheduled only on the intended nodes.

Together, these mechanisms keep other workloads off the dedicated nodes and keep Cloudera AI infrastructure workloads on them.
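
As an illustration of how the two mechanisms combine in a pod specification, the following sketch prints a Pod manifest that tolerates the cml-infra taint and requires the cml-infra-node label, both of which are applied to the nodes later in this topic. The pod name and image are hypothetical placeholders, and the manifest is not taken from Cloudera AI itself; it only shows the standard Kubernetes fields involved.

```shell
#!/bin/sh
set -eu

# Prints an illustrative Pod manifest on standard output.
# The pod name and image are hypothetical; the taint key/value and the node
# label are the ones used by the kubectl commands in this topic.
render_example_pod() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ring-fencing-example
spec:
  # Toleration: allows this pod onto nodes tainted cml-infra=true:NoSchedule.
  tolerations:
    - key: cml-infra
      operator: Equal
      value: "true"
      effect: NoSchedule
  # Node affinity: restricts this pod to nodes labeled cml-infra-node=true.
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cml-infra-node
                operator: In
                values: ["true"]
  containers:
    - name: example
      image: registry.example.com/example:latest
EOF
}

render_example_pod
```

The toleration alone would only permit the pod on the tainted nodes, and the affinity alone would not stop other pods from landing there, which is why both are needed. To check the manifest syntax without creating anything, the output can be piped to kubectl apply --dry-run=client -f -.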

To use ring fencing:
  • The cluster must be configured for ring fencing, ensuring a dedicated set of nodes is allocated for Cloudera AI infrastructure workloads.

    To enable ring fencing for one workbench, the nodes dedicated to Cloudera AI infrastructure must meet the following minimum resource requirements:
    • 32 CPU cores
    • 60 GiB of memory
  • The designated nodes must be configured with the appropriate labels and taints.

    Apply the taint and label to each node intended for Cloudera AI infrastructure scheduling by running the following commands, replacing [***NODE NAME***] with the actual node name in the cluster:

    kubectl taint nodes [***NODE NAME***] cml-infra=true:NoSchedule
    kubectl label nodes [***NODE NAME***] cml-infra-node="true" --overwrite
    
  • A rolling restart of Cloudera Embedded Container Service is required to move non-Cloudera AI infrastructure workloads off the nodes dedicated to Cloudera AI infrastructure and onto the remaining nodes. The restart is recommended in the following scenarios:
    • When the nodes are tainted for the first time.
    • Each time new nodes are added (that is, when taints and labels are applied to additional nodes).
  • For OpenShift Container Platform installations, ensure that the nodes are tainted before installing cpd-pvc. Afterward, a manual rolling restart of the deployment is necessary.
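
When several nodes are dedicated, the taint and label commands above can be wrapped in a small script. The following is a minimal sketch, not a Cloudera-provided tool: the function names and the KUBECTL override variable are hypothetical, and the --overwrite flag on kubectl taint is an addition that lets the script be rerun on nodes that are already tainted. The script also prints each node's taints and its reported CPU and memory capacity so that they can be compared by hand against the minimum requirements.

```shell
#!/bin/sh
set -eu

# Override to preview the commands without a cluster,
# for example: KUBECTL="echo kubectl" sh prepare-nodes.sh worker-1
KUBECTL="${KUBECTL:-kubectl}"

# Taint and label one node for Cloudera AI infrastructure scheduling.
# --overwrite on the taint is an addition so that reruns do not fail.
prepare_infra_node() {
  $KUBECTL taint nodes "$1" cml-infra=true:NoSchedule --overwrite
  $KUBECTL label nodes "$1" cml-infra-node="true" --overwrite
}

# Print the name, taints, CPU capacity, and memory capacity of one node.
show_infra_node() {
  $KUBECTL get node "$1" \
    -o jsonpath='{.metadata.name}{"\t"}{.spec.taints}{"\t"}{.status.capacity.cpu}{"\t"}{.status.capacity.memory}{"\n"}'
}

# Prepare and report every node name passed on the command line.
for node in "$@"; do
  prepare_infra_node "$node"
  show_infra_node "$node"
done
```

Saved under a hypothetical name such as prepare-nodes.sh, the script would be run as sh prepare-nodes.sh followed by the node names. The rolling restart described above is still required afterward.
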

Enable Ring Fencing as part of provisioning a workbench.

Ring fencing can be enabled only during the creation of a workbench. If the cluster is configured for ring fencing, the option to enable it appears during the workbench creation process.

Figure 1. Enabling Ring Fencing

Once ring fencing is enabled, all Cloudera AI infrastructure pods for that workbench are scheduled exclusively on the dedicated Cloudera AI nodes, which isolates their resources from other workloads.

To verify whether Ring Fencing is enabled for a workbench, select 'View Workbench Details' from the 'Actions' menu.
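
Pod placement can also be cross-checked from the command line. The following is a minimal sketch under stated assumptions: the function names and the KUBECTL override variable are hypothetical, and the namespace argument is assumed to be the Kubernetes namespace that hosts the workbench, which this topic does not specify. It lists each pod in that namespace with its node and marks whether the node carries the cml-infra-node=true label.

```shell
#!/bin/sh
set -eu

KUBECTL="${KUBECTL:-kubectl}"

# Reads "POD NODE" lines on standard input. $1 is a space-separated list of
# dedicated node names. Appends "dedicated" or "other" to each line.
annotate_placement() {
  while read -r pod node; do
    case " $1 " in
      *" $node "*) echo "$pod $node dedicated" ;;
      *)           echo "$pod $node other" ;;
    esac
  done
}

# $1 is the namespace assumed to host the workbench.
check_workbench_placement() {
  infra_nodes="$($KUBECTL get nodes -l cml-infra-node=true \
    -o jsonpath='{.items[*].metadata.name}')"
  $KUBECTL get pods -n "$1" --no-headers \
    -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName |
    annotate_placement "$infra_nodes"
}

# Run only when a namespace is passed on the command line.
if [ "$#" -gt 0 ]; then
  check_workbench_placement "$1"
fi
```

Pods reported as other are not necessarily a problem, because ring fencing applies to Cloudera AI infrastructure pods and not to every pod in the namespace.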