Limitations on Azure

This section lists some resource limits that CML and Azure impose on workloads running in ML workspaces.

  • You must request access from Microsoft before you can set up an Azure NetApp Files account. You are prompted to do this when you first try to set up the account, and Microsoft may take some time to grant access.

  • Remote access (via Kubeconfig) cannot be granted to or revoked from specific users. Any user with the MLAdmin role in the environment can download a Kubeconfig file, and that file continues to allow access even if the MLAdmin role is later revoked.

  • Azure Kubernetes Service (AKS) does not allow the CPU worker group to scale down to 0 nodes; there will always be at least 1 worker, even on an idle workspace.

  • Scaling down CPU workers can sometimes be delayed by the scheduling of AKS services on the node in question. For more information, see AKS issue 875.

  • Support is limited to regions that provide AKS. Before deploying, also check that Azure NetApp Files and the GPU instance types you need are available in your intended region.

  • Data is not encrypted in transit to Azure NetApp Files or other NFS systems, so implement policies that enforce security at the network level.

  • Each ML workspace requires a separate subnet. For more information on this issue, see Use kubenet networking with your own IP address ranges in Azure Kubernetes Service (AKS).

  • Heavy AKS activity can trigger the default API rate limits, which causes throttling and can eventually cause AKS clusters to fail. For some examples, see AKS issue 1187 and AKS issue 1413.

  • When you provision an Azure Kubernetes Service (AKS) cluster, a Standard load balancer is provisioned by default. The Standard load balancer always provisions a public IP for egress traffic, communication with the Kubernetes control plane, and backwards compatibility. Cloudera software does not use this public IP directly, or expose anything on it. It is currently not possible to provide networking rules or other mechanisms to enable public AKS to work. For more information, see Use a public Standard Load Balancer in Azure Kubernetes Service (AKS).
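
Because each ML workspace requires a separate subnet, it can help to plan non-overlapping address ranges before provisioning. The following is a minimal sketch of that planning step using only the Python standard library. The virtual network range, workspace names, and subnet size are hypothetical values chosen for illustration; this page does not state how large a workspace subnet must be, and the helper function is not part of CML or Azure tooling.

```python
import ipaddress


def plan_workspace_subnets(vnet_cidr, workspace_names, new_prefix):
    """Assign each workspace its own non-overlapping subnet from vnet_cidr.

    All arguments are illustrative assumptions: pick values that match
    your own virtual network and sizing requirements.
    """
    vnet = ipaddress.ip_network(vnet_cidr)
    # subnets() yields consecutive, non-overlapping blocks of the given size.
    candidates = vnet.subnets(new_prefix=new_prefix)
    plan = {}
    for name in workspace_names:
        try:
            plan[name] = next(candidates)
        except StopIteration:
            raise ValueError(
                f"{vnet_cidr} cannot hold a /{new_prefix} subnet for {name!r}"
            ) from None
    return plan


# Hypothetical virtual network range and workspace names.
plan = plan_workspace_subnets("10.0.0.0/16", ["ws-dev", "ws-prod"], 24)
for name, subnet in plan.items():
    print(name, subnet)
```

The function raises ValueError when the address range is too small for the requested number of workspaces, which surfaces a capacity problem before any subnet is created.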