Limitations on Azure

This section lists some resource limits that CML and Azure impose on workloads running in ML workspaces.

  • You must request access from Microsoft before you can set up an Azure NetApp Files account. You are prompted to do this the first time you try to set up the account, and it may take some time for Microsoft to grant access.

  • Remote access (via Kubeconfig) cannot be granted to or revoked from specific users. Any user with the MLAdmin role in the environment can download a Kubeconfig file, and that file continues to allow access even if the MLAdmin role is later revoked.

  • Azure Kubernetes Service (AKS) does not allow the CPU worker group to scale down to 0 nodes; there will always be at least 1 worker, even on an idle workspace.

  • Scaling down CPU workers can sometimes be delayed by the scheduling of AKS services on the node in question. For more information, see AKS issue 875.

  • Support is limited to regions that provide AKS. Before deploying, also check that Azure NetApp Files and the required GPU instance types are available in the intended region.

  • Each ML workspace requires a separate subnet. For more information on this issue, see Use kubenet networking with your own IP address ranges in Azure Kubernetes Service (AKS).

  • Heavy AKS activity can trigger the default API rate limits, which leads to throttling and eventually to failures for AKS clusters. For some examples, see AKS issue 1187 and AKS issue 1413.
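
Because each ML workspace requires a separate subnet, it can help to plan the address space before provisioning. The following is a minimal sketch, using only the Python standard library, of how a virtual network range might be divided into non-overlapping subnets, one per workspace. The address range (10.0.0.0/16), the /24 subnet size, the workspace names, and the helper plan_workspace_subnets are all hypothetical and chosen for illustration; they are not values required by CML or Azure.

```python
import ipaddress


def plan_workspace_subnets(vnet_cidr, workspace_names, new_prefix=24):
    """Assign one non-overlapping subnet of the VNet range to each workspace.

    Hypothetical planning helper: it only does address arithmetic and does
    not call any Azure or CML API.
    """
    vnet = ipaddress.ip_network(vnet_cidr)
    # subnets() yields consecutive, non-overlapping blocks of the given size.
    candidates = vnet.subnets(new_prefix=new_prefix)
    plan = {}
    for name in workspace_names:
        try:
            plan[name] = next(candidates)
        except StopIteration:
            raise ValueError(
                f"{vnet_cidr} has no free /{new_prefix} subnet left for {name}"
            ) from None
    return plan


# Example values are assumptions, not recommended settings.
plan = plan_workspace_subnets("10.0.0.0/16", ["ml-dev", "ml-prod"])
for name, subnet in plan.items():
    print(name, subnet)
# ml-dev 10.0.0.0/24
# ml-prod 10.0.1.0/24
```

The sketch raises an error when the range cannot hold another subnet, which makes it visible during planning how many workspaces a given virtual network can accommodate.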