Network Planning for Cloudera Machine Learning on Azure
Plan the network before you deploy your Azure virtual network and set up your Azure Environment and ML Workspaces.
As an example, a minimum architecture to support two ML Workspaces would comprise the following:
- An Azure virtual network. Cloudera Machine Learning can use an existing virtual network if available.
- One subnet dedicated to the Azure NetApp Files service.
- One subnet for each ML Workspace.
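The layout above can be sketched with Python's standard `ipaddress` module. This is a minimal illustration only: the virtual network address space `10.1.0.0/16`, the subnet names, and the `/28` size chosen for the Azure NetApp Files subnet are all assumptions made for the example, not values taken from this document. The `/26` workspace size follows the default subnet size recommended in the considerations on this page.

```python
import ipaddress

# Hypothetical address space for the virtual network (an assumption for this sketch).
vnet = ipaddress.ip_network("10.1.0.0/16")


def plan_subnets(vnet, workspace_count, workspace_prefix=26, netapp_prefix=28):
    """Carve non-overlapping subnets out of the virtual network.

    Returns a dict mapping a (hypothetical) subnet name to its network.
    """
    # Generator of consecutive workspace-sized blocks inside the virtual network.
    blocks = vnet.subnets(new_prefix=workspace_prefix)

    plan = {}
    for i in range(1, workspace_count + 1):
        plan[f"ml-workspace-{i}"] = next(blocks)

    # Take the NetApp subnet from the next unused block so that it cannot
    # overlap any workspace subnet.
    spare_block = next(blocks)
    plan["netapp-files"] = next(spare_block.subnets(new_prefix=netapp_prefix))
    return plan


plan = plan_subnets(vnet, workspace_count=2)
for name, subnet in plan.items():
    print(f"{name}: {subnet} ({subnet.num_addresses} addresses)")
```

Allocating every subnet from one sequential generator keeps the ranges disjoint by construction, which is easier to reason about than picking ranges by hand.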
Keep the following considerations in mind when planning your network:
- Each ML Workspace requires one subnet in the virtual network.
- Plan the CIDR addresses for each subnet so that the ranges do not overlap.
- Each subnet should use a /26 CIDR. A /26 subnet accommodates a maximum of 30 worker nodes plus the 4 infrastructure nodes for Cloudera Machine Learning.
- To use GPUs, create a virtual network with /25 CIDR subnets to accommodate a maximum of 30 GPU nodes. If you originally created a network with /26 CIDR subnets and later need to add GPU support, you must create a new network with /25 CIDR subnets, because subnets cannot be resized.
- The recommended NFS for use with Cloudera Machine Learning on Azure is Azure NetApp Files v3.
- Subnets must not use either of the following reserved CIDR blocks: 10.0.0.0/16 and 10.244.0.0/16.
- Even if private IP addresses are used for the CML service, each AKS cluster provisions a public IP for egress traffic, communication with the Kubernetes control plane, and backwards compatibility. For more information, see Use a public Standard Load Balancer in Azure Kubernetes Service (AKS).
- In a Single Resource Group setup in Azure, the entire CDP stack is expected to use the resource group provided by the customer and not create any new resource groups. However, Azure automatically creates new resource groups for Azure-managed resources, such as the resource group for AKS compute worker nodes.
- Each CML workspace cluster must not share any subnets or routing tables with any other CDP experience or AKS cluster.
- After provisioning a workspace, do not remove the firewall rules that allow inbound and outbound access.
- Installations must comply with firewall requirements set by cloud providers at all times. Ensure that ports required by the provider are not closed. For example, Kubernetes services have requirements documented in Use Azure Firewall to protect Azure Kubernetes Service (AKS) Deployments.
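Several of the addressing considerations above (non-overlapping ranges, the /26 or /25 subnet size, and the reserved CIDR blocks) can be checked mechanically before deployment. The following is a rough sketch using Python's standard `ipaddress` module. The function name `validate_plan`, the subnet names, and the example addresses are hypothetical. It also assumes that "use" of a reserved block means any overlap with it, and that workspace subnets should match the recommended prefix length exactly.

```python
import ipaddress
from itertools import combinations

# Reserved blocks listed in the considerations above.
RESERVED_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/16"),
    ipaddress.ip_network("10.244.0.0/16"),
]


def validate_plan(workspace_subnets, other_subnets=None, gpu=False):
    """Return a list of problems found in a subnet plan (empty if none).

    workspace_subnets / other_subnets: dicts of name -> CIDR string.
    gpu: when True, workspace subnets are expected to be /25 instead of /26.
    """
    expected_prefix = 25 if gpu else 26
    problems = []

    workspaces = {n: ipaddress.ip_network(c) for n, c in workspace_subnets.items()}
    others = {n: ipaddress.ip_network(c) for n, c in (other_subnets or {}).items()}

    # Workspace subnets should have the recommended size.
    for name, net in workspaces.items():
        if net.prefixlen != expected_prefix:
            problems.append(f"{name}: expected /{expected_prefix}, got /{net.prefixlen}")

    all_nets = {**workspaces, **others}

    # No subnet may overlap a reserved block (assumed interpretation).
    for name, net in all_nets.items():
        for block in RESERVED_BLOCKS:
            if net.overlaps(block):
                problems.append(f"{name}: {net} overlaps reserved block {block}")

    # No two subnets may overlap each other.
    for (name_a, net_a), (name_b, net_b) in combinations(all_nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} and {name_b} overlap: {net_a} and {net_b}")

    return problems


good = validate_plan(
    {"ml-workspace-1": "10.1.0.0/26", "ml-workspace-2": "10.1.0.64/26"},
    {"netapp-files": "10.1.0.128/28"},
)
bad = validate_plan(
    {"ml-workspace-1": "10.0.1.0/26", "ml-workspace-2": "10.0.1.32/27"},
)
for problem in bad:
    print(problem)
```

This sketch covers addressing only. The considerations about public egress IPs, resource groups, route tables, and firewall rules still need to be reviewed separately.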