Gateways and route tables

This topic covers recommended gateway and route table configurations for CDP Public Cloud for Azure.

Connectivity from Control Plane to CDP workloads

  • As described in the Subnets section above, each CDP data service workload requires its own subnet and a non-shared route table associated with it (see Bring your own subnet and route table with kubenet).
  • As described in the Taxonomy of network architectures, nodes in the CDP workloads need to connect to the CDP Control Plane over the internet to establish a “tunnel” over which the CDP Control Plane can send instructions to the workloads.
  • Private AKS cluster is a feature that lets CDP access a Kubernetes workload cluster over a private IP address (see Enabling a private CDW environment in Azure Kubernetes Service). When it is enabled for a CDW workload, CDW requires the user to have already set up internet connectivity for the subnet (see Outbound type of userDefinedRouting).
  • CDP data services such as Datalake and CML create a public load balancer for internet connectivity.
  • If a firewall is configured, the destinations described in Azure outbound network access destinations need to be allowed for the CDP workloads to work.

Connectivity from customer on-prem to CDP workloads

  • As described in the Use cases section (see CDP Public Cloud reference network architecture for Azure>Use Cases), data consumers need to access data processing or consumption services in the CDP workloads.
  • Given that these are created with private IP addresses in private subnets, the customers need to arrange for access to these addresses from their on-prem or corporate networks in specific ways.
  • There are several possible solutions for achieving this, but one that is depicted in the architectural diagram, uses Azure VPN Gateway.
  • Each workload provides an option to connect to the workload endpoint over a public or a private IP. It's recommended that the public IPs are disabled and that users rely on the corporate network connectivity for accessing the workloads with private IPs.