Using Azure Database for PostgreSQL Flexible Server

CDP uses Azure Database for PostgreSQL Flexible Server, which allows a highly available database to be deployed for Data Lake and Data Hub clusters. You can create Flexible Server instances with public access, where the Azure Database for PostgreSQL server is accessed through a public endpoint, or with private access, where the Flexible Server has no public endpoint accessible through the internet. Private access requires a private DNS zone to be specified and can be set up in one of two ways: through Azure Private Link for the Azure Database for PostgreSQL service or, when Private Link is not used, through a delegated subnet that must be created and added to your CDP Azure environment beforehand.

Using the Flexible Server offers the following benefits to CDP customers:

  • Flexible Server allows you to deploy PostgreSQL version 14 and above. See Supported PostgreSQL major versions in Azure Database for PostgreSQL - Flexible Server.
  • Unlike the previously used Single Server database instances, Flexible Server database instances can be stopped and restarted when Data Lake and Data Hub clusters are stopped and restarted. This saves database costs while the clusters are stopped.
  • Flexible Server is multi-AZ capable and offers zone-redundant High Availability. With the Flexible Server, Data Lakes are backed by a highly available PostgreSQL configuration of two instances. When using a multi-AZ deployment, the Flexible Server instances are deployed in multiple availability zones for additional fault tolerance.

For a detailed comparison of Single Server and Flexible Server offerings, refer to the Comparison chart: Azure Database for PostgreSQL - Flexible Server vs. Single Server in the Azure documentation.

Database server options

You can create Flexible Server instances with public access, where the Azure Database for PostgreSQL server is accessed through a public endpoint, or with private access, where the Flexible Server has no public endpoint accessible through the internet. There are two private Flexible Server options: Flexible Server using Azure Private Link for the Azure Database for PostgreSQL service, or Flexible Server using a delegated subnet that needs to be created and added to your CDP Azure environment beforehand. Flexible Server with Private Link is the more advanced and recommended option. It offers easy connectivity to other Azure services that use Private Link for networking, it allows a server to be reachable from public and private networks through both private and public addressing at the same time, and it does not require delegated subnets to be created. For details, see Private Link with Azure Database for PostgreSQL - Flexible Server. Both private access options require a private DNS zone to be specified.

The Flexible Server with public access is used by default, as it does not require any special networking setup. During environment creation you can instead choose the Flexible Server with private access, either with Private Link or with a delegated subnet. It is also possible to choose the Single Server. However, the Flexible Server with delegated subnet and the Single Server options are marked as deprecated, and using them is not recommended because they will be phased out. Azure Single Server only supports PostgreSQL versions up to 11. PostgreSQL versions newer than 11 require the use of Azure Flexible Server. For details, see What happens to Azure Database for PostgreSQL - Single Server after the retirement announcement?

New CDP environments on Azure automatically use Flexible Server with public endpoints, and Data Hubs automatically inherit the settings from the environment they run in. You can also enable Flexible Server when creating a Data Hub.

In general, when registering an Azure environment and creating a Data Hub, you can choose to use Flexible Server, Flexible Server with Private Link, Flexible Server with Delegated Subnet (deprecated), Single Server (deprecated), or Single Server with Private Link (deprecated), but the exact options vary depending on the environment settings. You have the following options when Data Lake or Data Hub creation is initiated:

  • If the parent environment has been configured for private access, and the database type is set as Flexible Server, the Data Hub or Data Lake cluster is launched with Flexible Server with Private Link if no delegated subnets are specified.
  • If the parent environment has been configured for private access, and the database type is set as Flexible Server, the Data Hub or Data Lake cluster is launched with Flexible Server with delegated subnet if there is a delegated subnet specified.
  • If the parent environment has been configured for private access, and the database type is not set, a cluster with a Flexible Server is launched. You must specify the database type as Single Server or Single Server with Private Link to launch the Data Hub or Data Lake with a Single Server database.
  • If the environment has been configured for public access, and the database type is not set specifically as Single Server, the Data Hub or Data Lake cluster is launched with public Flexible Server.
  • If the environment has been configured for public access, and the database type is set as Single Server, the Data Hub or Data Lake cluster is launched with public Single Server.
  • If an environment is configured with Azure Single Server and Service Endpoints, new Data Hub clusters provisioned in this environment are created with public Flexible Servers by default. If you would still like to use a Single Server in this case, use the --database-type=SINGLE_SERVER CLI parameter, or select the same setting in the UI under the Advanced Data Hub options.
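
The selection rules in the list above can be summarized as a small decision function. The following Python sketch is only an illustration and is not CDP code: the DatabaseType enum, the resolve_database function, its parameters, and the returned labels are all hypothetical names chosen for this example. It also makes two assumptions that the text does not state: an unset database type in a private environment follows the same delegated subnet rule as an explicitly set Flexible Server, and Single Server with Private Link in a public environment is rejected.

```python
from enum import Enum
from typing import Optional


class DatabaseType(Enum):
    """Hypothetical names for the database type a user may request."""
    FLEXIBLE_SERVER = "FLEXIBLE_SERVER"
    SINGLE_SERVER = "SINGLE_SERVER"
    SINGLE_SERVER_PRIVATE_LINK = "SINGLE_SERVER_PRIVATE_LINK"


def resolve_database(private_access: bool,
                     requested: Optional[DatabaseType] = None,
                     has_delegated_subnet: bool = False) -> str:
    """Return a label for the database that would be launched."""
    if private_access:
        # Single Server is used only when it is requested explicitly.
        if requested is DatabaseType.SINGLE_SERVER:
            return "Single Server"
        if requested is DatabaseType.SINGLE_SERVER_PRIVATE_LINK:
            return "Single Server with Private Link"
        # Flexible Server, requested or defaulted. Assumption: an unset
        # type follows the same delegated subnet rule as an explicit one.
        if has_delegated_subnet:
            return "Flexible Server with Delegated Subnet"
        return "Flexible Server with Private Link"

    # Public access (this also covers Service Endpoints environments).
    if requested is DatabaseType.SINGLE_SERVER:
        return "Single Server (public)"
    if requested is DatabaseType.SINGLE_SERVER_PRIVATE_LINK:
        # Not described in the text; rejecting it is this sketch's choice.
        raise ValueError("Single Server with Private Link needs private access")
    return "Flexible Server (public)"


if __name__ == "__main__":
    # Private environment, nothing requested, no delegated subnet.
    print(resolve_database(private_access=True))
    # Public environment with Single Server requested explicitly.
    print(resolve_database(False, DatabaseType.SINGLE_SERVER))
```

Read this way, Flexible Server is the outcome in every case except when a Single Server type is requested explicitly, which matches the deprecation of the Single Server options.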

Limitations

The following limitations currently apply:

  • If the Data Lake or Data Hubs use server-side encryption (SSE) configured with Customer Managed Keys (CMK), an upgrade from Azure Single Server to Azure Flexible Server is currently not available. This capability is expected to be released soon.

To set up this feature, review the Azure prerequisites. You can then enable Flexible Server during Azure environment registration in CDP or upgrade existing Azure Database for PostgreSQL Single Servers to Flexible Servers, as described in the linked documentation: