This document describes how to upgrade the database to the latest version
supported by Cloudera services. You can use either the Cloudera UI or the CDP CLI to perform this upgrade.
Several Cloudera services, including Data Lake clusters,
Cloudera Data Hub cluster templates, and Data Services, require a
relational database. Most of these databases are external.
The databases used by the Data Lake and some of the Cloudera Data Hub templates are hosted on external instances that are
provisioned during the initial deployment of the respective service. For these external
databases, Cloudera uses the cloud-native service offerings
of the three supported cloud service providers: AWS RDS
for PostgreSQL, Azure Database for PostgreSQL, and Cloud
SQL for PostgreSQL.
Databases used by other Cloudera Data Hub templates are hosted
on an embedded database instance, typically co-located on the Cloudera Manager host, in order to reduce the resource footprint.
Cloudera provides a database upgrade capability
that allows moving both external and embedded databases to a higher major version.
The database upgrade is a fully automated operation. The upgrade process itself completes
all of the required steps, including creating a backup, stopping and upgrading the database,
restarting the database, and running post-upgrade maintenance tasks. You are not required to
manually stop the Postgres instances before the upgrade.
This is a one-time operation. Once the database of a Data Lake or Cloudera Data Hub has been successfully upgraded to the newer major version,
no further action is needed for the respective cluster.
If a cluster uses a database that requires an upgrade, a notification
appears in the Cloudera Management Console UI.
Running the database upgrade operation on a Cloudera Data Hub
cluster automatically stops all cluster services (Cloudera Manager and Cloudera Runtime services);
you do not need to stop them manually. Before a Data Lake database upgrade, Cloudera recommends
stopping the attached Cloudera Data Hub clusters and Data Services.
For AWS and GCP environments, the database upgrade operation triggers a backup and a
major version upgrade of the attached external database. For Azure environments, the
mechanism is different: in the background, the operation creates a new database instance with a
higher major version and transfers the data from the older database instance.
Instructions
Here are the UI and CLI instructions to perform Database Upgrade on Data Lake and Cloudera Data Hub:
1. In Cloudera Management Console, go to Environments and select the cluster to upgrade
from the list of available clusters. Clusters that are eligible for this upgrade are
indicated in the rightmost column.
2. Once you select the cluster, a message asks you to upgrade
the PostgreSQL version. Click Upgrade database.
3. Click Upgrade in the confirmation box.
4. Once the Data Lake database is upgraded, check whether any Cloudera Data Hub clusters attached to that Data Lake show a database
upgrade notification, and upgrade their databases as described above.
Data Lake Database upgrade:
You can perform a Data Lake database upgrade using the
cdp datalake start-database-upgrade CLI command.
The --target-version parameter is optional. If you do not provide it,
the database will be upgraded to PostgreSQL 14.
cdp datalake start-database-upgrade --help --form-factor public

NAME
    start-database-upgrade - Upgrades the database of the Data Lake cluster.

DESCRIPTION
    This command initiates the upgrade of the database of the Data Lake
    cluster.

SYNOPSIS
    start-database-upgrade
      --datalake <value>
      --target-version <value>
      [--cli-input-json <value>]
      [--generate-cli-skeleton]

OPTIONS
    --datalake (string)
        The name or CRN of the Data Lake.

    --target-version (string)
        The database engine major version to upgrade to.

        Possible values:
        o VERSION_14
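
As a minimal sketch of how the command above might be invoked, the following shell snippet wraps it in a small helper. The helper name datalake_db_upgrade, the DRY_RUN variable, and the Data Lake name my-datalake are illustrative assumptions and are not part of the CDP CLI. Only the --datalake and --target-version options and the VERSION_14 value come from the help output above. By default the helper only prints the command so that it can be reviewed before anything is changed.

```shell
# Hypothetical helper (not part of the CDP CLI): prints the Data Lake
# database upgrade command by default, and runs it only when DRY_RUN=0.
datalake_db_upgrade() {
  datalake="$1"                 # Data Lake name or CRN
  target="${2:-VERSION_14}"     # the only value listed in the help output
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "cdp datalake start-database-upgrade --datalake $datalake --target-version $target"
  else
    cdp datalake start-database-upgrade --datalake "$datalake" --target-version "$target"
  fi
}

# "my-datalake" is a placeholder; replace it with your Data Lake name or CRN.
datalake_db_upgrade "my-datalake"
```

Setting DRY_RUN=0 runs the command instead of printing it, which assumes that the CDP CLI is installed and configured with valid credentials.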
Cloudera Data Hub Database upgrade:
You can perform a Cloudera Data Hub database upgrade using the
cdp datahub start-database-upgrade CLI command.
The --target-version parameter is optional. If you do not provide it,
the database will be upgraded to PostgreSQL 14.
cdp datahub start-database-upgrade --help --form-factor public

NAME
    start-database-upgrade - Upgrades the database of the Data Hub cluster.

DESCRIPTION
    This command initiates the upgrade of the database of the Data Hub
    cluster.

SYNOPSIS
    start-database-upgrade
      --datahub <value>
      --target-version <value>
      [--cli-input-json <value>]
      [--generate-cli-skeleton]

OPTIONS
    --datahub (string)
        The name or CRN of the Data Hub.

    --target-version (string)
        The database engine major version to upgrade to.

        Possible values:
        o VERSION_14
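
When several Cloudera Data Hub clusters are attached to the same Data Lake, the same command can be repeated for each of them. The sketch below assumes this is done after the Data Lake database upgrade has completed. The helper name datahub_db_upgrade, the DRY_RUN variable, and the cluster names etl-hub and streaming-hub are hypothetical placeholders. Only the options shown in the help output above are used.

```shell
# Hypothetical helper (not part of the CDP CLI): prints the Data Hub
# database upgrade command by default, and runs it only when DRY_RUN=0.
datahub_db_upgrade() {
  datahub="$1"   # Data Hub name or CRN
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "cdp datahub start-database-upgrade --datahub $datahub --target-version VERSION_14"
  else
    cdp datahub start-database-upgrade --datahub "$datahub" --target-version VERSION_14
  fi
}

# Placeholder names; replace them with the Data Hubs attached to your Data Lake.
for name in etl-hub streaming-hub; do
  datahub_db_upgrade "$name"
done
```

Because the command only initiates the upgrade, each cluster's progress still needs to be checked separately before any follow-up work is started.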
The progress of the upgrade can be tracked on the respective service’s Event
History page. You can verify a successful database upgrade in the Event History or in the
Database tab of the cluster. Once the upgrade is complete, Cloudera recommends verifying your workloads before
attempting an additional Cloudera Runtime or OS upgrade.