A set of pre-upgrade checks runs after you choose the upgrade version. These
checks verify that your cluster is ready for the upgrade.
In the Cloudera Manager UI, under Getting Started for Upgrade Step in the
Upgrade wizard, select the repository URL for the upgrade. A pre-upgrade
checklist then appears and verifies that the hosts and services in your cluster
are ready for the upgrade.
Click Download Upgrade Validator. This downloads the upgrade validator onto all
of your Cloudera Embedded Container Service hosts. The validator is required to
run the Control Plane Health Check and the Docker Registry Health Check.
Once the Download Upgrade Validator step completes, the Control Plane Health
Check and the Docker Registry Health Check run automatically.
Here is the pre-upgrade checklist:
Checklist
Description
Hosts check
This verifies the host health status and runs the host prerequisite
inspections and the host warning inspections.
Host Health Status - This check verifies that no hosts are in bad or
concerning health. It also checks for any stopped roles on the hosts.
Host Prerequisites Inspections - These are host inspections
that must pass before you can proceed with the upgrade. Currently, the
prerequisite inspections include:
EcsHostDnsInspection - Checks that there are fewer than three
nameserver entries in the /etc/resolv.conf file, and checks
the connections to the Cloudera Manager cluster and the
Cloudera console. It also checks whether
vault.localhost.localdomain can be resolved by pinging it.
If it cannot, the host's /etc/nsswitch.conf file is likely
misconfigured.
If this inspection fails:
Check the /etc/resolv.conf and /etc/nsswitch.conf files.
Ensure that /etc/resolv.conf contains fewer than three
nameserver entries and that /etc/nsswitch.conf contains
myhostname in the hosts field.
Check that the connections resolve correctly. If the
connection to the Cloudera console fails, check that your
DNS wildcard is configured properly.
Host Warning Inspections - These are host inspections that
detect potential factors that can cause issues during an
upgrade. Currently, the warning inspections include:
SecuritySoftwareInspection - Checks to make sure that
there are no security software processes running on the
hosts in the cluster.
Upgrade Storage Inspection - Checks that there is at least
100 GB of free space under /var/lib/ and at least 200 GB of
free space under the Docker data directory.
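
The DNS file conditions and the free-space thresholds described above can be
approximated by hand on a single host. The following is a minimal sketch, not
the inspection that Cloudera Manager runs: the function names are
hypothetical, the file parsing is simplified, "GB" is treated as GiB, and the
Docker data directory path must be supplied by the caller.

```python
import shutil

GIB = 1024 ** 3  # assumption: "GB" in the checklist is treated as GiB here


def count_nameservers(resolv_conf_text):
    """Count 'nameserver' entries, ignoring comments and blank lines."""
    count = 0
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            count += 1
    return count


def hosts_sources(nsswitch_conf_text):
    """Return the sources listed on the 'hosts:' line of nsswitch.conf."""
    for line in nsswitch_conf_text.splitlines():
        stripped = line.split("#", 1)[0].strip()
        if stripped.startswith("hosts:"):
            return stripped[len("hosts:"):].split()
    return []


def dns_config_problems(resolv_conf_text, nsswitch_conf_text):
    """List the DNS file conditions from the checklist that are not met."""
    problems = []
    nameservers = count_nameservers(resolv_conf_text)
    if nameservers >= 3:
        problems.append(
            f"/etc/resolv.conf has {nameservers} nameserver entries; "
            "fewer than three are expected"
        )
    if "myhostname" not in hosts_sources(nsswitch_conf_text):
        problems.append(
            "/etc/nsswitch.conf does not list myhostname in the hosts field"
        )
    return problems


def has_free_space(path, required_gib):
    """True if the filesystem holding `path` has at least `required_gib` GiB free."""
    return shutil.disk_usage(path).free >= required_gib * GIB
```

On a host, pass the contents of /etc/resolv.conf and /etc/nsswitch.conf to
dns_config_problems, and call has_free_space("/var/lib", 100) and
has_free_space with your Docker data directory and 200. The sketch does not
test the connections to Cloudera Manager or the Cloudera console.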
Services Health Check
This verifies that there are no services in bad or concerning
health.
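
The rule above reduces to a filter over health states. The following is a
minimal sketch under the assumption that each service is represented as a
record with name and healthSummary fields, as in service listings returned by
the Cloudera Manager REST API. The function name is hypothetical, and
fetching the records is not shown.

```python
BLOCKING_STATES = {"BAD", "CONCERNING"}


def services_blocking_upgrade(services):
    """Return (name, healthSummary) pairs for services in bad or concerning health."""
    return [
        (service.get("name"), service.get("healthSummary"))
        for service in services
        if service.get("healthSummary") in BLOCKING_STATES
    ]
```

An empty result means that no service in the supplied list is in bad or
concerning health.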
Download Upgrade Validator
This downloads the upgrade validator, which is used to run the Control Plane
and Docker Registry health checks, onto all the hosts in the cluster.
Control Plane Health Check
This verifies that the control plane is in a healthy state before the upgrade.
It performs the following checks:
Longhorn Health Check: This verifies that all the
Longhorn volumes are in a healthy, robust state. It also verifies
that PVCs are bound.
Longhorn Engine Check: This verifies that the
Longhorn engine version matches the current Longhorn manager
version.
RKE2 Health Check: This verifies that the
Kubernetes API server is reachable and the nodes are in a Ready and
Schedulable state.
Pod Readiness Health Check: This verifies that all
the pods in the kube-system, longhorn-system, vault-system,
yunikorn, Kubernetes, ecs-webhooks, and cdp
namespaces are in a Ready state.
Vault Health Check: This verifies that the vault-0
pod is running and that Vault is unsealed.
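
Several of the conditions listed above correspond to standard fields in
Kubernetes and Vault status output. The following is a minimal sketch of how
they could be evaluated by hand, not the upgrade validator's implementation.
It assumes that the inputs are already-parsed JSON from
`kubectl get nodes -o json`, `kubectl get pods -A -o json`,
`kubectl get pvc -A -o json`, and `vault status -format=json`. The function
names are hypothetical, and skipping completed pods is an assumption of this
sketch. The Longhorn volume and engine checks are not covered.

```python
def condition_is_true(obj, condition_type):
    """True if the object's status.conditions has the given type with status 'True'."""
    for condition in obj.get("status", {}).get("conditions", []):
        if condition.get("type") == condition_type:
            return condition.get("status") == "True"
    return False


def nodes_not_ready_or_cordoned(node_list):
    """Names of nodes that are not Ready or are marked unschedulable."""
    bad = []
    for node in node_list.get("items", []):
        ready = condition_is_true(node, "Ready")
        cordoned = node.get("spec", {}).get("unschedulable", False)
        if not ready or cordoned:
            bad.append(node["metadata"]["name"])
    return bad


def pods_not_ready(pod_list, namespaces):
    """'namespace/name' of pods in the given namespaces that are not Ready."""
    bad = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        if meta.get("namespace") not in namespaces:
            continue
        if pod.get("status", {}).get("phase") == "Succeeded":
            continue  # assumption: completed job pods are not treated as failures
        if not condition_is_true(pod, "Ready"):
            bad.append(f'{meta["namespace"]}/{meta["name"]}')
    return bad


def pvcs_not_bound(pvc_list):
    """'namespace/name' of PersistentVolumeClaims whose phase is not Bound."""
    return [
        f'{pvc["metadata"]["namespace"]}/{pvc["metadata"]["name"]}'
        for pvc in pvc_list.get("items", [])
        if pvc.get("status", {}).get("phase") != "Bound"
    ]


def vault_is_unsealed(seal_status):
    """True only if the Vault status explicitly reports sealed == false."""
    return seal_status.get("sealed") is False
```

Pass the namespaces named in the Pod Readiness Health Check as the namespaces
argument. Each function returns the offending objects, so an empty list means
that the condition holds for the data supplied.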
Docker Registry Health Check
This verifies that the selected Docker registry is ready for the upgrade:
It verifies the connection to the Docker registry by
pulling an image.
For custom registry setups, it also verifies that the
new required images listed in manifest.json are present in your
registry before the upgrade.
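
For a custom registry, the presence of required images can also be
spot-checked by hand. The following is a minimal sketch that uses a different
technique from the image pull described above: it sends a HEAD request for
each image manifest through the standard registry HTTP API
(/v2/<name>/manifests/<reference>). It assumes anonymous HTTPS access and
tag-based references of the form host/repository:tag. The function names are
hypothetical, and the list of required references is taken as input because
the layout of manifest.json is not described here.

```python
import urllib.error
import urllib.request

# Manifest media types the registry is allowed to answer with.
MANIFEST_TYPES = ", ".join([
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.oci.image.index.v1+json",
])


def split_image_ref(ref):
    """Split 'host/repo/path:tag' into (host, repository, tag)."""
    host, _, remainder = ref.partition("/")
    repository, separator, tag = remainder.rpartition(":")
    if not separator:  # no tag given; assume 'latest'
        repository, tag = remainder, "latest"
    return host, repository, tag


def manifest_url(ref, scheme="https"):
    """Build the registry API URL for the manifest of an image reference."""
    host, repository, tag = split_image_ref(ref)
    return f"{scheme}://{host}/v2/{repository}/manifests/{tag}"


def manifest_exists(url, timeout=10):
    """HEAD the manifest URL; False on 404, other HTTP errors propagate."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"Accept": MANIFEST_TYPES}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def missing_images(required_refs, exists=manifest_exists):
    """Return the required image references whose manifests are not found."""
    return [ref for ref in required_refs if not exists(manifest_url(ref))]
```

The exists parameter lets you substitute a different lookup, for example one
that adds authentication for a registry that does not allow anonymous access.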