Step 3: Backing Up the Cluster

Steps to back up your cluster before the upgrade.


This topic describes how to back up a cluster managed by Cloudera Manager prior to upgrading the cluster. These procedures do not back up the data stored in the cluster. Cloudera recommends that you maintain regular backups of your data using the Backup and Disaster Recovery features of Cloudera Manager.

Minimum Required Role: Cluster Administrator (also provided by Full Administrator)

The following components do not require backups:

MapReduce
YARN
Spark
Impala

Complete the following backup steps before upgrading your cluster:

Back Up Databases

Gather the following information:

Type of database (PostgreSQL, Embedded PostgreSQL, MySQL, MariaDB, or Oracle)
Hostnames of the databases
Database names
Port number used by the databases
Credentials for the databases

Open the Cloudera Manager Admin Console to find the database information for any of the following services you have deployed in your cluster:

Sqoop, Oozie, and Hue – Go to Cluster Name > Configuration > Database Settings.
Note: The Sqoop Metastore uses a HyperSQL (HSQLDB) database. See the HyperSQL documentation for backup procedures.
Note: Sqoop 2 is not supported in CDP Private Cloud Base.
Hive Metastore – Go to the Hive service, select Configuration, and select the Hive Metastore Database category.
Sentry – Go to the Sentry service, select Configuration, and select the Sentry Server Database category.
Ranger – Go to the Ranger service, select Configuration, and search on "database."
Queue Manager – Go to the Queue Manager service, select the Configuration tab. In the List of Filters on the left side, click the Category drop-down and select Database.
Schema Registry and Streams Messaging Manager – Select the service, go to Configuration, and select the Database category.

To back up the databases

Perform the following steps for each database you back up:

If not already stopped, stop the service.
1. On the Home > Status tab, click to the right of the service name and select Stop.
2. Click Stop in the next screen to confirm. When you see a Finished status, the service has stopped.
Back up the database. Substitute the database name, hostname, port, user name, and backup directory path and run the following command:
MySQL
```
mysqldump --databases database_name --host=database_hostname --port=database_port -u database_username -p > backup_directory_path/database_name-backup-`date +%F`-CDH.sql
```
PostgreSQL/Embedded
```
pg_dump -h database_hostname -U database_username -W -p database_port database_name > backup_directory_path/database_name-backup-`date +%F`-CDH.sql
```
Oracle

Work with your database administrator to ensure databases are properly backed up.
For additional information about backing up databases, see these vendor-specific links:
- MariaDB 10.2: https://mariadb.com/kb/en/backup-and-restore-overview/
- MySQL 5.7: https://dev.mysql.com/doc/refman/5.7/en/backup-and-recovery.html
- PostgreSQL 10: https://www.postgresql.org/docs/10/static/backup.html
- Oracle 12c: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/bradv/index.html
Start the service.
1. On the Home > Status tab, click to the right of the service name and select Start.
2. Click Start in the next screen to confirm. When you see a Finished status, the service has started.
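
When several service databases live on the same MySQL or MariaDB server, the per-database command above can be wrapped in a loop. The following is a minimal sketch under that assumption, not a Cloudera-provided script: the function names `backup_file_name` and `backup_mysql_databases` and their argument order are hypothetical, while the `mysqldump` options and the file naming pattern are taken from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and argument order are illustrative assumptions.

# Print the backup file path in the same pattern as the commands above:
#   backup_directory_path/database_name-backup-YYYY-MM-DD-CDH.sql
backup_file_name() {
  local backup_dir="$1" db_name="$2"
  printf '%s/%s-backup-%s-CDH.sql\n' "$backup_dir" "$db_name" "$(date +%F)"
}

# Dump each named database to its own file. Because -p is given without a
# value, mysqldump prompts for the password once per database.
# Usage: backup_mysql_databases HOST PORT USER BACKUP_DIR DB_NAME...
backup_mysql_databases() {
  local host="$1" port="$2" user="$3" backup_dir="$4"
  shift 4
  local db
  for db in "$@"; do
    mysqldump --databases "$db" --host="$host" --port="$port" -u "$user" -p \
      > "$(backup_file_name "$backup_dir" "$db")" || return 1
  done
}
```

A call would look like `backup_mysql_databases database_hostname database_port database_username backup_directory_path database_name_1 database_name_2`, with each placeholder replaced by the values gathered earlier. The loop stops at the first failed dump so that a partial backup is not mistaken for a complete one.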

Back Up ZooKeeper

On all ZooKeeper hosts, back up the ZooKeeper data directory specified with the Data Directory property and ZooKeeper transaction log directory specified with the Transaction Log Directory property in the ZooKeeper configuration. The default location for both these directories is /var/lib/zookeeper.
For example:
```
cp -rp /var/lib/zookeeper/ /var/lib/zookeeper-backup-`date +%F`CM-CDH
```
To identify the ZooKeeper hosts, open the Cloudera Manager Admin Console, go to the ZooKeeper service, and click the Instances tab.

Record the permissions of the files and directories; you will need these to roll back ZooKeeper.
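
One way to record the permissions is to write the mode, owner, and group of every file and directory to a text file that you keep with the backup. The following sketch assumes GNU `find` and GNU `stat` (for the `-c` format option); the function name `record_permissions` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU find and GNU stat. record_permissions is a
# hypothetical helper name, not part of any Cloudera tooling.
# Usage: record_permissions DIRECTORY OUTPUT_FILE
record_permissions() {
  local dir="$1" out="$2"
  # One line per file or directory: octal mode, owner:group, path
  find "$dir" -exec stat -c '%a %U:%G %n' {} + > "$out"
}
```

For the default location this could be run as `record_permissions /var/lib/zookeeper zookeeper-permissions.txt`, where the output file name is only an example. Store the output file outside the directory being recorded.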

Back Up Solr

Cloudera recommends that you back up Solr data and indexes before an upgrade so that the data can be restored during a rollback.

When backing up the Infra Solr service, make sure that you back up all of the following collections:

vertex_index (Atlas)
edge_index (Atlas)
fulltext_index (Atlas)
ranger_audits (Ranger)

Store them in a location that is accessible to the solr service user during a rollback. Cloudera recommends that you target your backups to a shared file system, even if the Solr service uses the local file system.

Before you start: The Solr gateway must be installed on all edge/client nodes.

Create a backup of Solr collections.
For information on backing up Solr collections, see Backing up a collection from HDFS (default location for Workload Solr) or Backing up a collection from local file system (default location for Infra Solr), depending on the type of storage used.

Back Up HDFS

Follow this procedure to back up an HDFS deployment.

If high availability is enabled for HDFS, run the following command on all hosts running the JournalNode role:
```
cp -rp /dfs/jn /dfs/jn-CM-CDH
```

On all NameNode hosts, back up the NameNode runtime directory. Run the following commands:

mkdir -p /etc/hadoop/conf.rollback.namenode

cd /var/run/cloudera-scm-agent/process/ && cd `ls -t1 | grep -e "-NAMENODE\$" | head -1`

cp -rp * /etc/hadoop/conf.rollback.namenode/

rm -f /etc/hadoop/conf.rollback.namenode/log4j.properties

cp -rp /etc/hadoop/conf.cloudera.HDFS_service_name/log4j.properties /etc/hadoop/conf.rollback.namenode/

These commands create a temporary rollback directory. If a rollback is required later, the rollback procedure requires you to modify files in this directory.

Back up the runtime directory for all DataNodes. Run the following commands on all DataNodes:

mkdir -p /etc/hadoop/conf.rollback.datanode/

cd /var/run/cloudera-scm-agent/process/ && cd `ls -t1 | grep -e "-DATANODE\$" | head -1`

cp -rp * /etc/hadoop/conf.rollback.datanode/

rm -f /etc/hadoop/conf.rollback.datanode/log4j.properties

cp -rp /etc/hadoop/conf.cloudera.HDFS_service_name/log4j.properties /etc/hadoop/conf.rollback.datanode/

If high availability is not enabled for HDFS, back up the runtime directory of the Secondary NameNode. Run the following commands on all Secondary NameNode hosts:

mkdir -p /etc/hadoop/conf.rollback.secondarynamenode/

cd /var/run/cloudera-scm-agent/process/ && cd `ls -t1 | grep -e "-SECONDARYNAMENODE\$" | head -1`

cp -rp * /etc/hadoop/conf.rollback.secondarynamenode/

rm -f /etc/hadoop/conf.rollback.secondarynamenode/log4j.properties

cp -rp /etc/hadoop/conf.cloudera.HDFS_service_name/log4j.properties /etc/hadoop/conf.rollback.secondarynamenode/
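
The NameNode, DataNode, and Secondary NameNode commands above differ only in the role name, so they can be expressed as one parameterized function. The following is a sketch of that idea rather than a supported script: the function name `backup_role_conf` and the `PROCESS_ROOT` and `ROLLBACK_ROOT` variables are hypothetical, introduced here so the paths can be overridden; their defaults are the paths used in the commands above.

```shell
#!/usr/bin/env bash
# Sketch only. PROCESS_ROOT and ROLLBACK_ROOT are hypothetical overrides;
# the defaults match the paths used in the commands above.
PROCESS_ROOT="${PROCESS_ROOT:-/var/run/cloudera-scm-agent/process}"
ROLLBACK_ROOT="${ROLLBACK_ROOT:-/etc/hadoop}"

# Usage: backup_role_conf ROLE SUFFIX CLIENT_CONF_DIR
#   ROLE            NAMENODE, DATANODE, or SECONDARYNAMENODE
#   SUFFIX          namenode, datanode, or secondarynamenode
#   CLIENT_CONF_DIR /etc/hadoop/conf.cloudera.HDFS_service_name
backup_role_conf() {
  local role="$1" suffix="$2" client_conf="$3"
  local dest="${ROLLBACK_ROOT}/conf.rollback.${suffix}"
  local latest
  # Newest process directory whose name ends in -ROLE
  latest=$(ls -t1 "$PROCESS_ROOT" | grep -e "-${role}\$" | head -1)
  if [ -z "$latest" ]; then
    echo "no ${role} process directory found under ${PROCESS_ROOT}" >&2
    return 1
  fi
  mkdir -p "$dest" || return 1
  # dest must be an absolute path because the copy runs from another directory
  ( cd "${PROCESS_ROOT}/${latest}" && cp -rp * "$dest/" ) || return 1
  # Replace the role's log4j.properties with the client configuration copy
  rm -f "$dest/log4j.properties"
  cp -rp "${client_conf}/log4j.properties" "$dest/"
}
```

On a NameNode host the call would be `backup_role_conf NAMENODE namenode /etc/hadoop/conf.cloudera.HDFS_service_name`, with the service name substituted as in the commands above. The function fails early when no matching process directory exists, which the one-line `cd` in the original commands does not report clearly.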

Back Up Key Trustee Server and Clients

For the detailed procedure, see Backing Up Key Trustee Server and Clients.

Back Up HSM KMS

When HSM KMS runs in high availability mode and either of the two nodes fails, a role instance can be assigned to another node and federated into the service by the single remaining active node. In other words, you can bring a node that is part of the cluster, but that is not running HSM KMS role instances, into the service by making it an HSM KMS role instance (more specifically, an HSM KMS proxy role instance and an HSM KMS metastore role instance). Each node therefore acts as an online ("hot") backup of the other. In many cases, this is sufficient. However, if you want a manual ("cold") backup of the files necessary to restore the service from scratch, you can create that as well.

To create a backup, copy the /var/lib/hsmkp and /var/lib/hsmkp-meta directories on one or more of the nodes running HSM KMS role instances.

To restore from a backup, bring up a completely new instance of the HSM KMS service, and copy the /var/lib/hsmkp and /var/lib/hsmkp-meta directories from the backup onto the file system of the restored nodes before starting HSM KMS for the first time.

Back Up Navigator Encrypt

It is recommended that you back up the Navigator Encrypt configuration directory after installation, and again after any configuration updates.

To manually back up the Navigator Encrypt configuration directory (/etc/navencrypt):
```
$ zip -r --encrypt nav-encrypt-conf.zip /etc/navencrypt
```
The --encrypt option prompts you to create a password used to encrypt the zip file. This password is also required to decrypt the file. Ensure that you protect the password by storing it in a secure location.
Move the backup file (nav-encrypt-conf.zip) to a secure location.

Back Up HBase

Because the rollback procedure also rolls back HDFS, the data in HBase is also rolled back. In addition, HBase metadata stored in ZooKeeper is recovered as part of the ZooKeeper rollback procedure.

If your cluster is configured to use HBase replication, Cloudera recommends that you document all replication peers. If necessary (for example, because the HBase znode has been deleted), you can roll back HBase as part of the HDFS rollback without the ZooKeeper metadata. This metadata can be reconstructed in a fresh ZooKeeper installation, with the exception of the replication peers, which you must add back. For information on enabling HBase replication, listing peers, and adding a peer, see HBase Replication in the CDH 5 documentation.

Back Up YARN Queue Manager

Learn how to back up YARN Queue Manager for CDP Private Cloud Base versions 7.1.8 and below. These steps are necessary if you want to upgrade to CDP Private Cloud Base 7.1.9 from version 7.1.8 or below, because changes cannot be rolled back if a CDP Private Cloud Base 7.1.9 upgrade is unsuccessful.

In Cloudera Manager, navigate to Clusters > Hosts. Back up the configuration service database:
Locate the host that has the YARN Queue Manager Store running.
Find the location of the config-service database file by navigating to Cluster > QueueManager Service > Configurations tab > Scope, and click the YARN Queue Manager Store.
Locate the Location for config-service DB field. If the field is empty, then use the default location: Database Location -> /var/lib/hadoop-yarn/
Open an SSH terminal and enter the following command: ssh [***your_username***]@[***queue_manager_host_ip_address***]
Navigate to the directory where the configuration database file is stored: cd {Database Location}
Find the two database files with these names:
- config-service.mv.db
- config-service.trace.db
Both files are in the same location.
Secure copy the config-service.mv.db and config-service.trace.db files to the machine where the backups are to be stored. For example: scp -i ~/.ssh/{ssh_key} config-service.mv.db root@{hostName}:{Your_Backup_Folder}/config-service.mv.db
Use sha1sum to verify that the files on the current host and in the backup location have the same hash.
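
The hash check in the last step can be scripted so that a mismatch is reported as a non-zero exit status. This is a minimal sketch: the function names `sha1_of` and `verify_remote_copy` are hypothetical, and it assumes that `sha1sum` is installed on both hosts and that `ssh` to the backup host works without extra options.

```shell
#!/usr/bin/env bash
# Sketch only; function names are illustrative assumptions.

# Print only the SHA-1 hash of a local file.
# Usage: sha1_of FILE
sha1_of() {
  sha1sum "$1" | awk '{print $1}'
}

# Compare a local file with its copy on the backup host.
# Returns 0 only when both hashes are non-empty and identical.
# Usage: verify_remote_copy LOCAL_FILE USER@HOST REMOTE_FILE
verify_remote_copy() {
  local local_file="$1" remote="$2" remote_file="$3"
  local local_hash remote_hash
  local_hash=$(sha1_of "$local_file")
  remote_hash=$(ssh "$remote" sha1sum "$remote_file" | awk '{print $1}')
  [ -n "$local_hash" ] && [ "$local_hash" = "$remote_hash" ]
}
```

For example, `verify_remote_copy config-service.mv.db root@{hostName} {Your_Backup_Folder}/config-service.mv.db` checks the file copied in the previous step; repeat it for config-service.trace.db.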

Back Up Atlas

Backing up Atlas data is a two-step process: back up the Solr data, and then back up the HBase data.

Back Up Solr

Follow these instructions to back up your data in Solr. You must run these commands on a single Solr server.

curl -ivk "https://host1.example.com:8993/solr/admin/collections?action=BACKUP&name=vertex_index_bkp&collection=vertex_index&location=/tmp/"
curl -ivk "https://host1.example.com:8993/solr/admin/collections?action=BACKUP&name=edge_index_bkp&collection=edge_index&location=/tmp/"
curl -ivk "https://host1.example.com:8993/solr/admin/collections?action=BACKUP&name=fulltext_index_bkp&collection=fulltext_index&location=/tmp/"

Note:

Replace /tmp/ with the path of the backup directory that is to store the Solr backup. This path must be accessible to the solr service user.
If the cluster is kerberized, first run kinit against the Solr keytab and add "--negotiate -u :" after the -ivk flag in the above curl commands. Example: curl -ivk --negotiate -u : https://host1.example.com:8993…
If you have multiple Solr services, ensure that you create the Solr backup directories on all of them. Otherwise, the backup fails with a message indicating that shared storage should be used for the backup.
In some cases, if shards are present on different nodes, the backup might fail with the following message: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://host1.example.com:8993/solr: Failed to backup core=vertex_index_shard because org.apache.solr.common.SolrException: Directory to contain snapshots doesn't exist: file:///tmp/vertex_index. Note that Backup/Restore of a SolrCloud collection requires a shared file system mounted at the same path on all nodes!
Solr recommends using an HDFS repository for the backup in such a scenario. For more information, see Backup and Restore Solr Repositories.
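
The three curl commands above follow one URL pattern, so they can be generated in a loop over the Atlas collections. The sketch below assumes a non-kerberized cluster and uses the same `-ivk` flags as the commands above; the function names `solr_backup_url` and `backup_atlas_collections` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only; function names are illustrative assumptions.

# Build the Collections API BACKUP URL used in the commands above.
# Usage: solr_backup_url BASE_URL COLLECTION LOCATION
solr_backup_url() {
  local base="$1" collection="$2" location="$3"
  printf '%s/solr/admin/collections?action=BACKUP&name=%s_bkp&collection=%s&location=%s\n' \
    "$base" "$collection" "$collection" "$location"
}

# Request a backup of each Atlas collection in turn.
# curl exits 0 on HTTP error responses when used this way, so read the
# printed response for each collection to confirm that the backup succeeded.
# Usage: backup_atlas_collections BASE_URL LOCATION
backup_atlas_collections() {
  local base="$1" location="$2" collection
  for collection in vertex_index edge_index fulltext_index; do
    curl -ivk "$(solr_backup_url "$base" "$collection" "$location")" || return 1
  done
}
```

With the example host from the commands above, the call is `backup_atlas_collections https://host1.example.com:8993 /tmp/`.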

Back Up HBase

Follow the instructions to back up your data in HBase:

If the cluster is kerberized, run kinit against the HBase keytab.

Create HBase table snapshot:
1. hbase shell
  hbase> snapshot 'atlas_janus', 'atlas_janus_snapshot_<insert-date-here>'
  hbase> snapshot 'ATLAS_ENTITY_AUDIT_EVENTS', 'atlas_entity_audit_events_snap_<insert-date-here>'
  hbase> exit
  Note: You can use the Table Browser in Cloudera Manager to take a snapshot of the atlas_janus table.
Export Snapshot from server terminal:
1. hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 'atlas_janus_snapshot_<insert-date-here>' -copy-to /tmp/hbasebackup/
2. hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 'atlas_entity_audit_events_snap_<insert-date-here>' -copy-to /tmp/hbasebackup/

The contents of '/tmp/hbasebackup/' contain the table backup.
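
The snapshot and export steps above can also be run without an interactive shell session. The following sketch assumes that the installed HBase shell supports non-interactive mode (`hbase shell -n`) and that the date suffix you pass replaces `<insert-date-here>`; the function names `atlas_snapshot_commands` and `snapshot_and_export` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only; function names are illustrative assumptions.

# Print the two snapshot commands from the procedure above.
# Usage: atlas_snapshot_commands DATE_SUFFIX
atlas_snapshot_commands() {
  local suffix="$1"
  printf "snapshot 'atlas_janus', 'atlas_janus_snapshot_%s'\n" "$suffix"
  printf "snapshot 'ATLAS_ENTITY_AUDIT_EVENTS', 'atlas_entity_audit_events_snap_%s'\n" "$suffix"
}

# Take both snapshots, then export each one.
# Assumes 'hbase shell -n' (non-interactive mode) is available.
# Usage: snapshot_and_export DATE_SUFFIX COPY_TO
snapshot_and_export() {
  local suffix="$1" copy_to="$2"
  atlas_snapshot_commands "$suffix" | hbase shell -n || return 1
  hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot "atlas_janus_snapshot_${suffix}" -copy-to "$copy_to" || return 1
  hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot "atlas_entity_audit_events_snap_${suffix}" -copy-to "$copy_to"
}
```

For example, `snapshot_and_export "$(date +%Y%m%d)" /tmp/hbasebackup/` uses today's date as the suffix; the date format is an arbitrary choice. On a kerberized cluster, run kinit first as described above.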

If you see the following error:

ERROR snapshot.ExportSnapshot: Snapshot export failed org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:553)

To resolve the error, grant the necessary permission to the “hbase” user in the “all - path” policy of the cm_hdfs service in Ranger. Also ensure that the “hbase” user has permission in the “hbase-archive” policy.

Back Up Sqoop 2

If you are not using the default embedded Derby database for Sqoop 2, back up the database you have configured for Sqoop 2. Otherwise, back up the repository subdirectory of the Sqoop 2 metastore directory. This location is specified with the Sqoop 2 Server Metastore Directory property. The default location is: /var/lib/sqoop2. For this default location, Derby database files are located in /var/lib/sqoop2/repository.

Back Up Hue

Back up the app registry file on all hosts running the Hue Server role if you have installed CDP Private Cloud Base using RPM packages.

The app registry file (app.reg) is present in the /usr/lib/hue directory if you have installed Hue using the RPM package. It is a JSON file which contains the details of all apps that are used within Hue. If you have installed Hue using the parcels, then the app.reg file may not be present on your system, and you do not need to back it up.

Run the following command to back up the app.reg file for installations using RPM packages:

cp -rp /usr/lib/hue/app.reg /usr/lib/hue_backup/app.reg-CM-CDH

Back Up Impala

Back up the Impala profile logs and logs of catalogd, statestore, and coordinators that can help investigate performance regressions. Perform the following tasks to back up the Impala cluster:

Back up query profile logs of Impala. The logs are present in the /profiles directory under the coordinator log directory and the filename is similar to impala_profile_log_1.1-1715070416212.
To investigate performance regressions that are introduced after an upgrade, you will require the query profiles from the previous cluster (older version). You can use tools like impala-profile-tool to decode the query profiles into text profiles and then search for the runs before the upgrade for a comparison.
Back up logs of catalogd, statestore, and coordinators that can also be helpful in troubleshooting. By default, the Impala logs are stored at /var/log/catalogd/, /var/log/statestore/, and /var/log/impalad/.
The log folders can be configured in Cloudera Manager by navigating to Clusters > IMPALA and searching for the "Catalog Server Log Directory", "StateStore Log Directory", and "Impala Daemon Log Directory" properties.
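
One simple way to keep these logs is to archive the log directories into a single compressed file before the upgrade. This is a sketch, not a Cloudera-provided procedure: the function name `archive_impala_logs` is hypothetical, and the directories you pass should be the ones configured for your cluster.

```shell
#!/usr/bin/env bash
# Sketch only; archive_impala_logs is a hypothetical helper name.
# Usage: archive_impala_logs ARCHIVE_PATH LOG_DIR...
archive_impala_logs() {
  local archive="$1"
  shift
  # Create a gzip-compressed tar archive containing every given directory
  tar -czf "$archive" "$@"
}
```

With the default locations, a call could be `archive_impala_logs impala-logs-backup.tar.gz /var/log/catalogd /var/log/statestore /var/log/impalad`, where the archive name is only an example. Query profiles under the coordinator log directory are included when that directory is one of the arguments.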