Backing Up Cloudera Manager

This topic describes how to back up Cloudera Manager. Cloudera recommends that you perform these backup steps before upgrading. The backups allow you to roll back your Cloudera Manager upgrade if needed.

Collect Information for Backing Up Cloudera Manager

  1. Log in to the Cloudera Manager Server host.
    ssh my_cloudera_manager_server_host
  2. Collect database information by running the following command:
    cat /etc/cloudera-scm-server/db.properties
    For example:
    ...
    com.cloudera.cmf.db.type=...
    com.cloudera.cmf.db.host=database_hostname:database_port
    com.cloudera.cmf.db.name=scm
    com.cloudera.cmf.db.user=scm
    com.cloudera.cmf.db.password=SOME_PASSWORD
  3. Collect information (host name, port number, database name, user name and password) for the following databases.
    • Reports Manager
    • Activity Monitor

    You can find the database information by using the Cloudera Manager Admin Console. Go to Clusters > Cloudera Management Service > Configuration and select the Database category. You may need to contact your database administrator to obtain the passwords.

  4. Find the hosts where the Service Monitor, Host Monitor, and Event Server roles are running. Go to Clusters > Cloudera Management Service > Instances and note which hosts are running these roles.
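
The values collected in step 2 are reused in the database backup commands later in this topic, so it can help to read them into shell variables instead of copying them by hand. The following is a minimal sketch, not part of the Cloudera tooling: `read_db_prop` is a hypothetical helper name, and it assumes each property is written as `key=value` with no spaces around the `=`, as in the example output above.

```shell
# Hypothetical helper: print the value of one key from a Java-style
# properties file such as /etc/cloudera-scm-server/db.properties.
# Assumes lines of the form key=value with no surrounding whitespace.
read_db_prop() {
  file="$1"
  key="$2"
  # Match only lines that start with "key=" and print the text after the "=".
  # Comment lines never match because the key must begin the line.
  awk -v k="$key" 'index($0, k "=") == 1 { print substr($0, length(k) + 2); exit }' "$file"
}

# Example use on the Cloudera Manager Server host:
#   DB_PROPS=/etc/cloudera-scm-server/db.properties
#   DB_NAME=$(read_db_prop "$DB_PROPS" com.cloudera.cmf.db.name)
#   DB_USER=$(read_db_prop "$DB_PROPS" com.cloudera.cmf.db.user)
```

The helper prints nothing when the key is absent, so an empty variable is a signal to inspect the file manually.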

Back Up Cloudera Manager Agent

Back up the following Cloudera Manager Agent files on all hosts:

  • Create a top-level backup directory.
    export CM_BACKUP_DIR="`date +%F`-CM"
    echo $CM_BACKUP_DIR
    mkdir -p $CM_BACKUP_DIR
  • Back up the Agent directory and the runtime state.
    
    sudo -E tar -cf $CM_BACKUP_DIR/cloudera-scm-agent.tar --exclude='*.sock' /etc/cloudera-scm-agent /etc/default/cloudera-scm-agent /var/run/cloudera-scm-agent /var/lib/cloudera-scm-agent
  • Back up the existing repository directory.
    RHEL / CentOS
    sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/yum.repos.d
    SLES
    sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/zypp/repos.d
    Ubuntu
    sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/apt/sources.list.d
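
Because these steps run on every host, a script that selects the repository directory automatically may save some effort on mixed-OS clusters. The sketch below is one possible approach, not a Cloudera-provided tool: `detect_repo_dir` is a hypothetical name, and it simply returns the first of the three directories listed above that exists. If a host has more than one of them, the first match wins, so check the result before relying on it.

```shell
# Hypothetical helper: print the first package repository directory that
# exists on this host. The optional argument is a root prefix intended for
# testing; on a real host, call the function with no arguments.
detect_repo_dir() {
  root="${1:-}"
  for dir in /etc/yum.repos.d /etc/zypp/repos.d /etc/apt/sources.list.d; do
    if [ -d "${root}${dir}" ]; then
      printf '%s\n' "${root}${dir}"
      return 0
    fi
  done
  echo "No known repository directory found" >&2
  return 1
}

# Example use:
#   REPO_DIR=$(detect_repo_dir) && sudo -E tar -cf "$CM_BACKUP_DIR/repository.tar" "$REPO_DIR"
```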

Back Up the Cloudera Management Service

  1. Stop the Cloudera Management Service.
    1. Log in to the Cloudera Manager Admin Console.
    2. Select Clusters > Cloudera Management Service.
    3. Select Actions > Stop.
  2. On the host where the Service Monitor role is configured to run, back up the following directory:
    sudo cp -rp /var/lib/cloudera-service-monitor /var/lib/cloudera-service-monitor-`date +%F`-CM
  3. On the host where the Host Monitor role is configured to run, back up the following directory:
    sudo cp -rp /var/lib/cloudera-host-monitor /var/lib/cloudera-host-monitor-`date +%F`-CM
  4. On the host where the Event Server role is configured to run, back up the following directory:
    sudo cp -rp /var/lib/cloudera-scm-eventserver /var/lib/cloudera-scm-eventserver-`date +%F`-CM
  5. Start the Cloudera Management Service.
    1. Log in to the Cloudera Manager Admin Console.
    2. Select Clusters > Cloudera Management Service.
    3. Select Actions > Start.
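
One detail of the `cp -rp` commands in steps 2 through 4 is worth guarding against: if the dated target directory already exists (for example, because the step was run twice on the same day), `cp` copies the source into the existing target as a nested subdirectory instead of replacing it. The following sketch wraps the copy with an existence check. `backup_dir_copy` is a hypothetical helper, and it calls `cp` without `sudo`, so it assumes a shell that already has permission to read the source and write next to it.

```shell
# Hypothetical helper: copy a directory to a sibling named <dir>-<date>-CM,
# refusing to run if the source is missing or the target already exists.
backup_dir_copy() {
  src="${1%/}"                       # drop a trailing slash, if any
  dest="${src}-$(date +%F)-CM"
  if [ ! -d "$src" ]; then
    echo "Source directory not found: $src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "Backup target already exists: $dest" >&2
    return 1
  fi
  # Same copy as in the steps above; print the target path on success.
  cp -rp "$src" "$dest" && printf '%s\n' "$dest"
}

# Example use (from a root shell):
#   backup_dir_copy /var/lib/cloudera-service-monitor
```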

Back Up Cloudera Navigator Data

  1. Make sure a purge task has run recently to clear stale and deleted entities.
    • To see when the last purge tasks ran, open the Cloudera Navigator console: from the Cloudera Manager Admin Console, go to Clusters > Cloudera Navigator, then select Administration > Purge Settings.
    • If a purge hasn't run recently, run it by editing the Purge schedule on the same page.
    • Set the purge process options to clear out as much of the backlog of data as you can tolerate for your upgraded system. See Managing Metadata Storage with Purge.
  2. Stop the Navigator Metadata Server.
    1. Go to Clusters > Cloudera Management Service > Instances.
    2. Select Navigator Metadata Server.
    3. Click Actions for Selected > Stop.
  3. Back up the Cloudera Navigator Solr storage directory.
    sudo cp -rp /var/lib/cloudera-scm-navigator /var/lib/cloudera-scm-navigator-`date +%F`-CM
  4. If you are using an Oracle database for audit, in SQL*Plus, ensure that the following additional privileges are set:
    
      GRANT EXECUTE ON sys.dbms_crypto TO nav;
      GRANT CREATE VIEW TO nav;
    where nav is the user of the Navigator Audit Server database.

Stop Cloudera Manager Server & Cloudera Management Service

  1. Stop the Cloudera Management Service.
    1. Log in to the Cloudera Manager Admin Console.
    2. Select Clusters > Cloudera Management Service.
    3. Select Actions > Stop.
  2. Log in to the Cloudera Manager Server host.
    ssh my_cloudera_manager_server_host
  3. Stop the Cloudera Manager Server.
    sudo systemctl stop cloudera-scm-server

Back Up the Cloudera Manager Databases

  1. Back up the Cloudera Manager Server database. Run the command for your database type, replacing the placeholders with the values returned from the db.properties file:
    MySQL
    
    mysqldump --databases database_name --host=database_hostname --port=database_port -u user_name -p > $HOME/database_name-backup-`date +%F`-CM.sql
    PostgreSQL/Embedded
    
    pg_dump -h database_hostname -U user_name -W -p database_port database_name > $HOME/database_name-backup-`date +%F`-CM.sql
    Oracle
    Work with your database administrator to ensure databases are properly backed up.

    For more information about backing up databases, see Backing up databases.

  2. Back up all other Cloudera Manager databases. Use the database information that you collected in a previous step. You may need to contact your database administrator to obtain the passwords.
    These databases can include the following:
    • Cloudera Manager Server - Contains all the information about services you have configured and their role assignments, all configuration history, commands, users, and running processes. This relatively small database (< 100 MB) is the most important to back up.
    • Oozie Server - Contains Oozie workflow, coordinator, and bundle data. Can grow very large. (Only available when installing CDH 5 or CDH 6 clusters.)
    • Sqoop Server - Contains entities such as the connector, driver, links and jobs. Relatively small. (Only available when installing CDH 5 or CDH 6 clusters.)
    • Reports Manager - Tracks disk utilization and processing activities over time. Medium-sized.
    • Hive Metastore Server - Contains Hive metadata. Relatively small.
    • Hue Server - Contains user account information, job submissions, and Hive queries. Relatively small.
    • Sentry Server - Contains authorization metadata. Relatively small.
    • Cloudera Navigator Audit Server - Contains auditing information. In large clusters, this database can grow large. (Only available when installing CDH 5 or CDH 6 clusters.)
    • Cloudera Navigator Metadata Server - Contains authorization, policies, and audit report metadata. Relatively small. (Only available when installing CDH 5 or CDH 6 clusters.)
    • DAS PostgreSQL server - Contains Hive and Tez event logs and DAG information. Can grow very large.
    • Ranger Admin - Contains administrative information such as Ranger users, groups, and access policies. Medium-sized.
    • Streaming Components:
      • Schema Registry - Contains the schemas and their metadata, including all versions and branches. You can use MySQL, PostgreSQL, or Oracle.
      • Streams Messaging Manager Server - Contains Kafka metadata, stores metrics, and alert definitions. Relatively small.

    Run the command for your database type to back up each database, replacing the placeholders with the actual values:

    MySQL
    
    mysqldump --databases database_name --host=database_hostname --port=database_port -u database_username -p > $HOME/database_name-backup-`date +%F`-CM.sql
    PostgreSQL/Embedded
    
    pg_dump -h database_hostname -U database_username -W -p database_port database_name > $HOME/database_name-backup-`date +%F`-CM.sql
    Oracle
    Work with your database administrator to ensure databases are properly backed up.
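
With several databases to dump, filling in the placeholders by hand is error-prone. The sketch below prints (but does not run) the command from the steps above for one database, splitting a `host:port` value of the kind shown for `com.cloudera.cmf.db.host`. `print_dump_command` is a hypothetical helper, and the type strings `mysql` and `postgresql` are assumptions chosen for illustration; check the actual value of `com.cloudera.cmf.db.type` in your db.properties file.

```shell
# Hypothetical helper: print the dump command for one database.
# Arguments: db_type host:port db_name db_user
# The type strings "mysql" and "postgresql" are illustrative assumptions.
print_dump_command() {
  db_type="$1"
  host_port="$2"
  db_name="$3"
  db_user="$4"
  case "$host_port" in
    *:*) ;;
    *) echo "Expected host:port, got '$host_port'" >&2; return 1 ;;
  esac
  db_host="${host_port%%:*}"      # text before the first ":"
  db_port="${host_port##*:}"      # text after the last ":"
  # Keep $HOME literal so it expands when the printed command is run.
  out="\$HOME/${db_name}-backup-$(date +%F)-CM.sql"
  case "$db_type" in
    mysql)
      echo "mysqldump --databases $db_name --host=$db_host --port=$db_port -u $db_user -p > $out" ;;
    postgresql)
      echo "pg_dump -h $db_host -U $db_user -W -p $db_port $db_name > $out" ;;
    *)
      echo "No dump command for database type '$db_type'; consult your database administrator" >&2
      return 1 ;;
  esac
}

# Example use:
#   print_dump_command mysql database_hostname:3306 scm scm
```

Printing the command instead of executing it leaves room to review it, and both tools still prompt for the password interactively when the command is run.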

Back Up Cloudera Manager Server

  1. Log in to the Cloudera Manager Server host.
    ssh my_cloudera_manager_server_host
  2. Create a top-level backup directory.
    export CM_BACKUP_DIR="`date +%F`-CM"
    echo $CM_BACKUP_DIR
    mkdir -p $CM_BACKUP_DIR
  3. Back up the Cloudera Manager Server directories:
    
    sudo -E tar -cf $CM_BACKUP_DIR/cloudera-scm-server.tar /etc/cloudera-scm-server /etc/default/cloudera-scm-server
  4. Back up the existing repository directory.
    RHEL / CentOS
    sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/yum.repos.d
    SLES
    sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/zypp/repos.d
    Ubuntu
    sudo -E tar -cf $CM_BACKUP_DIR/repository.tar /etc/apt/sources.list.d
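
Before moving on to an upgrade, it may be worth confirming that each archive is readable and contains the directories you expect. The following is a minimal sketch under stated assumptions: `verify_archive` is a hypothetical helper, and it relies on `tar` storing member names without the leading `/`, which is the default behavior of GNU tar. It performs a simple prefix match on the archive listing, not a content comparison.

```shell
# Hypothetical helper: check that a tar archive can be listed and contains
# at least one member under each expected path.
# Usage: verify_archive ARCHIVE PATH...
verify_archive() {
  archive="$1"
  shift
  # Listing the archive fails if the file is missing or corrupt.
  listing=$(tar -tf "$archive") || return 1
  for expected in "$@"; do
    member="${expected#/}"         # tar stores members without the leading "/"
    if ! printf '%s\n' "$listing" |
         awk -v m="$member" 'index($0, m) == 1 { found = 1 } END { exit !found }'; then
      echo "Missing from $archive: $expected" >&2
      return 1
    fi
  done
}

# Example use:
#   verify_archive "$CM_BACKUP_DIR/cloudera-scm-server.tar" \
#     /etc/cloudera-scm-server /etc/default/cloudera-scm-server
```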

(Optional) Start Cloudera Manager Server & Cloudera Management Service

Start the Cloudera Manager Server and the Cloudera Management Service.

If you will be immediately upgrading Cloudera Manager, skip this step and continue with Upgrading the Cloudera Manager Server.

  1. Log in to the Cloudera Manager Server host.
    ssh my_cloudera_manager_server_host
  2. Start the Cloudera Manager Server.
    sudo systemctl start cloudera-scm-server
  3. Start the Cloudera Management Service.
    1. Log in to the Cloudera Manager Admin Console.
    2. Select Clusters > Cloudera Management Service.
    3. Select Actions > Start.