Required Databases

The following components all require databases: Cloudera Manager Server, Oozie Server, Sqoop Server, Reports Manager, Hive Metastore Server, Hue Server, and Ranger.

The type of data contained in the databases and their relative sizes are as follows:

  • Cloudera Manager Server - Contains all the information about services you have configured and their role assignments, all configuration history, commands, users, and running processes. This relatively small database (< 100 MB) is the most important to back up.
  • Oozie Server - Contains Oozie workflow, coordinator, and bundle data. Can grow very large. (Only available when installing CDH 5 or CDH 6 clusters.)
  • Sqoop Server - Contains entities such as the connector, driver, links and jobs. Relatively small. (Only available when installing CDH 5 or CDH 6 clusters.)
  • Reports Manager - Tracks disk utilization and processing activities over time. Medium-sized.
  • Hive Metastore Server - Contains Hive metadata. Relatively small.
  • Hue Server - Contains user account information, job submissions, and Hive queries. Relatively small.
  • YARN Queue Manager - If you install CDP 7.1.9 CHF 2 or later, no database changes are required; YARN Queue Manager continues to use your current embedded database. If you install CDP 7.1.9 or CDP 7.1.9 CHF 1, you must use a PostgreSQL database, which stores information about the queues created by YARN Queue Manager. If you update from an earlier CDP 7.1.9 version to CHF 2, CDP continues to use your PostgreSQL database and does not migrate back to the embedded database.
  • Sentry Server - Contains authorization metadata. Relatively small.
  • Cloudera Navigator Audit Server - Contains auditing information. In large clusters, this database can grow large. (Only available when installing CDH 5 or CDH 6 clusters.)
  • Cloudera Navigator Metadata Server - Contains authorization, policies, and audit report metadata. Relatively small. (Only available when installing CDH 5 or CDH 6 clusters.)
  • Ranger Admin - Contains administrative information such as Ranger users, groups, and access policies. Medium-sized.
  • Ranger KMS database - Stores the encrypted keys.
  • Streaming Components:
    • Schema Registry - Contains the schemas and their metadata, including all versions and branches. You can use MySQL, PostgreSQL, or Oracle.
    • Streams Messaging Manager Server - Contains Kafka metadata, stores metrics, and alert definitions. Relatively small.
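
The YARN Queue Manager rule in the list above depends on both the installed CHF level and the level you started from, so it can help to see it as a small decision function. The following is only an illustrative sketch in Python, not a Cloudera tool: the function name, its parameters, and the convention that CHF level 0 means base CDP 7.1.9 are all assumptions made for this example, and it covers CDP 7.1.9 releases only.

```python
from typing import Optional


def queue_manager_database(installed_chf: int,
                           upgraded_from_chf: Optional[int] = None) -> str:
    """Return the database YARN Queue Manager is expected to use on CDP 7.1.9.

    Hypothetical helper for illustration only.
    installed_chf:     CHF level now installed (assumed convention: 0 = base 7.1.9).
    upgraded_from_chf: CHF level of the earlier 7.1.9 installation,
                       or None for a fresh installation.
    """
    if installed_chf < 0:
        raise ValueError("CHF level cannot be negative")
    if upgraded_from_chf is not None and not 0 <= upgraded_from_chf <= installed_chf:
        raise ValueError("upgraded_from_chf must be between 0 and installed_chf")

    # The level at which Queue Manager was first installed decides the database:
    # an existing PostgreSQL database is never migrated back to the embedded one.
    original = installed_chf if upgraded_from_chf is None else upgraded_from_chf

    # CHF 2 or later: embedded database. Base 7.1.9 or CHF 1: PostgreSQL.
    return "embedded" if original >= 2 else "postgresql"


if __name__ == "__main__":
    print(queue_manager_database(2))                       # fresh install of CHF 2
    print(queue_manager_database(1))                       # fresh install of CHF 1
    print(queue_manager_database(2, upgraded_from_chf=0))  # 7.1.9 updated to CHF 2
```

The third call reflects the update case described in the list: a cluster that started on base CDP 7.1.9 keeps its PostgreSQL database after moving to CHF 2.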

The Host Monitor and Service Monitor services use local disk-based datastores.

The JDBC connector for your database must be installed on the hosts where you assign the Activity Monitor and Reports Manager roles.

For instructions on installing and configuring databases for Cloudera Manager, Runtime, and other managed services, see the instructions for the type of database you want to use.