Required Databases

The following components all require databases: Cloudera Manager Server, Oozie Server, Sqoop Server, Reports Manager, Hive Metastore Server, Hue Server, DAS server, and Ranger.

The type of data contained in the databases and their relative sizes are as follows:

  • Cloudera Manager Server - Contains all the information about services you have configured and their role assignments, all configuration history, commands, users, and running processes. This relatively small database (< 100 MB) is the most important to back up.
  • Oozie Server - Contains Oozie workflow, coordinator, and bundle data. Can grow very large.
  • Sqoop Server - Contains entities such as the connector, driver, links and jobs. Relatively small.
  • Reports Manager - Tracks disk utilization and processing activities over time. Medium-sized.
  • Hive Metastore Server - Contains Hive metadata. Relatively small.
  • Hue Server - Contains user account information, job submissions, and Hive queries. Relatively small.
  • DAS PostgreSQL server - Contains Hive and Tez event logs and DAG information. Can grow very large.
  • Ranger Audit - Apache Ranger uses Apache Solr to store audit logs and provides UI searching through the audit logs. The size of the audit database depends on the events in the environment. You should monitor and manage the auditing destinations. 1TB of space is recommended.
  • Ranger Admin - Contains administrative information such as Ranger users, groups, and access policies. Medium-sized.

The Host Monitor and Service Monitor services use local disk-based datastores.

The JDBC connector for your database must be installed on the host where you assign the Reports Manager role.

For instructions on installing and configuring databases for Cloudera Manager, Runtime, and other managed services, see the instructions for the type of database you want to use.