Creating a Database Catalog

When you activate an environment in CDW, a default Database Catalog is created. You can create a new Database Catalog from the Data Warehouse UI.

You can choose one of the following Java heap sizes for your Database Catalog workload:
  • Small (default), 8 GB
  • Medium, 16 GB
  • Large, 24 GB

The size you configure for the Database Catalog determines the container size of the Hive Metastore (HMS), which stores your workload metadata. To avoid unnecessary cloud expenses, do not increase the size unless you experience Java heap issues.
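
The sizing guidance above amounts to "pick the smallest size that covers the heap you need". The following Python sketch illustrates that rule only. The function name smallest_size_for and the idea of passing an estimated heap requirement are assumptions made for illustration; they are not part of CDW, which exposes the size as a UI selection.

```python
# Java heap per Database Catalog size, in GB, as listed in this topic.
HEAP_GB = {"Small": 8, "Medium": 16, "Large": 24}


def smallest_size_for(required_heap_gb: float) -> str:
    """Return the smallest size whose heap covers the requirement.

    Hypothetical helper: CDW does not provide this function.
    """
    # Walk the sizes from smallest heap to largest heap.
    for name, heap in sorted(HEAP_GB.items(), key=lambda item: item[1]):
        if required_heap_gb <= heap:
            return name
    raise ValueError(
        f"{required_heap_gb} GB exceeds the largest heap ({max(HEAP_GB.values())} GB)"
    )
```

Starting from the smallest size that fits keeps cloud costs down, which matches the advice to move up only when Java heap issues appear.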

If you have the required Ranger permissions, you can access any objects or data sets created in the Data Hub or the Data Engineering clusters from CDW Virtual Warehouses, and vice versa.

The CDW service sets up the Kubernetes cluster, which provides the computing resources for the Database Catalog. The CDW service uses the existing data lake that was set up for the environment, including all data, metadata, and security.

The following procedure describes how to create a Database Catalog to replace your default Database Catalog.

  1. Navigate to Cloudera Data Warehouse Overview.
  2. Click Database Catalogs, and then, in Create, click See More > New Database Catalog.
  3. In Name, specify a Database Catalog name.
  4. In Environments, select the name of your environment.

    If you do not see the environment you want in the drop-down list, you might need to activate the environment.

  5. In Select Size, accept the default (Small), or select Medium or Large.
  6. In Database Catalog Image Version, select the Hive version for constructing the Hive Metastore.
  7. Optionally, turn on Load Demo Data to use sample airline data in Hue.
  8. Click Create Database Catalog.
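
Before you start the procedure, it can help to write down the values you plan to enter. The following Python sketch models the fields from steps 3 through 7 as a simple record and checks them for obvious gaps. It is an illustration under stated assumptions: the class DatabaseCatalogForm, its field names, and the example values are hypothetical and do not correspond to a CDW API.

```python
from dataclasses import dataclass
from typing import List

# Sizes offered in the Select Size field, per this topic.
VALID_SIZES = ("Small", "Medium", "Large")


@dataclass
class DatabaseCatalogForm:
    """Hypothetical record of the values entered in steps 3 to 7."""

    name: str                      # step 3: Name
    environment: str               # step 4: Environments
    image_version: str             # step 6: Database Catalog Image Version
    size: str = "Small"            # step 5: Select Size (Small is the default)
    load_demo_data: bool = False   # step 7: Load Demo Data (optional)

    def problems(self) -> List[str]:
        """Return a list of issues found; an empty list means none."""
        issues = []
        if not self.name.strip():
            issues.append("Name is required")
        if not self.environment.strip():
            issues.append("Environment is required")
        if self.size not in VALID_SIZES:
            issues.append(f"Size must be one of {VALID_SIZES}")
        if not self.image_version.strip():
            issues.append("Image version is required")
        return issues
```

For example, DatabaseCatalogForm(name="sales_catalog", environment="my-env", image_version="example-version").problems() returns an empty list, where all three argument values are placeholders.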