Multiple Database Catalogs for multi-tenancy

Your organization can use Cloudera Data Warehouse (CDW) to create and host proprietary data warehouses and data marts. Learn how to create multiple default Database Catalogs for a secure, multi-tenant environment, so that you can provide a separate Hue instance to each tenant for business analytics.

Database Catalogs are Hive metastore (HMS) instances that share the Hive metastore database used by your CDP environment. An environment can have multiple Database Catalogs, each of which includes references to the cloud storage for data. When you activate an environment from the Data Warehouse UI, a default Database Catalog is created (format: environment_name-default). You can add more default Database Catalogs if you want a standalone data warehouse without any data from the tables that are in the environment.

The default Database Catalog activated from the Data Warehouse UI shares the Hive metastore database with the HMS instance in the Data Hub cluster. You can access any objects or data sets created in the Data Mart or Data Engineering clusters from CDW Virtual Warehouses, and vice versa. The Hue database contains saved queries and query history that are stored in the Database Catalog.

For security, privacy, and resource isolation, the data sets, queries, job history, and other artifacts of each tenant are isolated from those of other tenants. The following architecture diagram shows the shared Hive metastore database.

If your data is stored on cloud object storage, such as S3 or ABFS, access to the shared data is controlled at the database level for each tenant through a common Ranger repository. Each tenant has dedicated databases, but can also share data with other tenants if required.
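
The database-level access model described above can be illustrated with a minimal sketch. This is a conceptual model only, not the Ranger API: the class TenantPolicy, the function is_allowed, and the tenant and database names (tenant_a, sales_a, reference_data, and so on) are all hypothetical and chosen for illustration. In a real deployment, these rules would be defined as policies in the common Ranger repository.

```python
from dataclasses import dataclass, field


@dataclass
class TenantPolicy:
    """Hypothetical stand-in for a tenant's database-level access rules."""
    tenant: str
    # Databases dedicated to this tenant (assumed names).
    own_databases: set[str] = field(default_factory=set)
    # Databases that other tenants have shared with this tenant.
    shared_databases: set[str] = field(default_factory=set)

    def can_access(self, database: str) -> bool:
        # Access is decided per database, not per table or file.
        return database in self.own_databases or database in self.shared_databases


# One policy set for all tenants, mirroring the idea of a common repository.
policies = {
    "tenant_a": TenantPolicy("tenant_a", {"sales_a"}, {"reference_data"}),
    "tenant_b": TenantPolicy("tenant_b", {"sales_b"}, {"reference_data"}),
}


def is_allowed(tenant: str, database: str) -> bool:
    """Return True if the tenant may access the database; unknown tenants are denied."""
    policy = policies.get(tenant)
    return policy is not None and policy.can_access(database)
```

In this sketch, tenant_a can reach its dedicated database sales_a and the shared database reference_data, but not sales_b, which belongs to tenant_b. Denying unknown tenants by default reflects the isolation goal stated earlier; it is a design assumption of the sketch, not a documented Ranger behavior.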