Adding a new Virtual Warehouse

A Virtual Warehouse is an instance of compute resources that is equivalent to a cluster. This topic describes how to create a new Virtual Warehouse in Cloudera Data Warehouse (CDW) Public Cloud and choose a size that suits your use case.

A Virtual Warehouse provides access to the data in tables and views in the data lake that corresponds to a specific Database Catalog. A Virtual Warehouse can access only the Database Catalog that it has been configured to use.

When you create a Virtual Warehouse, a cluster is created in your cloud provider account. This cluster has two buckets. One bucket is used for managed data and the other is used for external data.

Before you create a new Virtual Warehouse, determine the number of concurrent queries or users that it must serve during peak periods. This information helps you determine what size of Virtual Warehouse you need. Choose the size based on the number of executors you typically use for clusters in an on-premises deployment. Also consider the complexity of your queries and the size of the data sets that they access. Larger warehouses with more executors can cache more data, which improves performance.

Virtual Warehouse sizes you can choose from:

Virtual Warehouse Size    Number of Executors
Custom                    Enter a value between 1 and 100
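
As an illustration of the Custom size constraint in the table above, the following is a minimal Python sketch of a range check. The function name `validate_custom_executors` is hypothetical and is not part of any Cloudera tool. The sketch also assumes that the range of 1 to 100 is inclusive at both ends.

```python
def validate_custom_executors(value: int) -> int:
    """Return value if it is an acceptable Custom executor count.

    Assumption: the documented range of 1 to 100 is inclusive.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("executor count must be an integer")
    if not 1 <= value <= 100:
        raise ValueError("Custom size must be between 1 and 100 executors")
    return value
```

A check like this can be run on planned executor counts before you enter them in the Size field, for example `validate_custom_executors(25)`.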

Required role: DWAdmin

  1. Log in to the CDP web interface and navigate to the Data Warehouse service.
  2. Click Virtual Warehouses.
  3. In Virtual Warehouses, click Add New.
  4. In New Virtual Warehouse, specify a Name, select the Type (Hive or Impala), and select the Database Catalog to query.
  5. In AWS environments only, accept the default availability zone, or select an availability zone, such as us-east-1c.

    The default behavior is to randomly select an availability zone from the list of configured availability zones for the associated environment. In AWS environments, all compute resources run in this zone. Selection of the zone is not an option in Azure environments.

  6. Select the User Groups that can access endpoints, enter keys and values for Tagging the Virtual Warehouse, and select the Size.
  7. After you specify a Size, configure the auto-scaling thresholds or accept the defaults.
  8. Select the Hive Image Version or the Impala Image Version of Cloudera Data Warehouse that you want to use, or accept the default version (latest) at the top of the drop-down menu.
    The corresponding Hue image version is automatically selected. However, you can select a different image version from the Hue Image Version drop-down menu.

  9. Click Create to create the new Virtual Warehouse.
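
The choices made in the steps above can be recorded and checked before you open the UI. The following Python sketch is only a planning aid under stated assumptions: `VirtualWarehouseSpec`, `resolve_availability_zone`, and all field names are hypothetical and do not correspond to a Cloudera API. It mirrors the documented behavior that an availability zone is either selected explicitly or picked at random from the zones configured for the environment, and that zone selection applies to AWS environments only.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional

VALID_TYPES = ("hive", "impala")


@dataclass
class VirtualWarehouseSpec:
    """Hypothetical record of the values entered in the New Virtual Warehouse dialog."""
    name: str
    vw_type: str                      # "hive" or "impala"
    database_catalog: str
    custom_executors: int             # Custom size, assumed inclusive range 1 to 100
    user_groups: List[str] = field(default_factory=list)
    tags: Dict[str, str] = field(default_factory=dict)
    availability_zone: Optional[str] = None  # None means accept the default
    image_version: Optional[str] = None      # None means accept the default (latest)

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("a Name is required")
        self.vw_type = self.vw_type.lower()
        if self.vw_type not in VALID_TYPES:
            raise ValueError("Type must be Hive or Impala")
        if not self.database_catalog:
            raise ValueError("a Database Catalog is required")
        if not 1 <= self.custom_executors <= 100:
            raise ValueError("Custom size must be between 1 and 100 executors")


def resolve_availability_zone(spec, cloud, configured_zones, rng=None):
    """Return the zone the warehouse would run in, or None outside AWS."""
    if cloud.lower() != "aws":
        # Zone selection is not an option in Azure environments.
        return None
    if not configured_zones:
        raise ValueError("the environment has no configured availability zones")
    if spec.availability_zone is not None:
        if spec.availability_zone not in configured_zones:
            raise ValueError("zone is not configured for this environment")
        return spec.availability_zone
    # Default behavior: a random pick from the configured zones.
    return (rng or random).choice(configured_zones)
```

For example, `VirtualWarehouseSpec(name="sales-vw", vw_type="Impala", database_catalog="default", custom_executors=10)` passes validation, and `resolve_availability_zone` with `cloud="aws"` then returns one of the configured zones. The names and values in this example are placeholders.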