Key features of Cloudera Data Warehouse Public Cloud

The Cloudera Data Warehouse (CDW) service provides data warehouses and data marts that are automatically configured and isolated. CDW optimizes existing workloads when you move to the cloud. You scale resources up and down to meet your demands, and save costs by suspending and resuming resources automatically. Data warehouses and data marts comply with your data lake security.

Automatically configured and isolated

Each data warehouse and data mart can be automatically configured for you by Cloudera Data Warehouse service, but you can adjust some settings to suit your needs. Individual warehouses and data marts are completely isolated, ensuring that the right users have access to only their data and eliminating the problem of "noisy neighbors." Noisy neighbors are workloads that monopolize system resources and interfere with the queries from other tenants. With Cloudera Data Warehouse, you can easily offload noisy neighbor workloads to their own Virtual Warehouse instance so other tenants have access to enough compute resources for their workloads to complete and meet their SLAs.

This capability to isolate individual warehouses and data marts is equally useful for "VIP workloads." VIP workloads are crucial workloads that must have resources to complete immediately and as quickly as possible without waiting in a queue. You can run these VIP workloads in their own warehouse or data mart to ensure they get the resources they need to complete as soon as possible.

Optimized for your workloads

Data warehouses and data marts are automatically optimized for your workloads. This includes pre-configuring the software and creating the different caching layers, which means you do not need to engage in complex capacity planning or tuning. You create a Virtual Warehouse that specifies a SQL engine:

  • Hive for data warehouses that support complex reports and enterprise dashboards.
  • Impala for data marts that support interactive, ad-hoc analysis.

You choose the Virtual Warehouse instance size and adjust thresholds for auto-scaling.


Auto-scaling enables both scaling up and scaling down of Virtual Warehouse instances so they can meet your varying workload demands and save costs on cloud resources when they are not needed.

Auto-scaling provides the following benefits:

  • Service availability: Clusters are ready to accept queries "24 x 7."
  • Auto-scaling based on query wait-time: Queries start executing within the number of seconds that you specify and cluster resources are added or shut down to meet demand.
  • Auto-scaling based on number of concurrent queries running on the system: "Infinite scaling" means that the number of concurrent queries can go from 10 to 100 in minutes.
  • Cost guarantee: You can configure auto-scaling upper limits, which determine how large a compute cluster can grow. Since compute costs increase as cluster size increases, having a way to configure upper limits gives administrators a method to stay within a budget.

Auto-suspend and resume

You have the capability to set an AutoSuspend Timeout when creating a Virtual Warehouse. This sets the maximum time a Virtual Warehouse idles before shutting down. For example, if you set this to 60 seconds, then if the Virtual Warehouse is idle for 60 seconds, it suspends itself so you do not have to pay for unused compute resources. The first time a new query is run against an auto-suspend Virtual Warehouse, it restarts. This feature helps you maintain a tight control on your cloud spend while ensuring availability to run your workloads.

Security compliance

Your Database Catalogs and Virtual Warehouses automatically inherit the same security restrictions that are applicable to your CDP environment. There is no need to specify the security setup again for each Database Catalog or Virtual Warehouse. A link to information about security in CDP is provided at the bottom of this page. It discusses integration with Apache Knox and your LDAP provider which uses FreeIPA Identity Management.

The following security controls are inherited from your CDP environment:

  • Authentication: Ensures that all users have proven their identity before accessing the Cloudera Data Warehouse service or any created Database Catalogs or Virtual Warehouses.
  • Authorization: Ensures that only users who have been granted adequate permissions are able to access the Cloudera Data Warehouse service and the data stored in the tables.
  • Dynamic column masking: If rules are set up to mask certain columns when queries run, based on the user executing the query, then these rules also apply to queries executed in the Virtual Warehouses.
  • Row-level filtering: If rules are set up to filter certain rows from being returned in the query results, based on the user executing the query, then these same rules also apply to queries executed in the Virtual Warehouses.