Impala warehouse configuration options
You can set several parameters when you configure an Impala Virtual Warehouse
in Cloudera Data Warehouse (CDW) on Private Cloud. These parameters help you provision
high-performance Impala Virtual Warehouses and make optimal use of compute resources.

Workload Aware Auto-Scaling in Impala (Preview)
Workload Aware Auto-Scaling (WAAS) is available as a technical preview. WAAS allocates Impala Virtual Warehouse resources based on the workload that is running.

Spill Impala queries to external storage
Cloudera Data Warehouse (CDW) on Private Cloud lets you write intermediate files produced during large sorts, joins, aggregations, or analytic function operations to a remote scratch space on HDFS or Ozone.

Enabling Unified Analytics for Impala in CDW Private Cloud
You can enable Unified Analytics while creating an Impala Virtual Warehouse. Doing so provisions a Virtual Warehouse that automatically redirects each query to the appropriate SQL engine (either Hive or Impala), depending on the nature of the query.

Auto-suspend Virtual Warehouses
AutoSuspend Timeout is an option you can set while creating a Virtual Warehouse. Understand how the auto-suspend timeout option works along with the auto-scaling settings on a Virtual Warehouse.

Configuring Impala coordinator shutdown
To optimize resource utilization, learn how to configure Impala coordinators to shut down automatically during idle periods, and how to prevent unnecessary restarts. Monitoring programs that periodically connect to Impala can cause unnecessary restarts.

Auto-scaling Impala Virtual Warehouses
Your Impala Virtual Warehouse in Cloudera Data Warehouse (CDW) Private Cloud has an auto-scaler process that works with coordinators and executors to make resources available for queued queries. This ensures that workload demand is met without wasting cloud resources.

Configuring Impala coordinator high availability
A single Impala coordinator might not handle the number of concurrent queries you want to run or provide the memory your queries require. You can configure multiple active coordinators to resolve or mitigate these problems. You can change the number of active coordinators later.

Impala pod configuration option in CDW Private Cloud
Cloudera Data Warehouse (CDW) allocates standard resources, suitable for most workloads, to Virtual Warehouses. You can control the size of a Virtual Warehouse at creation time by choosing the number of nodes to use. You can select either a default option or a custom option that you have created.

Configuring fe_service_threads in CDW Private Cloud
The “fe_service_threads” configuration specifies the maximum number of concurrent client connections, that is, the number of threads available to serve client requests, in an Impala Virtual Warehouse. The default value is 96. You can change this value in the Impala Coordinator flagfile configurations in the Cloudera Data Warehouse (CDW) UI.
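
As a rough illustration of how a value for “fe_service_threads” might be chosen, the following Python sketch derives a candidate from an expected peak number of client connections. The function name, the 25% headroom, and the rule of never going below the default are assumptions made for this example, not Cloudera guidance; only the default of 96 comes from the description above.

```python
import math

# Default stated in the topic description above.
DEFAULT_FE_SERVICE_THREADS = 96


def suggest_fe_service_threads(expected_connections: int, headroom: float = 0.25) -> int:
    """Return a candidate fe_service_threads value (hypothetical sizing rule).

    The candidate is the expected peak number of concurrent client
    connections plus a fractional headroom, and never less than the default.
    """
    if expected_connections < 0:
        raise ValueError("expected_connections must be non-negative")
    if headroom < 0:
        raise ValueError("headroom must be non-negative")
    needed = math.ceil(expected_connections * (1 + headroom))
    return max(DEFAULT_FE_SERVICE_THREADS, needed)


if __name__ == "__main__":
    # Example: 200 expected concurrent client connections.
    print("fe_service_threads =", suggest_fe_service_threads(200))
```

The resulting number would then be entered as the value of “fe_service_threads” in the Impala Coordinator flagfile configurations in the CDW UI. Validate any change against your actual workload and the memory available to the coordinator.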