Impala warehouse configuration options
When you create an Impala Virtual Warehouse in Cloudera Data Warehouse on premises,
you can configure a number of parameters. These parameters help you provision
high-performance Impala Virtual Warehouses and use compute resources efficiently.

Workload Aware Auto-Scaling in Impala
Workload Aware Auto-Scaling (WAAS) allocates Impala Virtual Warehouse resources based on the workload that is running.

Spill Impala queries to external storage
Cloudera Data Warehouse on premises enables you to write intermediate files during large sorts, joins, aggregations, or analytic function operations to a remote scratch space on HDFS or Ozone.

Enabling Unified Analytics for Impala Virtual Warehouses in Cloudera Data Warehouse on premises
You can enable Unified Analytics while creating an Impala Virtual Warehouse. Doing this provisions a Virtual Warehouse that can automatically redirect queries to an appropriate SQL engine (either Hive or Impala) depending on the nature of the query.

Auto-suspend Virtual Warehouses
AutoSuspend Timeout is an option you can set while creating a Virtual Warehouse. Understand how the auto-suspend timeout option works along with the auto-scaling settings on a Virtual Warehouse.

Configuring Impala coordinator shutdown
To optimize resource utilization, you need to know how to configure Impala coordinators to automatically shut down during idle periods. You also need to know how to prevent unnecessary restarts, which monitoring programs that periodically connect to Impala can cause.

Auto-scaling Impala Virtual Warehouses
Your Impala Virtual Warehouse in Cloudera Data Warehouse Private Cloud has an auto-scaler process that works with coordinators and executors to make resources available for queued queries. This ensures that workload demand is met without wasting cloud resources.

About the Impala Autoscaling Dashboard
The Impala Autoscaling Dashboard helps you visualize and understand how the Impala Virtual Warehouse autoscales, how queries are routed to an executor group set, how queries affect the provisioning of executors, and how resources are utilized over a specified time window. You can use this dashboard to monitor both regular and workload-aware autoscaling.

Configuring Impala coordinator high availability
A single Impala coordinator might not handle the number of concurrent queries you want to run or provide the memory your queries require. You can configure multiple active coordinators to resolve or mitigate these problems. You can change the number of active coordinators later.

Configuring fe_service_threads in Cloudera Data Warehouse on premises
The “fe_service_threads” configuration specifies the maximum number of concurrent client connections or threads allowed to serve client requests in an Impala Virtual Warehouse. The default value is 128. You can change this value from the Impala Coordinator flagfile configurations in the Cloudera Data Warehouse UI.
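
For illustration only, an fe_service_threads override written in gflags flagfile syntax might look like the fragment below. The value 256 is a hypothetical example, not a recommendation, and the Cloudera Data Warehouse UI may present flagfile entries as key and value fields rather than as raw lines.

```text
# Hypothetical override: raise the limit from the documented default of 128.
--fe_service_threads=256
```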