Tuning Impala data marts

When you tune an Impala data mart, you set the minimum and maximum number of nodes for the data mart cluster and choose the auto-scaling mode that suits your workloads.

  1. Log in to the CDP web interface and navigate to the Data Warehouse service.
  2. In the Data Warehouse service, navigate to the Overview page.
  3. On the Overview page, under Virtual Warehouses, click the edit icon for a data mart.

  4. The next page provides properties that you can adjust to tune auto-scaling for your data mart:
    1. Adjust the minimum number of nodes or the maximum number of nodes as needed.

      Setting the minimum and maximum number of nodes per data mart is similar to setting the number of nodes for an on-premises cluster. Consider the number of concurrent queries, the complexity of those queries, and the overall query volume in your workloads when you decide how many nodes to set on each Virtual Warehouse instance.

  5. Select the Autoscale Mode by moving the slider:

    Choose from the following Autoscale Mode settings:

    • Conservative: The data mart auto-scales up approximately 60 seconds after maximum utilization of resources is reached. When demand decreases, it auto-scales down immediately.
    • Balanced: The data mart auto-scales up approximately 30 seconds after maximum utilization of resources is reached, and it auto-scales down approximately 30 seconds after demand decreases.
    • Aggressive: The data mart auto-scales up immediately when maximum utilization of resources is reached, and it auto-scales down approximately 60 seconds after demand decreases.
  6. Click Save & Restart in the upper right of the page to save your changes.
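
The Autoscale Mode settings in step 5 differ only in how long the data mart waits before scaling up and before scaling down. The following Python sketch restates those approximate delays as a lookup table so that the modes can be compared side by side. It is an illustration only: the names AutoscaleMode, AUTOSCALE_MODES, and fastest_scale_up are hypothetical and are not part of any CDP API, and a delay of 0 stands for "immediately".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscaleMode:
    """Approximate delays, in seconds, for one Autoscale Mode (hypothetical type)."""
    scale_up_delay_s: int    # wait after maximum utilization of resources is reached
    scale_down_delay_s: int  # wait after demand decreases


# Values restate the mode descriptions in step 5; 0 means "immediately".
AUTOSCALE_MODES = {
    "Conservative": AutoscaleMode(scale_up_delay_s=60, scale_down_delay_s=0),
    "Balanced": AutoscaleMode(scale_up_delay_s=30, scale_down_delay_s=30),
    "Aggressive": AutoscaleMode(scale_up_delay_s=0, scale_down_delay_s=60),
}


def fastest_scale_up() -> str:
    """Return the name of the mode that adds nodes soonest under load."""
    return min(AUTOSCALE_MODES, key=lambda name: AUTOSCALE_MODES[name].scale_up_delay_s)


if __name__ == "__main__":
    for name, mode in AUTOSCALE_MODES.items():
        print(f"{name}: up after ~{mode.scale_up_delay_s}s, "
              f"down after ~{mode.scale_down_delay_s}s")
```

Read this way, Conservative is the slowest to add nodes and the quickest to release them, Aggressive is the opposite, and Balanced sits between the two.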