Learn how to configure auto-scaling for Trino workers so that resources are
available for query workloads. Auto-scaling meets your workload demand without
wasting resources.
If your cluster is deployed on the OpenShift Container Platform and you
want to use auto-suspend capabilities, you must install Kubernetes Event-driven
Autoscaling (KEDA). KEDA enables dynamic scaling of containers, allowing them to
scale up or down based on demand.
Follow the instructions for "Creating a Virtual Warehouse".
Select the "custom" size for the Virtual Warehouse.
From the Trino Specific Settings panel, select the
Enable Auto Scaling checkbox to allow Trino workers
to automatically scale up or down based on the workload.
Specify the Worker Count Range by setting the minimum
and maximum number of Trino workers needed to handle your varying
workload demands.
The minimum worker count that you can specify is 1, and the maximum worker
count must be less than or equal to 100.
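For example, the Worker Count Range acts as a hard bound on whatever worker count
the autoscaler computes from demand. The following Python sketch is illustrative
only (the function name and logic are not part of the product) and shows how the
range caps the result:

def clamp_worker_count(desired: int, min_workers: int = 1, max_workers: int = 100) -> int:
    # Bound the computed worker count to the configured Worker Count Range.
    return max(min_workers, min(desired, max_workers))

# Demand suggesting 120 workers is capped at the maximum of 100;
# demand suggesting 0 workers is raised to the minimum of 1.
print(clamp_worker_count(120))  # 100
print(clamp_worker_count(0))    # 1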
Specify the Scale Out Delay and Scale In
Delay values.
Scale Out Delay: Sets the length of time in seconds that the system
waits before adding more workers when it detects that the average CPU
utilization of Trino worker containers exceeds 70% or that queries are
waiting in the queue to execute. The default value is 15 seconds.
Scale In Delay: Sets the length of time in seconds that the system waits
before it removes workers when it detects that the average CPU utilization of
Trino worker containers is below 70%. The default value is 30 seconds.
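Conceptually, the delays keep the autoscaler from reacting to short spikes: the
triggering condition must persist for the full delay before workers are added or
removed. The following Python sketch is a simplified illustration of that behavior,
not the product's actual scaling logic; the function and its parameters are
hypothetical.

CPU_THRESHOLD = 0.70     # average CPU utilization across Trino worker containers
SCALE_OUT_DELAY = 15.0   # seconds (default)
SCALE_IN_DELAY = 30.0    # seconds (default)

def scaling_decision(avg_cpu: float, queued_queries: int,
                     seconds_condition_held: float) -> str:
    # Decide whether to add workers, remove workers, or hold steady.
    # seconds_condition_held is how long the current load condition has
    # persisted; the delays prevent reacting to momentary fluctuations.
    if avg_cpu > CPU_THRESHOLD or queued_queries > 0:
        if seconds_condition_held >= SCALE_OUT_DELAY:
            return "scale out"   # sustained load: add workers, up to the maximum
        return "hold"
    if avg_cpu < CPU_THRESHOLD and seconds_condition_held >= SCALE_IN_DELAY:
        return "scale in"        # sustained low load: remove workers, down to the minimum
    return "hold"

# Example: CPU at 85% with queued queries for 20 seconds triggers a scale-out.
print(scaling_decision(avg_cpu=0.85, queued_queries=2, seconds_condition_held=20))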
Specify the Grace Period for Shutting Down Worker.
The graceful shutdown period defines the duration (in seconds) that a Trino
worker waits during a graceful shutdown process. This allows Trino workers
to complete existing tasks before the process fully terminates. By default, this
value is set to 300 seconds, and the minimum value that you can specify is 60
seconds.
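As a rough illustration, the grace period behaves like a drain window: the worker
is given up to the configured number of seconds to finish in-flight tasks before it
is terminated. The Python sketch below assumes a hypothetical worker object with
has_running_tasks() and terminate() methods; the actual Trino shutdown sequence is
more involved.

import time

GRACE_PERIOD = 300  # seconds (default); the minimum configurable value is 60

def shut_down_worker(worker, grace_period: float = GRACE_PERIOD) -> None:
    # Wait for in-flight tasks to drain, then terminate the worker process.
    deadline = time.monotonic() + grace_period
    while worker.has_running_tasks() and time.monotonic() < deadline:
        time.sleep(1)   # give existing tasks time to complete
    worker.terminate()  # terminate once tasks finish or the grace period elapses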