Learn how to configure auto-scaling for Trino workers so that resources are
available for query workloads. Auto-scaling meets your workload demand without
wasting resources.
If your cluster is deployed on the OpenShift Container Platform and you
want to use auto-suspend capabilities, you must install Kubernetes Event-driven
Autoscaling (KEDA). KEDA enables dynamic scaling of containers, allowing them to
scale up or down based on demand.
Follow the instructions for "Creating a Virtual Warehouse".
Select the "custom" size for the Virtual Warehouse.
From the Trino Specific Settings panel, select the
Enable Auto Scaling checkbox to allow Trino workers
to automatically scale up or down based on the workload.
Specify the Worker Count Range by setting the minimum
and maximum number of Trino workers needed to handle your varying
workload demands.
The minimum worker count that you can specify is 1, and the maximum worker
count must be less than or equal to 100.
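For example, the Worker Count Range acts as a hard bound on whatever worker count
the autoscaler computes from demand. The following Python sketch is illustrative
only (the function name and logic are not part of the product) and shows how the
range caps the result:

def clamp_worker_count(desired: int, min_workers: int = 1, max_workers: int = 100) -> int:
    # Bound the computed worker count to the configured Worker Count Range.
    return max(min_workers, min(desired, max_workers))

# Demand suggesting 120 workers is capped at the maximum of 100;
# demand suggesting 0 workers is raised to the minimum of 1.
print(clamp_worker_count(120))  # 100
print(clamp_worker_count(0))    # 1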
Specify the Scale Out Delay and Scale In
Delay values.
Scale Out Delay: Sets the length of time in seconds that the system
waits before adding more workers when it detects that the average CPU
utilization of Trino worker containers exceeds 70% or that queries are
waiting in the queue to execute. The default value is 15 seconds.
Scale In Delay: Sets the length of time in seconds that the system waits
before it removes workers when it detects that the average CPU utilization of
Trino worker containers is below 70%. The default value is 30 seconds.
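Conceptually, the delays keep the autoscaler from reacting to short spikes: the
triggering condition must persist for the full delay before workers are added or
removed. The following Python sketch is a simplified illustration of that behavior,
not the product's actual scaling logic; the function and its parameters are
hypothetical.

CPU_THRESHOLD = 0.70     # average CPU utilization across Trino worker containers
SCALE_OUT_DELAY = 15.0   # seconds (default)
SCALE_IN_DELAY = 30.0    # seconds (default)

def scaling_decision(avg_cpu: float, queued_queries: int,
                     seconds_condition_held: float) -> str:
    # Decide whether to add workers, remove workers, or hold steady.
    # seconds_condition_held is how long the current load condition has
    # persisted; the delays prevent reacting to momentary fluctuations.
    if avg_cpu > CPU_THRESHOLD or queued_queries > 0:
        if seconds_condition_held >= SCALE_OUT_DELAY:
            return "scale out"   # sustained load: add workers, up to the maximum
        return "hold"
    if avg_cpu < CPU_THRESHOLD and seconds_condition_held >= SCALE_IN_DELAY:
        return "scale in"        # sustained low load: remove workers, down to the minimum
    return "hold"

# Example: CPU at 85% with queued queries for 20 seconds triggers a scale-out.
print(scaling_decision(avg_cpu=0.85, queued_queries=2, seconds_condition_held=20))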
Specify the Grace Period for Shutting Down Worker.
The graceful shutdown period defines the duration (in seconds) that a Trino
worker waits during a graceful shutdown process. This allows Trino workers
to complete existing tasks before the process fully terminates. By default, this
value is set to 300 seconds, and the minimum value that you can specify is 60
seconds.
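As a rough illustration, the grace period behaves like a drain window: the worker
is given up to the configured number of seconds to finish in-flight tasks before it
is terminated. The Python sketch below assumes a hypothetical worker object with
has_running_tasks() and terminate() methods; the actual Trino shutdown sequence is
more involved.

import time

GRACE_PERIOD = 300  # seconds (default); the minimum configurable value is 60

def shut_down_worker(worker, grace_period: float = GRACE_PERIOD) -> None:
    # Wait for in-flight tasks to drain, then terminate the worker process.
    deadline = time.monotonic() + grace_period
    while worker.has_running_tasks() and time.monotonic() < deadline:
        time.sleep(1)   # give existing tasks time to complete
    worker.terminate()  # terminate once tasks finish or the grace period elapses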