Auto-scaling in public cloud environments

You can scale up or scale down the resources allocated to your database automatically by ensuring that an auto scaling event is triggered if pre-defined acceptable latency policy is not met. This ensures predictable database performance.

An auto-scaling event is triggered if the rolling average value of the latency in the last one hour is not equal to a pre-defined acceptable latency.

Latency is defined as the time between when the database received the request and sent the response after processing. It is monitored at the cluster level and only the 95th percentile of the latency is considered in the calculation. Auto-scaling does the following to ensure that you have predictable database performance when scaling up or down:

  • Gradual increase and decrease to the number of nodes (cluster does not shrink or expand drastically)

  • Sufficient cooldown period between scaling up or down to avoid frequent change in cluster size

You can see the results of scale up or scale down in the Databases > Charts page. This page shows you graphs for concurrent clients and RPC latency that provides information about how COD us scaling up or down.

The latency metric responds to the increase or decrease in the nodes, but there may be cases where increasing or decreasing the number of nodes does not have an effect because of the kind of workload that is running or if there are limited regions for a table.