Flow metrics based auto-scaling

Flow metrics based auto-scaling predicts future load on a connection to trigger scaling decisions.

In contrast to CPU based auto-scaling, where scaling decisions are based on infrastructure utilization, flow metrics based auto-scaling works by predicting backpressure on source connection queues. Source connections are connections attached to the processors where data first enters the flow, and they are detected automatically through static analysis of the flow definition. Backpressure occurs when such a connection becomes 100% full. The queue fill percentage of each source connection is tracked, and a linear regression over each connection's tracked values predicts how full the queue will be in 20 minutes. Scaling up occurs when that prediction stays above 80% for 5 minutes.
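The prediction step can be sketched roughly as follows. This is a minimal illustration in Python, not the product's implementation: the function names, the one-minute sampling interval, the clamping of the prediction to the 0-100% range, and the way the 5-minute condition is evaluated are all assumptions. Only the 20-minute horizon, the 80% threshold, and the 5-minute duration come from the description above.

```python
from typing import List, Tuple

PREDICTION_HORIZON_S = 20 * 60   # predict queue fullness 20 minutes ahead
THRESHOLD_PCT = 80.0             # scale up when the prediction exceeds 80%
SUSTAIN_S = 5 * 60               # ...and stays above it for 5 minutes

Sample = Tuple[float, float]     # (timestamp in seconds, percent value)


def predict_fullness(samples: List[Sample],
                     horizon_s: float = PREDICTION_HORIZON_S) -> float:
    """Fit a least-squares line to (time, percent full) samples and
    extrapolate it horizon_s seconds past the most recent sample.
    Hypothetical helper; clamping to 0..100 is an assumption."""
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        # A single sample (or identical timestamps): no trend to extrapolate.
        return max(0.0, min(100.0, mean_y))
    slope = sum((t - mean_t) * (y - mean_y) for t, y in samples) / var_t
    intercept = mean_y - slope * mean_t
    predicted = intercept + slope * (samples[-1][0] + horizon_s)
    return max(0.0, min(100.0, predicted))


def should_scale_up(predictions: List[Sample],
                    threshold_pct: float = THRESHOLD_PCT,
                    sustain_s: float = SUSTAIN_S) -> bool:
    """Return True when the most recent unbroken run of predictions above
    threshold_pct spans at least sustain_s seconds.
    predictions must be ordered by timestamp (hypothetical helper)."""
    if not predictions:
        return False
    latest_t = predictions[-1][0]
    run_start = None
    for t, pct in reversed(predictions):
        if pct <= threshold_pct:
            break
        run_start = t
    return run_start is not None and latest_t - run_start >= sustain_s


# A queue that grows by 1% per minute, sampled once a minute for 10 minutes.
slow_fill = [(i * 60.0, 10.0 + i) for i in range(11)]
slow_prediction = predict_fullness(slow_fill)

# A queue that grows by 5% per minute over the same period.
fast_fill = [(i * 60.0, 10.0 + 5.0 * i) for i in range(11)]
fast_prediction = predict_fullness(fast_fill)
```

In this sketch the slowly filling queue is predicted to sit well below the threshold, while the fast one is predicted to hit backpressure. A scale-up would follow only if predictions like the second one persisted for the full 5 minutes, which is what `should_scale_up` checks.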

Flow metrics based auto-scaling detects the need to scale from flow performance metrics rather than infrastructure utilization. When queues fill up while CPU utilization stays below its threshold, flow metrics based auto-scaling still triggers a scaling operation.