Technical Metrics for Models

You can monitor how your models are operating by using the charts provided for technical metrics. These charts can help you determine whether a model is under- or over-resourced, or is experiencing a problem.

To check the performance of your model, go to Models, click the model name, and select the Monitoring tab. You can monitor all replicas of the model or a specific replica, and you can select the date and time range to display. Up to two weeks of data is retained.

This tab displays charts for the following technical metrics:

  • Requests per Second
  • Number of Requests
  • Number of Failed Requests
  • Model Response Time
  • All Model Replica CPU Usage
  • All Model Replica Memory Usage
  • Model Request & Response Size

All charts share a common time axis (the x-axis), so you can easily correlate CPU and memory usage with, for example, model response time or the number of failed requests.
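
If you want to put a number on a relationship you see in the charts, the following minimal Python sketch may help. It assumes you have noted down CPU usage and model response time values at the same timestamps. The sample values and the `pearson` helper are hypothetical and for illustration only; they are not produced by, or part of, the Monitoring tab.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, without the 1/n factor (it cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical samples read off the charts, one value per shared timestamp.
cpu_usage = [0.20, 0.35, 0.50, 0.80, 0.95]    # fraction of CPU in use
response_time_ms = [110, 120, 150, 240, 310]  # model response time

r = pearson(cpu_usage, response_time_ms)
print(f"Correlation between CPU usage and response time: {r:.2f}")
```

A value close to 1 suggests that response time rises together with CPU usage, which may indicate an under-resourced model. A value near 0 suggests that the cause of slow responses lies elsewhere.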