Optimized queries with Dashboards Archive table

Starting with 1.5.5 CHF1, the Dashboards Archive feature introduces a mechanism to optimize performance, manage data retention, and separate active data from historical data in the dashboards system.

Archiving older records into a dedicated dashboards_archive table significantly reduces the size of the primary dashboards table. Queries for active dashboards then scan fewer rows and run faster, which improves overall system performance. The improvement is most noticeable on busy systems, for example if you manage large volumes of Apache Spark jobs.

Archived records are stored in the dashboards_archive table, ensuring that historical data is not lost and remains accessible when needed. Active dashboards are retained in the dashboards table. This separation creates a clear distinction between current and historical data, simplifying data management.
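The split between the two tables can be illustrated with a minimal sketch. It uses an in-memory SQLite database purely for demonstration. The column names (id, job_name, finished_at), the date-based cutoff, and the archive_older_than helper are assumptions made for this example; the actual schema and archive procedure are not described here.

```python
import sqlite3

# Hypothetical schema: only the two table names come from this document.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dashboards (id INTEGER PRIMARY KEY, job_name TEXT, finished_at TEXT);
    CREATE TABLE dashboards_archive (id INTEGER PRIMARY KEY, job_name TEXT, finished_at TEXT);
""")
conn.executemany(
    "INSERT INTO dashboards (id, job_name, finished_at) VALUES (?, ?, ?)",
    [
        (1, "spark-etl", "2023-01-10"),
        (2, "spark-train", "2023-06-02"),
        (3, "spark-score", "2024-03-15"),
    ],
)


def archive_older_than(conn, cutoff):
    """Move rows finished before `cutoff` (ISO date string) into the archive table."""
    with conn:  # copy and delete in a single transaction
        conn.execute(
            "INSERT INTO dashboards_archive "
            "SELECT * FROM dashboards WHERE finished_at < ?",
            (cutoff,),
        )
        cur = conn.execute(
            "DELETE FROM dashboards WHERE finished_at < ?", (cutoff,)
        )
    return cur.rowcount  # number of rows removed from the active table


moved = archive_older_than(conn, "2024-01-01")
active = conn.execute("SELECT COUNT(*) FROM dashboards").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM dashboards_archive").fetchone()[0]
print(moved, active, archived)
```

In this sketch the two older rows end up in dashboards_archive and only the most recent row stays in dashboards, so nothing is lost but the active table is smaller.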

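APIs that reference the dashboards table do not automatically query the dashboards_archive table, so records are no longer returned by those APIs once they are archived. Run any reports that depend on those records before you start the archive process to ensure that all required data is captured.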

Features or services that rely on the dashboards table for metrics, such as active dashboards, running jobs, or user activity, do not include archived records in their calculations. Their metrics reflect only the current state of active dashboards.
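The effect on metrics can be shown with a small sketch. It assumes direct database access to both tables and a hypothetical owner column; the count_by_owner helper is illustrative only and is not part of any product API. SQLite is used in memory so that the example is self-contained.

```python
import sqlite3

# Hypothetical data: two active records and three archived ones.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dashboards (id INTEGER PRIMARY KEY, owner TEXT);
    CREATE TABLE dashboards_archive (id INTEGER PRIMARY KEY, owner TEXT);
    INSERT INTO dashboards VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO dashboards_archive VALUES (3, 'alice'), (4, 'alice'), (5, 'carol');
""")


def count_by_owner(conn, include_archive=False):
    """Count records per owner, optionally combining active and archived rows."""
    source = "SELECT owner FROM dashboards"
    if include_archive:
        source += " UNION ALL SELECT owner FROM dashboards_archive"
    rows = conn.execute(
        f"SELECT owner, COUNT(*) FROM ({source}) GROUP BY owner ORDER BY owner"
    )
    return dict(rows.fetchall())


active_only = count_by_owner(conn)                        # what a dashboards-only metric sees
full_history = count_by_owner(conn, include_archive=True)  # active plus archived rows
print(active_only)
print(full_history)
```

A metric computed only from the dashboards table misses carol entirely and undercounts alice, which is why reports that need historical records must either run before archiving or read the archive table explicitly.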

The following APIv2 endpoints do not account for archived records:
  • batchListProjects
  • getApplication
  • getExperiment
  • getExperimentRun
  • getExperimentRunMetrics
  • getJob
  • getJobRun
  • getProject
  • listAllExperiments
  • listApplications
  • listExperimentRuns
  • listExperiments
  • listJobRuns
  • listJobs
  • listProjects
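
Client code that needs complete history may want to flag calls to these endpoints. The following is a minimal sketch of such a check; the ENDPOINTS_IGNORING_ARCHIVE constant and the ignores_archive helper are hypothetical names introduced for this example, and only the endpoint names come from the list above.

```python
# Endpoint names copied from the list above.
ENDPOINTS_IGNORING_ARCHIVE = frozenset({
    "batchListProjects",
    "getApplication",
    "getExperiment",
    "getExperimentRun",
    "getExperimentRunMetrics",
    "getJob",
    "getJobRun",
    "getProject",
    "listAllExperiments",
    "listApplications",
    "listExperimentRuns",
    "listExperiments",
    "listJobRuns",
    "listJobs",
    "listProjects",
})


def ignores_archive(endpoint: str) -> bool:
    """Return True if the endpoint is listed as not accounting for archived records."""
    return endpoint in ENDPOINTS_IGNORING_ARCHIVE


if ignores_archive("listJobRuns"):
    print("listJobRuns returns active records only; archived runs are not included.")
```

A False result only means that the endpoint is not in this list. It does not imply that the endpoint reads the dashboards_archive table.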