Accessing the Evaluations dashboard

Access the Evaluations dashboard in Agent Studio to monitor workflow runs, analyze metrics, and investigate results for testing or auditing purposes.

You must deploy Agent Studio before you can access the dashboard. For instructions, see Deploying Agent Studio using the ML Runtime Image.

Accessing Evaluations during testing

A workflow that is still running appears in the Evaluations table with an In Progress status. The UI refreshes automatically when the run completes, so the results become available without a manual page reload.
  1. In the Cloudera console, click the Cloudera AI tile.

    The Cloudera AI Workbenches page is displayed.

  2. Click the name of the workbench.

    The workbench Home page is displayed.

  3. Click Projects, and then click New Project to create a new project.

    In the left navigation pane, the new AI Studios option is displayed.

  4. Click AI Studios.
  5. Navigate to the Actions menu.
  6. Select Test Workflow.

    The testing interface is displayed.

  7. In the testing interface, go to the Evaluations tab to view the historical and current run data.
    Figure 1. Evaluations tab of a workflow in the Agent Studio UI

Accessing Evaluations for a deployed workflow

  1. In the Cloudera console, click the Cloudera AI tile.

    The Cloudera AI Workbenches page is displayed.

  2. Click the name of the workbench.

    The workbench Home page is displayed.

  3. Click Projects, and then click New Project to create a new project.

    In the left navigation pane, the new AI Studios option is displayed.

  4. Click AI Studios.
  5. Open a deployed workflow.
  6. Enter your parameters, for example, Company Name, and click Run Workflow.
  7. When the run completes, click the Evaluations tab to view the historical and current run data.

Navigating the dashboard

The interface uses the following hierarchical drill-down approach to help you quickly investigate issues:
  • Runs Table – View all historical and current runs in a centralized table for fast scanning and comparison.
  • Metrics View – Analyze performance through Automatic and LLM-based Evaluators with standardized PASS/FAIL labels.
  • Drill-Down Detail – Click any metric row to open its detailed results. This allows you to trace exactly what failed and identify the specific span or step responsible. Use the Back button to return to the previous drill-down level.
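
The three levels above can be pictured as a nested data structure. The following Python sketch is purely illustrative. It does not use any Agent Studio API, and every class, field, and function name in it is hypothetical. It also assumes, for illustration only, that a metric is labeled FAIL when at least one of its spans failed. It shows how moving from runs to metrics to spans narrows an investigation down to the responsible step.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SpanDetail:
    """Lowest level: one span or step inside a run (hypothetical shape)."""
    name: str
    passed: bool


@dataclass
class MetricResult:
    """Middle level: one evaluator's verdict for a run (hypothetical shape)."""
    name: str
    evaluator_type: str  # "automatic" or "llm"; labels assumed for illustration
    spans: List[SpanDetail] = field(default_factory=list)

    @property
    def label(self) -> str:
        # Assumption: a metric passes only if every span it covers passed.
        return "PASS" if all(s.passed for s in self.spans) else "FAIL"


@dataclass
class Run:
    """Top level: one row of the runs table (hypothetical shape)."""
    run_id: str
    status: str  # "In Progress" or "Completed"
    metrics: List[MetricResult] = field(default_factory=list)


def failing_spans(runs: List[Run]) -> List[Tuple[str, str, str]]:
    """Drill down from runs to failed metrics to the spans that failed."""
    found = []
    for run in runs:
        if run.status != "Completed":
            continue  # runs still in progress have no final results yet
        for metric in run.metrics:
            if metric.label == "FAIL":
                for span in metric.spans:
                    if not span.passed:
                        found.append((run.run_id, metric.name, span.name))
    return found


# Made-up sample data, not output from a real workflow.
example_runs = [
    Run("run-1", "Completed", [
        MetricResult("tool_call_success", "automatic", [
            SpanDetail("search_tool", True),
            SpanDetail("summarize", False),
        ]),
    ]),
    Run("run-2", "In Progress"),
]

print(failing_spans(example_runs))
```

In this sketch, the in-progress run is skipped and the failure in the completed run is traced to a single span, which mirrors the path you follow in the UI by clicking a metric row.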