Running onboarding script

This section helps you to run the onboarding script located in the configure-cldr-metric-storage.sh file to start the monitoring pods.

You can dowload the script here: configure-cldr-metric-storage.sh.

  1. For a non-GPU-based cluster, this is the reference command:
    sh ./configure-cldr-metric-storage.sh --metrics-ingestion-url "[***metrics-ingestion-url***]" --dbus-api-url "[***dbus-url***]" --credentials-file "[***access-credentials***]" --controlplane-namespace [***control-plane-namespace***]
  2. For a GPU-based cluster, this is the reference command:
    sh ./configure-cldr-metric-storage.sh --metrics-ingestion-url "[***metrics-ingestion-url***]" --dbus-api-url "[***dbus-url***]" --credentials-file "[***access-credentials***]" --controlplane-namespace [***control-plane-namespace***] --dcgm-exporter-enable "true" --dcgm-exporter-enable-privileged "true" --dcgm-exporter-runtime-class nvidia

After completing these steps, the metrics collector, workload collector, and observability agent pods will start running in the observability namespace. These pods collect Real-Time Monitoring metrics and Cloudera AI workload logs.

Run several Cloudera AI workloads and then search for your workbench name on the Cloudera Observability page in the AI workbench environment. You can use the search feature to verify that the environment is correctly displaying your workload data.