Monitoring Hue and Data Visualization restoration

The restore process is idempotent: you can restart it as many times as needed. If the environment is activated and healthy, you can run the restore operation multiple times to restore the Virtual Warehouse and Data Visualization objects.

Hue automatic restoration

The Hue database is restored only if no Virtual Warehouses are attached to the particular Database Catalog. In that case, the "hueRestorePlans" section of the CDP CLI response to the dw restore-cluster command contains the "LoadOrOverwrite" action, which indicates that the Hue restore operation will run. In any other case, the action is "Skip".

To retry a failed Hue restore operation using the CDP CLI dw restore-cluster command, you must first remove the Virtual Warehouses attached to the Database Catalog.

Cloudera can ensure data correctness only if no running Virtual Warehouses are attached to the Database Catalog. During the automated restore process, the creation of the Virtual Warehouses is deferred until the Hue restore completes. Depending on the size of the Hue backup, this process can take up to 30 minutes. During this time, the CDW UI indicates that the Environment and the Database Catalog are created, but no Virtual Warehouses appear in the UI.

Learn more about how to monitor the Hue restoration process in the following section.

Monitoring Hue restoration

The restoration starts a job to load the database dump file, but does not wait for the job to complete. If you have a large database, the job can take up to an hour to complete. Ensure you allow enough time for the job to succeed.

To monitor Hue restoration, log in to the cluster and check the job status in the Database Catalog namespace:

$ kubectl get jobs -n <database catalog id>

The output, which includes the hue-restore job, looks similar to this:

$ kubectl get jobs -n warehouse-1692037411-96hk
     NAME                                               COMPLETIONS   DURATION   AGE
     hue-restore-ede2b8bd-1d53-4d23-a0f9-87d8ec658f74   1/1           11s        113s
     hue-query-processor-db-create-job                  1/1           8s         42h
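
To check the job state from a script rather than by reading the table, the kubectl output can be filtered. The following is a minimal sketch, not part of the product: job_states is a hypothetical helper name, and it assumes the default kubectl get jobs table layout, with the job name in the first column and COMPLETIONS in the second.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Cloudera tool).
# Reads `kubectl get jobs` table output on stdin and prints "<name> <state>"
# for every job whose name starts with the given prefix.
# state is "complete" when COMPLETIONS reads N/N, otherwise "incomplete"
# (an incomplete job may still be running, or it may have failed).
job_states() {
  local prefix="$1"
  awk -v prefix="$prefix" '
    NF >= 2 && index($1, prefix) == 1 {
      split($2, c, "/")                       # "1/1" -> c[1]=1, c[2]=1
      state = (c[1] == c[2]) ? "complete" : "incomplete"
      print $1, state
    }'
}
```

For example, kubectl get jobs -n <database catalog id> | job_states hue-restore prints the hue-restore job name followed by its state. As an alternative, kubectl wait --for=condition=complete job/<job name> -n <database catalog id> --timeout=60m blocks until the job completes or the timeout expires; the 60-minute value is only an illustration.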

You can also monitor the Hue restoration process using the CDP CLI dw list-events --operation-id <operation-id> command. Use the operation ID returned by the dw restore-cluster command to see the events that belong to the long-running restore process.

Monitor the "HueRestoreWait" events. For example:

{
    "operationId": "5cd90dab-5b15-4de9-b70f-ddab0dfa0c10",
    "event": "HueRestoreWait",
    "message": "{\"type\":\"info\",\"message\":\"Waiting Hue data restore job to finish\",\"error\":null}",
    "timestamp": "2024-03-21T09:15:11+00:00"
},
{
    "operationId": "5cd90dab-5b15-4de9-b70f-ddab0dfa0c10",
    "event": "HueRestoreWait",
    "message": "{\"type\":\"info\",\"message\":\"Hue Data restore still processing: job status, active: 1, failed: 0, succeeded: 0\",\"error\":null}",
    "timestamp": "2024-03-21T09:15:26+00:00"
},

While the Hue restoration is in progress, the HueRestoreWait message reports the job status, for example: "active: 1, failed: 0, succeeded: 0"

When the Hue restoration completes, the following message appears: "Hue data restore job finished"
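
These event messages can also be checked from a script. The sketch below is an illustration under stated assumptions, not a Cloudera-provided tool: the function names and the printed state labels are hypothetical, the input is the output of dw list-events supplied on standard input, the matching relies only on the message texts quoted in this section, and the last reported job counts are assumed to be the most recent.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not Cloudera tools). Both read the list-events
# output on stdin and match only the message texts quoted in this section.

# Prints "finished", "in-progress", or "no-hue-events".
hue_restore_status() {
  local events
  events="$(cat)"
  if grep -q 'Hue data restore job finished' <<<"$events"; then
    echo "finished"
  elif grep -q 'HueRestoreWait' <<<"$events"; then
    echo "in-progress"
  else
    echo "no-hue-events"
  fi
}

# Prints the last "active: N, failed: N, succeeded: N" triple found,
# or nothing if no such triple is present.
# Assumption: events appear in chronological order.
hue_job_counts() {
  grep -oE 'active: [0-9]+, failed: [0-9]+, succeeded: [0-9]+' | tail -n 1
}
```

For example, cdp dw list-events --operation-id <operation-id> | hue_restore_status prints the current state, and piping the same output to hue_job_counts prints the latest counters. A non-zero failed count is worth investigating before you retry the restore.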

Data Visualization automatic restoration

If a Data Visualization object is present in the backup file but not on the cluster, it is restored to the cluster. If such an object is already deployed, no changes or configuration updates take place.

Automatic restoration of Data Visualization loads the dashboards, tables, and connections to the new applications. Make sure to wait for the job to finish before destroying the cluster.

To monitor the restoration of Data Visualization, log in to the cluster and check the job status in the viz namespace using the following command:

$ kubectl get jobs -n <data visualization id>

The output is similar to the following, where the viz-restore job shows the restoration status:

$ kubectl get jobs -n viz-1692216942-fc2g
     NAME                                               COMPLETIONS   DURATION   AGE
     viz-restore-d874515a-be7e-4902-ac75-269c14f9580c   1/1           3m3s       10m
     viz-webapp-vizdb-create-job                        1/1           57s        99m

The job logs contain the upload path from which the backup file was downloaded.
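
To read the job logs without copying the generated job name by hand, the lookup can be scripted. This is a sketch under assumptions: restore_job_logs is a hypothetical helper, it assumes that the restore job name starts with viz-restore as in the sample output, and it picks the first matching job if several exist.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Cloudera tool).
# Usage: restore_job_logs <namespace> <job name prefix>
# Finds the first job in the namespace whose name starts with the prefix
# and prints the logs of that job.
restore_job_logs() {
  local namespace="$1" prefix="$2" job
  # `-o name` prints entries such as "job.batch/<name>"; keep only <name>.
  job="$(kubectl get jobs -n "$namespace" -o name \
          | sed 's|^.*/||' | grep "^$prefix" | head -n 1)"
  if [ -z "$job" ]; then
    echo "no job starting with '$prefix' in namespace '$namespace'" >&2
    return 1
  fi
  kubectl logs -n "$namespace" "job/$job"
}
```

For example, restore_job_logs <data visualization id> viz-restore prints the logs of the viz-restore job. The same helper can be pointed at the Database Catalog namespace with the hue-restore prefix.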

Automatic restoration of Data Visualization

Automatic backup and restore for Data Visualization covers the dashboards, tables, and connections. Wait for the restore job to finish before destroying the cluster. If the restoration fails, try restoring Data Visualization manually.
