Monitoring Hue and Data Visualization database backup

The automatic backup procedure saves the Data Visualization database contents to the configured logs or data folders based on availability.

Hue

During a manual or automatic Hue database backup, it is critical to block all traffic to the running Hue services. If you cannot bring down the cluster, Cloudera recommends that you disable end-user access to the cluster endpoints. Failing to do so can result in key constraint errors and other issues.

Automatic Hue backup

Automatic backup of Hue extracts the saved queries and the query history and loads them into the new cluster.

Monitoring Hue backup

The backup starts a job to load the database dump file, but does not wait for the job to complete. If you have a large database, the job can take up to an hour to complete. Ensure you allow enough time for the job to succeed.

To monitor the Hue backup, log in to the cluster and check the job status under the database catalog namespace:

$ kubectl get jobs -n <database catalog id>

The output that shows the hue-backup job looks something like this:

$ kubectl get jobs -n warehouse-1692037411-96hk
  NAME                                              COMPLETIONS   DURATION   AGE
  hue-backup-ede2b8bd-1d53-4d23-a0f9-87d8ec658f74   1/1           11s        113s
  hue-query-processor-db-create-job                 1/1           8s         42h
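
As a rough sketch of how this check could be scripted, the following shell function reads `kubectl get jobs` output on standard input and succeeds only when at least one job whose name starts with the given prefix is listed and every such job reports matching COMPLETIONS counts (for example 1/1). The function name `backup_job_done` is hypothetical, and the `hue-backup-` prefix is taken from the sample output above, so treat it as an assumption for your cluster.

```shell
# backup_job_done PREFIX
# Hypothetical helper: reads `kubectl get jobs` output from stdin.
# Returns 0 if at least one job name starts with PREFIX and each such
# job shows equal values on both sides of COMPLETIONS (e.g. 1/1).
backup_job_done() {
  awk -v prefix="$1" '
    NR > 1 && index($1, prefix) == 1 {
      found = 1
      split($2, c, "/")            # c[1] = succeeded, c[2] = desired
      if (c[1] != c[2]) incomplete = 1
    }
    END { exit ((found && !incomplete) ? 0 : 1) }
  '
}

# Example use (assumes the hue-backup- name prefix):
#   kubectl get jobs -n <database catalog id> | backup_job_done hue-backup- \
#     && echo "Hue backup finished"
```

Parsing the table output keeps the sketch close to the command shown above; it only inspects the NAME and COMPLETIONS columns.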

Data Visualization

Automatic Data Visualization backup

Automatic backup of Data Visualization extracts the dashboards, tables and connections. Make sure to wait for the job to finish before destroying the cluster.

Monitoring Data Visualization backup

The backup starts a job to create the database dump file, but does not wait for the job to complete. If you have a large database, the job can take up to 20 minutes to complete. Ensure you allow enough time for the job to succeed.

To monitor the Data Visualization backup, log in to the cluster and check the job status under the viz namespace using the following command:

$ kubectl get jobs -n <data visualization id>

The output looks something like this:

$ kubectl get jobs -n viz-1692216942-fc2g
  NAME                                              COMPLETIONS   DURATION   AGE
  viz-backup-d874515a-be7e-4902-ac75-269c14f9580c   1/1           3m3s       10m
  viz-webapp-vizdb-create-job                       1/1           57s        99m
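
Because the cluster should not be destroyed before the backup job finishes, it can help to block until the job completes. The sketch below assumes the backup job name starts with `viz-backup-`, as in the sample output, and uses `kubectl wait` with a default timeout of 20 minutes. The function name `wait_for_viz_backup` and its arguments are hypothetical.

```shell
# wait_for_viz_backup NAMESPACE [TIMEOUT]
# Hypothetical helper: finds the first job whose name starts with
# viz-backup- (an assumption based on the sample output) and blocks
# until it reports the Complete condition or TIMEOUT expires.
wait_for_viz_backup() {
  viz_ns="$1"
  viz_timeout="${2:-1200s}"        # 20 minutes by default

  # `-o name` prints one resource per line, e.g. job.batch/<job name>
  viz_job="$(kubectl get jobs -n "$viz_ns" -o name | grep '/viz-backup-' | head -n 1)"
  if [ -z "$viz_job" ]; then
    echo "no viz-backup job found in namespace $viz_ns" >&2
    return 1
  fi

  kubectl wait --for=condition=complete "$viz_job" -n "$viz_ns" --timeout="$viz_timeout"
}

# Example use:
#   wait_for_viz_backup <data visualization id> && echo "safe to proceed"
```

A job that fails never reaches the complete condition, so in that case the call returns a non-zero status once the timeout expires.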