The backup procedure automatically saved the Hue database content and placed the content into the configured logs or data folders based on availability. Using the saved content, the restore process loads the data for the new Hue deployments.
Automatic backup and restoration of Hue extracts the saved query and query history and loads them to the new cluster. In case the restore operation is called multiple times, the Hue database restore job will be run. This job will overwrite the current query history and saved queries with the contents of the backup.
Monitoring restoration execution
The restoration starts a job to load the database dump file, but does not wait for the job to complete. If you have a large database, the job can take up to an hour to complete. Ensure you allow enough time for the job to succeed.
To monitor execution, log into the cluster and monitor the job status under the database catalog namespace.
$ kubectl get jobs -n <database catalog id>
The output that shows the hue-restore job looks something like this:
$ kubectl get jobs -n warehouse-1692037411-96hk NAME COMPLETIONS DURATION AGE hue-restore-ede2b8bd-1d53-4d23-a0f9-87d8ec658f74 1/1 11s 113s hue-query-processor-db-create-job 1/1 8s 42h
You can manually restore the Hue database instance that you backed up.
In the manual backup of Hue, you followed steps to dump the entire Hue database. In the following procedure, you move the Hue backup file from the dump to the new CDW environment.
(Optional) Edit the Hue statefulset Hue backup files that are larger than 3Gb
and require a significant amount of memory to restore the database contents.
Sufficient memory is not available for the Hue pod by default. To successfully load the data, you must increase the Hue pod memory limit.
(Optional) Check file sizes and configure the memory accordingly.
kubectl edit statefulset huebackend -n <new Virtual Warehouse ID>
Set the hue container memory limit to 24Gb to provide leeway for the load command.
name: hue resources: limits: memory: 24G requests: cpu: "1" memory: 8192M
Copy the Hue backup data to the hue pod on the new Virtual Warehouse cluster.
$ kubectl cp /tmp/data.json compute-1677087760-cj7q/huebackend-0:/tmp/data.json -c hue
Connect to Hue pod on new Hive/Impala Virtual Warehouse cluster.
$ kubectl exec -it huebackend-0 -n <new Virtual Warehouse ID> -c hue – /bin/bash
Load data to Hue onto the new Virtual Warehouse cluster.
$ ./build/env/bin/hue loaddata --exclude auth.permission --exclude contenttypes --ignorenonexistent /tmp/data.json
- Go to the Hue UI and verify the saved queries and query history are showing up in the new CDW environment. Also, verify that Hue is working.
If the Hue UI is not accessible, try the following workarounds.
- Kill the Hue frontend pod, providing the Hue frontend pod ID, for
$ kubectl get pods -n <new Virtual Warehouse ID> # the pod name $ kubectl delete pod <Hue frontend pod ID> -n <new Virtual Warehouse ID>
- Increase virtual warehouse memory if the database size is
## edit hue backend container to max 16GB (note: there are two containers: ## busybox and hue) $ kubectl edit sts huebackend -n <new Virtual Warehouse ID>
- Kill the Hue frontend pod, providing the Hue frontend pod ID, for example huefrontend-5bdc7bc7b8-8lpgj.
- After the data load is finished, revert the container memory limit back to the original 8192M setting.