You can manually restore the Hue database instance that you backed up. Use this procedure to restore Hue manually when your Hue backup is 6 GB or larger.
During the manual backup of Hue, you followed steps to dump the entire Hue database to a file. In the following procedure, you move that Hue backup file
from the dump to the new CDW environment.
-
Connect to the Hue pod on the new Hive/Impala Virtual Warehouse cluster.
$ kubectl exec -it huebackend-0 -n <new Virtual Warehouse ID> -c hue -- /bin/bash
-
Clean the Hue database by running the flush command from the Hue pod.
./build/env/bin/hue flush
-
Split the JSON dump into smaller chunks.
HUE_BACKUP_ORIG_FILE=data.json # Change to the correct path
HUE_BACKUP_CHUNKS_DIR=hue_backup_parts # Change if needed
mkdir -p ${HUE_BACKUP_CHUNKS_DIR}
rm -rf ${HUE_BACKUP_CHUNKS_DIR}/part* || true
jq -cn --stream 'fromstream(1|truncate_stream(inputs))' ${HUE_BACKUP_ORIG_FILE} | split -l 5000 -a 4 -d - ${HUE_BACKUP_CHUNKS_DIR}/part
find ${HUE_BACKUP_CHUNKS_DIR}/part* -maxdepth 1 -type f ! -name "*.*" -exec sh -c 'jq --slurp "." "${0}" | gzip > "${0}.json.gz"' {} \;
ls -alh ${HUE_BACKUP_CHUNKS_DIR}
tar cvzf ${HUE_BACKUP_ORIG_FILE}.tar.gz ${HUE_BACKUP_CHUNKS_DIR}/part*.json.gz
echo "Generated the chunked backup file"
ls -alh ${HUE_BACKUP_ORIG_FILE}.tar.gz # This is the final output file
-
Import the chunked JSON:
Move the tarball of backup chunks to the cluster pod.
Extract the tarball to a directory, for example /tmp/hue_backup_parts.
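The move-and-extract sub-steps can be sketched as follows. The pod name and namespace are the ones from the connect step; the /tmp paths are illustrative, and the dummy chunk created here stands in for the real backup tarball so the sketch runs end to end:

```shell
# From your workstation, copy the chunked backup tarball into the Hue pod, e.g.:
#   kubectl cp data.json.tar.gz <new Virtual Warehouse ID>/huebackend-0:/tmp/ -c hue

TARBALL=/tmp/data.json.tar.gz       # path the tarball was copied to inside the pod
EXTRACT_DIR=/tmp/hue_backup_parts   # directory the loaddata step reads from

# (Stand-in for the real backup: one dummy chunk, so this sketch is self-contained.)
mkdir -p /tmp/demo_parts
echo '[]' | gzip > /tmp/demo_parts/part0000.json.gz
tar czf "${TARBALL}" -C /tmp demo_parts

# Inside the pod: extract the part*.json.gz chunks flat into the extraction directory.
mkdir -p "${EXTRACT_DIR}"
tar xzf "${TARBALL}" -C "${EXTRACT_DIR}" --strip-components=1
ls -alh "${EXTRACT_DIR}"
```

With the real tarball, skip the stand-in section and point TARBALL at the file produced by the split step.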
-
Run the hue loaddata command on the pod.
/opt/hive/build/env/bin/hue loaddata --verbosity 3 --exclude auth.permission --exclude contenttypes --ignorenonexistent $(find /tmp/hue_backup_parts -type f -name '*.json.gz') # UPDATE THE PATH TO THE EXTRACTION DIRECTORY FROM THE PREVIOUS STEP