Restoring tables from backups
You can use the KuduRestore Spark job to restore one or more Kudu tables. For each backed-up table, the KuduRestore job restores the full backup and each associated incremental backup until the full table state is restored. Restoring the full series of full and incremental backups is possible because the backups are linked via the from_ms and to_ms fields in the backup metadata.
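For intuition, the chain for a single table might look like the following (the timestamps are illustrative, not actual metadata contents):

full backup:    from_ms = 0              to_ms = 1617000000000
incremental 1:  from_ms = 1617000000000  to_ms = 1617086400000
incremental 2:  from_ms = 1617086400000  to_ms = 1617172800000

The restore job replays this chain in order, so each backup's from_ms matches the to_ms of the backup before it.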
By default, the restore job creates tables with the same names as the tables that were backed up. If you want to side-load the tables without affecting the existing tables, you can pass the --tableSuffix flag to append a suffix to each restored table name, as sketched below.
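For example, the following sketch side-loads a restored copy of a table under a new name. The jar name and paths match the full example at the end of this section; the master address and the _restored suffix are placeholders:

spark-submit --class org.apache.kudu.backup.KuduRestore kudu-backup2_2.11-1.12.0.jar \
  --kuduMasterAddresses master-1-host \
  --rootPath hdfs:///kudu-backups \
  --tableSuffix _restored \
  foo

This restores the backup history of foo into a new table named foo_restored, leaving the existing foo untouched.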
Following are the common flags used when restoring tables:

--rootPath: The root path to the backup data. Accepts any Spark-compatible path. See Backup directory structure for the directory structure used in the rootPath.
--kuduMasterAddresses: Comma-separated addresses of Kudu masters. The default value is localhost.
--createTables: If set to true, the restore process creates the tables. Set it to false if the target tables already exist. The default value is true.
--tableSuffix: If set, adds a suffix to the restored table names. Only used when createTables is true.
--timestampMs: A UNIX timestamp in milliseconds that defines the latest time to use when selecting restore candidates. The default is System.currentTimeMillis().
<table>…: A list of tables to restore.
Following is an example of a KuduRestore job execution that restores the tables foo and bar from the HDFS directory kudu-backups:

spark-submit --class org.apache.kudu.backup.KuduRestore kudu-backup2_2.11-1.12.0.jar \
  --kuduMasterAddresses master-1-host,master-2-host,master-3-host \
  --rootPath hdfs:///kudu-backups \
  foo bar
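If you need the tables as they existed at an earlier point in time rather than at their latest backed-up state, you can cap the restore with --timestampMs. The following sketch (the timestamp value is illustrative) restores foo and bar using only the backup candidates taken at or before the given time:

spark-submit --class org.apache.kudu.backup.KuduRestore kudu-backup2_2.11-1.12.0.jar \
  --kuduMasterAddresses master-1-host,master-2-host,master-3-host \
  --rootPath hdfs:///kudu-backups \
  --timestampMs 1617200000000 \
  foo bar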