Restore tables from backups
You can use the KuduRestore Spark job to restore one or more Kudu
tables. For each backed up table, the KuduRestore job restores the full backup
and each associated incremental backup until the full table state is restored.
Restoring the full series of full and incremental backups is possible because the backups are
linked via the from_ms and to_ms fields in the backup metadata.
By default, the restore job creates tables with the same names as the tables that were backed up.
If you want to side-load the tables without affecting the existing tables, you can pass the
--tableSuffix flag to append a suffix to each restored table name.
Following are the common flags that are used when restoring the tables:
--rootPath: The root path to the backup data. Accepts any Spark-compatible path. See Backup directory structure for the directory structure used in the rootPath.
--kuduMasterAddresses: Comma-separated addresses of Kudu masters. The default value is localhost.
--createTables: If set to true, the restore process creates the tables. Set it to false if the target tables already exist. The default value is true.
--tableSuffix: If set, adds a suffix to the restored table names. Only used when createTables is true.
--timestampMs: A UNIX timestamp in milliseconds that defines the latest time to use when selecting restore candidates. The default is System.currentTimeMillis().
<table>…: A list of tables to restore.
Following is an example of a KuduRestore job execution which restores the
tables foo and bar from the HDFS directory kudu-backups:

spark-submit --class org.apache.kudu.backup.KuduRestore kudu-backup2_2.11-1.12.0.jar \
  --kuduMasterAddresses master-1-host,master-2-host,master-3-host \
  --rootPath hdfs:///kudu-backups \
  foo bar
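To side-load the same tables without touching the originals, you could combine the flags described above with --tableSuffix. The following is a sketch, not a verified invocation: the suffix value "-restore", the jar version, and the master addresses are illustrative and should be adjusted to your deployment.

```shell
# Restore foo and bar as foo-restore and bar-restore, leaving the
# original tables untouched. Because --createTables defaults to true,
# the job creates the suffixed tables before loading data into them.
spark-submit --class org.apache.kudu.backup.KuduRestore kudu-backup2_2.11-1.12.0.jar \
  --kuduMasterAddresses master-1-host,master-2-host,master-3-host \
  --rootPath hdfs:///kudu-backups \
  --tableSuffix -restore \
  foo bar
```

After the job completes, the restored copies can be validated (for example, with row-count comparisons) before swapping them in for the originals.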