Verifying and validating that your data is migrated
You can use the SyncTable command with the --dryrun parameter to verify whether the tables are in sync between your source and destination clusters. The --dryrun option makes this run of the SyncTable command read-only, so it reports differences without changing any data.
HashTable and SyncTable together form a tool implemented as two MapReduce jobs that must be executed as individual steps. It is similar to the CopyTable tool, which can copy either part of a table or an entire table. Unlike CopyTable, it copies only the data that diverges between the source and target clusters, saving both network and computing resources during the copy procedure.
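As a rough sketch of the two steps, the script below builds a HashTable command for the source cluster and a SyncTable command with --dryrun for the target cluster. The class names and option syntax follow the Apache HBase reference guide. The table name, HDFS path, and ZooKeeper quorum are placeholder values, and your environment may need additional options. The script only prints the commands so that you can review them before running each one.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions): replace with your own cluster details.
TABLE="tableA"
HASH_DIR="hdfs://nn:9000/hashes/${TABLE}"
SOURCE_ZK="zk1.example.com,zk2.example.com,zk3.example.com:2181:/hbase"

# Step 1: run on the source cluster to hash the table contents into HASH_DIR.
hash_cmd="hbase org.apache.hadoop.hbase.mapreduce.HashTable ${TABLE} ${HASH_DIR}"

# Step 2: run on the target cluster. With --dryrun=true the job compares the
# target table against the source hashes and reports differences only.
sync_cmd="hbase org.apache.hadoop.hbase.mapreduce.SyncTable --dryrun=true --sourcezkcluster=${SOURCE_ZK} ${HASH_DIR} ${TABLE} ${TABLE}"

# Print the commands for review instead of executing them.
echo "${hash_cmd}"
echo "${sync_cmd}"
```

The last two SyncTable arguments are the source and target table names, which are the same here on the assumption that the table keeps its name after migration.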
- The HashTable and SyncTable jobs are designed to operate on individual tables. If multiple tables need to be migrated, you must execute these jobs separately for each table.
- If the data in a table is modified, either through ingestion or deletion, on the source or destination, the job reports mismatches. To narrow the scope of data being checked, you can use the --starttime or --endtime options. For more information, see the HashTable reference guide section.
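The two points above can be combined in a small script: one pair of jobs per table, with the HashTable scan limited to a time window. This is a minimal sketch that assumes the option syntax from the Apache HBase reference guide, where --starttime and --endtime take epoch timestamps in milliseconds. The table names, HDFS path, ZooKeeper quorum, timestamps, and the helper functions build_hash_cmd and build_sync_cmd are all hypothetical. The script prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions): replace with your own details.
TABLES="tableA tableB"
HASH_ROOT="hdfs://nn:9000/hashes"
SOURCE_ZK="zk1.example.com,zk2.example.com,zk3.example.com:2181:/hbase"
# Time window as epoch milliseconds.
START_MS=1265875194289
END_MS=1265878794289

# Hypothetical helper: prints the HashTable command for one table,
# restricted to the time window above.
build_hash_cmd() {
  echo "hbase org.apache.hadoop.hbase.mapreduce.HashTable --starttime=${START_MS} --endtime=${END_MS} $1 ${HASH_ROOT}/$1"
}

# Hypothetical helper: prints the read-only SyncTable command for one table.
build_sync_cmd() {
  echo "hbase org.apache.hadoop.hbase.mapreduce.SyncTable --dryrun=true --sourcezkcluster=${SOURCE_ZK} ${HASH_ROOT}/$1 $1 $1"
}

# Each table gets its own HashTable and SyncTable run.
for table in ${TABLES}; do
  build_hash_cmd "${table}"
  build_sync_cmd "${table}"
done
```

Writing each table's hashes to its own directory under HASH_ROOT keeps the outputs of separate runs from colliding.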