Transfer for on-premises job details

This document describes more details about how Transfer service for on-premises data functions. Specifically, we cover incremental transfers and data consistency, what those mean, and how they work.

Starting incremental transfers

Transfer service for on-premises data begins all transfers by computing the data present at the source and destination to determine which source files are new, updated, or deleted since the last transfer. We do this to reduce the amount of data we send from your machines, to use bandwidth effectively, and to reduce transfer times.

To detect whether a file has changed, we use an algorithm similar to gsutil rsync: we check the last modified time and size of the source file, and compare that to the last modified time and size recorded when the file was last copied. When we detect a new or changed file, we copy the entire file to its destination. For more information about file freshness, see Data consistency details.

By default we detect, but do not act on, files deleted on the source. If you choose the sync option Delete destination files that aren't also in the source when creating or editing, your transfer will delete the corresponding object at the destination.

If you choose the sync option Delete destination files that aren't also in the source, files that are accidentally deleted at the source are also deleted at the destination. To prevent data loss from accidental deletions, we recommend enabling object versioning in your destination bucket if you choose to use this option. Then, if you delete a file accidentally, you can restore your objects in Cloud Storage to an older version.

Data consistency details

A successful transfer operation will transfer all source files which existed and were not modified during the operation's entire running time. Source files that were created, updated, or deleted during a transfer may or may not have those changes reflected in the destination data set.

If it's important to capture changes that occurred during a transfer, you can either re-run the transfer, or set the source file system as read-only while the operation is running.