Overview

This page describes Storage Transfer Service.

Other Google Cloud transfer options include:

What is Storage Transfer Service?

Storage Transfer Service is a product that enables you to:

  • Move or back up data to a Cloud Storage bucket, either from other cloud storage providers or from your on-premises storage.

  • Move data from one Cloud Storage bucket to another, so that it is available to different groups of users or applications.

  • Periodically move data as part of a data processing pipeline or analytical workflow.

Storage Transfer Service provides options that make data transfers and synchronization easier. For example, you can:

  • Schedule one-time transfer operations or recurring transfer operations.

  • Delete existing objects in the destination bucket if they don't have a corresponding object in the source.

  • Delete data source objects after transferring them.

  • Schedule periodic synchronization from a data source to a data sink with advanced filters based on file creation dates, file names, and the times of day you prefer to import data.
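As a rough illustration of how these options might be expressed, the following Python sketch builds a request body shaped like the one used to create a transfer job through the REST API. The helper name build_transfer_job, the project ID, the bucket names, and the date are hypothetical. The field names follow the transferJobs resource as we understand it and should be checked against the API reference before use. The sketch only constructs and prints JSON; it sends nothing.

```python
import json


def build_transfer_job(project_id, source_bucket, sink_bucket, start_date,
                       one_time=False, delete_unique_in_sink=False,
                       delete_from_source=False, include_prefixes=None):
    """Return a dict shaped like a transfer job request body (a sketch)."""
    year, month, day = start_date
    date = {"year": year, "month": month, "day": day}

    schedule = {"scheduleStartDate": date}
    if one_time:
        # Assumption: an end date equal to the start date requests a single run.
        schedule["scheduleEndDate"] = dict(date)

    spec = {
        "gcsDataSource": {"bucketName": source_bucket},
        "gcsDataSink": {"bucketName": sink_bucket},
        "transferOptions": {
            # Delete sink objects that have no corresponding source object.
            "deleteObjectsUniqueInSink": delete_unique_in_sink,
            # Delete source objects after they are transferred.
            "deleteObjectsFromSourceAfterTransfer": delete_from_source,
        },
    }
    if include_prefixes:
        # Filter by object name prefix.
        spec["objectConditions"] = {"includePrefixes": list(include_prefixes)}

    return {
        "projectId": project_id,
        "status": "ENABLED",
        "schedule": schedule,
        "transferSpec": spec,
    }


# Hypothetical values, for illustration only.
job = build_transfer_job("my-project", "source-bucket", "sink-bucket",
                         (2030, 1, 15), one_time=True,
                         include_prefixes=["logs/"])
print(json.dumps(job, indent=2))
```

Both delete options default to False in this sketch, which mirrors the default behavior of retaining files.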

Storage Transfer Service does the following by default:

  • Copies a file from the data source if the file doesn't exist in the data sink or if the source and sink versions differ.

  • Retains files in the source after the transfer operation.

  • Uses TLS encryption for HTTPS connections. The only exception is if you specify an HTTP URL for a URL list transfer.
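The default copy rule above can be sketched in a few lines of Python. This is a simplified model, not the service's implementation: the function name plan_transfer is hypothetical, and each side is represented as a plain dictionary that maps a file name to a checksum string, which stands in for whatever metadata is used to detect a difference.

```python
def plan_transfer(source, sink):
    """Return the sorted names of source files that would be copied.

    A file is copied if it is missing from the sink or if its checksum
    differs between source and sink. Nothing is deleted on either side.
    """
    return sorted(
        name for name, checksum in source.items()
        if name not in sink or sink[name] != checksum
    )


# Hypothetical listings: name -> checksum.
source = {"a.txt": "111", "b.txt": "222", "c.txt": "333"}
sink = {"a.txt": "111", "b.txt": "999", "old.txt": "000"}

to_copy = plan_transfer(source, sink)
print(to_copy)  # ['b.txt', 'c.txt']
```

In this example a.txt is skipped because it is identical on both sides, and old.txt stays in the sink because deleting sink-only objects is an option rather than a default.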

Permissions and role requirements for Storage Transfer Service

Storage Transfer Service uses Cloud Identity and Access Management to control and manage access. For more information about Cloud IAM, see Cloud IAM Overview.

To use Storage Transfer Service, you must be granted at least one of the following Cloud IAM roles, depending on the types of duties that you are performing:

Access type                                       IAM role
Full access                                       Storage Transfer Admin
Submitting transfers                              Storage Transfer User
Viewing or listing transfer jobs and operations   Storage Transfer Viewer

The project you use to create a transfer job doesn't have to be associated with the buckets that act as a data source or data sink, but additional permissions are required to configure and use data sources and data sinks.

For more information about Storage Transfer Service roles and permissions, see Configuring access to data sources and sinks.

Available interfaces

There are a number of ways that you can work with Storage Transfer Service:

  • Use Google Cloud Console to create and manage transfer jobs. This is often the easiest and quickest way to start using Storage Transfer Service. For more information, see Creating and managing transfers with Console.

  • Use the REST API to work directly with the Storage Transfer Service API. See Creating a Storage Transfer Service client for more information about enabling the API and obtaining authentication tokens to use in your requests.

Data integrity

Storage Transfer Service uses metadata available from the source storage system, such as checksums and file sizes, to ensure that data written to Cloud Storage is the same data read from the source.

When checksum metadata is available

If the checksum metadata on the source storage system indicates that the data Storage Transfer Service received doesn't match the source data, Storage Transfer Service records a failure for the transfer operation. Storage systems that provide checksum metadata include Amazon Simple Storage Service (Amazon S3) and Microsoft Azure Blob Storage for most objects (with some exceptions), and HTTP transfers, where the user provides the checksum metadata.

When checksum metadata is unavailable, but agents can run near the source

If checksum metadata isn't available from the underlying source storage system but agents can be run locally near the source storage system, Storage Transfer Service attempts to read the source data and compute a checksum before sending the data to Cloud Storage. This occurs for Transfer service for on-premises data when moving data from file systems to Cloud Storage.
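To make the idea of computing a checksum while reading the source concrete, here is a minimal Python sketch that hashes a stream in fixed-size chunks, so the whole file never has to fit in memory. The function name stream_with_checksum is hypothetical, and SHA-256 is an assumption chosen because it is in the standard library; the source does not say which algorithm the agents use.

```python
import hashlib
import io


def stream_with_checksum(stream, chunk_size=1024 * 1024):
    """Read a binary stream in chunks; return (hex digest, bytes read)."""
    digest = hashlib.sha256()  # Assumption: algorithm chosen for illustration.
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        total += len(chunk)
        # A real agent would also send each chunk to the destination here.
    return digest.hexdigest(), total


# An in-memory stream stands in for a file on the source file system.
checksum, size = stream_with_checksum(io.BytesIO(b"hello world"), chunk_size=4)
print(checksum, size)
```

The digest computed this way can later be compared with one computed at the destination to confirm that the bytes written match the bytes read.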

When checksum metadata is unavailable, and agents can't run near the source

If checksum metadata isn't available from the underlying source storage system, and agents can't be run locally near the source storage system, Storage Transfer Service can't compute a checksum until the data arrives in Cloud Storage. In this scenario, Storage Transfer Service copies the data but can't perform end-to-end data integrity checks to confirm that the data received is the same as the source data. Instead, Storage Transfer Service takes a best-effort approach, using available metadata, such as file size, to validate that the file copied to Cloud Storage matches the source file.

For example, Storage Transfer Service uses file sizes to validate data for:

After transfer checks

After your transfer is complete, we recommend performing additional data integrity checks to validate:

  • That the correct version of each file was copied, for files that change at the source.
  • That the correct set and number of files were copied, to verify that you've set up the transfer jobs correctly.
  • That files were copied correctly, by verifying the metadata on the files, such as file checksums and file sizes.
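One way to approach these checks is to compare a listing of the source with a listing of the destination. The Python sketch below assumes you have already produced both listings as dictionaries that map a file name to a (size, checksum) pair, with None where no checksum is available. The function name compare_listings and the sample data are hypothetical.

```python
def compare_listings(source, sink):
    """Compare two listings of name -> (size, checksum or None).

    Returns a report of source files that are missing from the sink,
    differ in size, or differ in checksum when both sides have one.
    """
    report = {"missing": [], "size_mismatch": [], "checksum_mismatch": []}
    for name in sorted(source):
        if name not in sink:
            report["missing"].append(name)
            continue
        src_size, src_sum = source[name]
        dst_size, dst_sum = sink[name]
        if src_size != dst_size:
            report["size_mismatch"].append(name)
        elif src_sum is not None and dst_sum is not None and src_sum != dst_sum:
            report["checksum_mismatch"].append(name)
    return report


# Hypothetical listings, for illustration only.
source = {"a": (3, "x"), "b": (5, "y"), "c": (7, "z"), "d": (1, None)}
sink = {"a": (3, "x"), "b": (6, "y"), "c": (7, "q")}

report = compare_listings(source, sink)
print(report)
```

An empty report, together with matching file counts, suggests the job was configured as intended. For files that change at the source, take the source listing as close to the transfer time as possible.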

Should you use gsutil or Storage Transfer Service?

The gsutil command-line tool also enables you to transfer data between Cloud Storage and other locations. While you can use gsutil to work with Amazon S3 buckets and transfer data from Amazon S3 to Cloud Storage, Storage Transfer Service is recommended for this use case.

Follow these rules of thumb when deciding whether to use gsutil or Storage Transfer Service:

Transfer scenario                                  Recommendation
Transferring from another cloud storage provider   Use Storage Transfer Service
Transferring less than 1 TB from on-premises       Use gsutil
Transferring more than 1 TB from on-premises       Use Transfer service for on-premises data

Use this guidance as a starting point. The specific details of your transfer scenario will also help you determine which tool is more appropriate.
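The rules of thumb above can be written as a small decision function. This Python sketch is illustrative only: the function name recommend_tool and the source_kind labels are hypothetical, 1 TB is assumed to mean 10^12 bytes, and a transfer of exactly 1 TB, which the guidance doesn't cover, is assigned to the larger category.

```python
TB = 10 ** 12  # Assumption: decimal terabyte.


def recommend_tool(source_kind, size_bytes):
    """Return the suggested tool for a transfer, per the rules of thumb."""
    if source_kind == "cloud":
        return "Storage Transfer Service"
    if source_kind == "on-premises":
        if size_bytes < TB:
            return "gsutil"
        # Assumption: exactly 1 TB is treated like "more than 1 TB".
        return "Transfer service for on-premises data"
    raise ValueError(f"unknown source kind: {source_kind!r}")


print(recommend_tool("cloud", 5 * TB))
print(recommend_tool("on-premises", 200 * 10 ** 9))
print(recommend_tool("on-premises", 3 * TB))
```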

Service Level Agreement

Storage Transfer Service currently does not provide an SLA, and some performance fluctuations may occur. For example, we do not provide SLAs for transfer performance or latency.