This page shows you how to create and start transfer jobs.
To see if your source and destination (also known as a sink) are supported by Storage Transfer Service, refer to Supported sources and sinks.
Agents and agent pools
Depending on your source and destination, you may need to create and configure an agent pool and install agents on a machine with access to your source or destination.
Transfers from Amazon S3, Microsoft Azure, URL lists, or Cloud Storage to Cloud Storage do not require agents and agent pools.
Transfers whose source and/or destination is a file system, or from S3-compatible storage, do require agents and agent pools. See Manage agent pools for instructions.
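The two rules above can be written as a small decision helper. This is a minimal sketch: the kind labels (`cloud_storage`, `amazon_s3`, and so on) are hypothetical names chosen for illustration, not values used by Storage Transfer Service, and HDFS is left out because the rules above do not classify it.

```python
# Hypothetical labels for the source and destination kinds named above.
AGENTLESS_SOURCES = {"cloud_storage", "amazon_s3", "azure", "url_list"}
AGENT_SOURCES = {"file_system", "s3_compatible"}
DESTINATIONS = {"cloud_storage", "file_system"}


def requires_agents(source: str, destination: str) -> bool:
    """Return True if the transfer needs agents and an agent pool."""
    if source not in AGENTLESS_SOURCES | AGENT_SOURCES:
        raise ValueError(f"unknown source kind: {source!r}")
    if destination not in DESTINATIONS:
        raise ValueError(f"unknown destination kind: {destination!r}")
    # A file system on either side, or an S3-compatible source, needs agents.
    return source in AGENT_SOURCES or destination == "file_system"


if __name__ == "__main__":
    print(requires_agents("amazon_s3", "cloud_storage"))
    print(requires_agents("cloud_storage", "file_system"))
```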
Before you begin
Before configuring your transfers, make sure you have configured access:
- For users and service accounts:
- To your source data and to your destination.
If you're using gcloud commands, install the gcloud CLI.
Create a transfer
Don't include sensitive information such as personally identifiable information (PII) or security data in your transfer job name. Resource names may be propagated to the names of other Google Cloud resources and may be exposed to Google-internal systems outside of your project.
Google Cloud console
Go to the Storage Transfer Service page in the Google Cloud console.
Click Create transfer job. The Create a transfer job page is displayed.
Choose a source:
Cloud Storage
Your user account must have storage.buckets.get permission to select source and destination buckets. Alternatively, you can type the name of the bucket directly. For more information, see Troubleshooting access.
Under Source type, select Cloud Storage.
Select your Destination type.
If your destination is Cloud Storage, select your Scheduling mode. Batch transfers execute on a one-time or scheduled basis. Event-driven transfers continuously monitor the source and transfer data when it's added or modified.
To configure an event-driven transfer, follow the instructions at Event-driven transfers.
Click Next step.
Select a bucket and (optionally) a folder in that bucket by doing one of the following:
Enter an existing Cloud Storage bucket name and path in the Bucket or folder field without the prefix gs://. For example, my-test-bucket/path/to/files. To specify a Cloud Storage bucket from another project, type the name exactly into the Bucket name field.
Select from a list of existing buckets in your projects by clicking Browse, then selecting a bucket.
When you click Browse, you can select buckets in other projects by clicking the Project ID, then selecting the new Project ID and bucket.
To create a new bucket, click Create new bucket.
If this is an event-driven transfer, enter the Pub/Sub subscription name, which takes the following format:
projects/PROJECT_NAME/subscriptions/SUBSCRIPTION_ID
- Optionally, choose to filter objects by prefix or by last modified date. If you specified a folder as your source location, prefix filters are relative to that folder. For example, if your source is my-test-bucket/path/, an include filter of file includes all files starting with my-test-bucket/path/file.
Click Next step.
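To make the "relative to that folder" rule concrete, here is a minimal sketch of how an include prefix combines with a source folder. The function name and its string-based matching are assumptions for illustration only; the service applies the filter itself.

```python
def is_included(object_path: str, source: str, include_prefix: str) -> bool:
    """Check whether object_path matches include_prefix relative to source.

    source is a bucket with an optional folder, such as "my-test-bucket/path/".
    object_path is the full path of an object, starting with the bucket name.
    """
    # Normalize the source so it always ends with exactly one slash.
    folder = source.rstrip("/") + "/"
    # The filter is applied to the part of the name after the source folder.
    return object_path.startswith(folder + include_prefix)


if __name__ == "__main__":
    print(is_included("my-test-bucket/path/file1.txt", "my-test-bucket/path/", "file"))
```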
Amazon S3
S3-compatible storage
Microsoft Azure Blob Storage
Under Source type, select Azure Blob Storage or Data Lake Storage Gen2.
Click Next step.
Specify the following:
Storage account name — the source Microsoft Azure Storage account name.
The storage account name is displayed in the Microsoft Azure Storage portal under All services > Storage > Storage accounts.
Container name — the Microsoft Azure Storage container name.
The container name is displayed in the Microsoft Azure Storage portal under Storage explorer > Blob containers.
Shared access signature (SAS) — the Microsoft Azure Storage SAS token created from a stored access policy. For more information, see Grant limited access to Azure Storage resources using shared access signatures (SAS).
The default expiration time for SAS tokens is 8 hours. When you create your SAS token, be sure to set a reasonable expiration time that enables you to successfully complete your transfer.
- Optionally, choose to filter objects by prefix or by last modified date. If you specified a folder as your source location, prefix filters are relative to that folder. For example, if your source is my-test-bucket/path/, an include filter of file includes all files starting with my-test-bucket/path/file.
Click Next step.
File system
Under Source type, select POSIX file system.
Select your Destination type and click Next step.
Select an existing agent pool, or select Create agent pool and follow the instructions to create a new pool.
Specify the fully qualified path of the file system directory.
Click Next step.
HDFS
URL list
Under Source type, select URL list and click Next step.
Under URL of TSV file, provide the URL to a tab-separated values (TSV) file. See Creating a URL List for details about how to create the TSV file.
- Optionally, choose to filter objects by prefix or by last modified date. If you specified a folder as your source location, prefix filters are relative to that folder. For example, if your source is my-test-bucket/path/, an include filter of file includes all files starting with my-test-bucket/path/file.
Click Next step.
Choose a destination:
Cloud Storage
In the Bucket or folder field, enter the destination bucket and (optionally) folder name, or click Browse to select a bucket from a list of existing buckets in your current project. To create a new bucket, click Create new bucket.
Click Next step.
Choose settings for the transfer job. Some options are only available for certain source/sink combinations.
In the Description field, enter a description of the transfer. As a best practice, enter a description that is meaningful and unique so that you can tell jobs apart.
Under Metadata options, choose to use the default options, or click View and select options to specify values for all supported metadata. See Metadata preservation for details.
Under When to overwrite, select one of the following:
If different: Overwrites destination files if the source file with the same name has different Etags or checksum values.
Always: Always overwrites destination files when the source file has the same name, even if they're identical.
Under When to delete, select one of the following:
Never: Never delete files from either the source or destination.
Delete files from source after they're transferred: Delete files from the source after they're transferred to the destination.
Delete files from destination if they're not also at source: If files in the destination Cloud Storage bucket aren't also in the source, then delete the files from the Cloud Storage bucket.
This option ensures that the destination Cloud Storage bucket exactly matches your source.
Under Notification options, select your Pub/Sub topic and which events to notify for. See Pub/Sub notifications for more details.
Click Next step.
File system
Select an existing agent pool, or select Create agent pool and follow the instructions to create a new pool.
Specify the fully qualified destination directory path.
Click Next step.
Choose your scheduling options:
From the Run once drop-down list, select one of the following:
Run once: Runs a single transfer, starting at a time that you select.
Run every day: Runs a transfer daily, starting at a time that you select.
You can enter an optional End date, or leave End date blank to run the transfer continually.
Run every week: Runs a transfer weekly, starting at a time that you select.
Run with custom frequency: Runs a transfer at a frequency that you select. You can choose to repeat the transfer at a regular interval of Hours, Days, or Weeks.
You can enter an optional End date, or leave End date blank to run the transfer continually.
From the Starting now drop-down list, select one of the following:
Starting now: Starts the transfer after you click Create.
Starting on: Starts the transfer on the date and time that you select. Click Calendar to display a calendar to select the start date.
To create your transfer job, click Create.
gcloud CLI
To create a new transfer job, use the gcloud transfer jobs create command. Creating a new job initiates the specified transfer, unless a schedule or --do-not-run is specified.
gcloud transfer jobs create \
SOURCE DESTINATION
Where:
SOURCE is the data source for this transfer. The format for each source is:
- Cloud Storage: gs://BUCKET_NAME. To transfer from a specific folder, specify gs://BUCKET_NAME/FOLDER_PATH/, including the trailing slash.
- Amazon S3: s3://BUCKET_NAME/FOLDER_PATH
- S3-compatible storage: s3://BUCKET_NAME. The bucket name is relative to the endpoint. For example, if your data resides at https://us-east-1.example.com/folder1/bucket_a, enter s3://folder1/bucket_a.
- Microsoft Azure Storage: https://myaccount.blob.core.windows.net/CONTAINER_NAME
- URL list: https://PATH_TO_URL_LIST or http://PATH_TO_URL_LIST
- POSIX file system: posix:///PATH. This must be an absolute path from the root of the agent host machine.
- HDFS: hdfs:///PATH
DESTINATION is one of:
- Cloud Storage: gs://BUCKET_NAME. To transfer into a specific directory, specify gs://BUCKET_NAME/FOLDER_PATH/, including the trailing slash.
- POSIX file system: posix:///PATH. This must be an absolute path from the root of the agent host machine.
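The SOURCE and DESTINATION formats can be assembled with a few string helpers. The following is a sketch only: the function names are hypothetical, and the helpers encode just the rules stated above (trailing slash for Cloud Storage folders, endpoint-relative names for S3-compatible storage, absolute paths for POSIX).

```python
from urllib.parse import urlparse


def gcs_uri(bucket: str, folder: str = "") -> str:
    """Build gs://BUCKET_NAME or gs://BUCKET_NAME/FOLDER_PATH/ (trailing slash)."""
    folder = folder.strip("/")
    return f"gs://{bucket}/{folder}/" if folder else f"gs://{bucket}"


def s3_compatible_uri(data_url: str) -> str:
    """Turn a full URL into the s3:// form, relative to the endpoint host."""
    parsed = urlparse(data_url)
    path = parsed.path.strip("/")
    if parsed.scheme not in ("http", "https") or not parsed.netloc or not path:
        raise ValueError(f"expected an http(s) URL with a path: {data_url!r}")
    # Everything after the endpoint host becomes the bucket name.
    return "s3://" + path


def posix_uri(path: str) -> str:
    """Build posix:///PATH; the path must be absolute on the agent host."""
    if not path.startswith("/"):
        raise ValueError(f"path must be absolute: {path!r}")
    return "posix://" + path


if __name__ == "__main__":
    print(s3_compatible_uri("https://us-east-1.example.com/folder1/bucket_a"))
    print(posix_uri("/tmp/destination"))
```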
If the transfer requires transfer agents, the following options are available:
- --source-agent-pool specifies the source agent pool to use for this transfer. Required for transfers originating from a file system.
- --destination-agent-pool specifies the destination agent pool to use for this transfer. Required for transfers to a file system.
- --intermediate-storage-path is the path to a Cloud Storage bucket, in the form gs://my-intermediary-bucket. Required for transfers between two file systems. See Create a Cloud Storage bucket as an intermediary for details on creating the intermediate bucket.
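The "required for" rules for the three agent options can be checked before the command is run. This sketch builds the argument list for gcloud transfer jobs create without executing it; the function name and the choice to detect file systems by the posix:// prefix are assumptions made for illustration.

```python
import shlex
from typing import List, Optional


def build_create_command(
    source: str,
    destination: str,
    source_agent_pool: Optional[str] = None,
    destination_agent_pool: Optional[str] = None,
    intermediate_storage_path: Optional[str] = None,
) -> List[str]:
    """Return the argv for `gcloud transfer jobs create`, enforcing pool rules."""
    source_is_fs = source.startswith("posix://")
    destination_is_fs = destination.startswith("posix://")
    if source_is_fs and not source_agent_pool:
        raise ValueError("--source-agent-pool is required for a file system source")
    if destination_is_fs and not destination_agent_pool:
        raise ValueError("--destination-agent-pool is required for a file system destination")
    if source_is_fs and destination_is_fs and not intermediate_storage_path:
        raise ValueError("--intermediate-storage-path is required between two file systems")

    argv = ["gcloud", "transfer", "jobs", "create", source, destination]
    if source_agent_pool:
        argv.append(f"--source-agent-pool={source_agent_pool}")
    if destination_agent_pool:
        argv.append(f"--destination-agent-pool={destination_agent_pool}")
    if intermediate_storage_path:
        argv.append(f"--intermediate-storage-path={intermediate_storage_path}")
    return argv


if __name__ == "__main__":
    argv = build_create_command(
        "gs://my-storage-bucket",
        "posix:///tmp/destination",
        destination_agent_pool="my-destination-agent-pool",
    )
    # Print the command instead of running it.
    print(shlex.join(argv))
```

Passing the returned list to subprocess.run would execute the command; the sketch only prints it so that it has no side effects.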
Additional options include:
- --source-creds-file specifies the relative path to a local file on your machine that includes AWS or Azure credentials for the transfer source. For credential file formatting information, see the TransferSpec reference.
- --do-not-run prevents Storage Transfer Service from running the job upon submission of the command. To run the job, update it to add a schedule, or use jobs run to start it manually.
- --manifest-file specifies the path to a CSV file in Cloud Storage containing a list of files to transfer from your source. For manifest file formatting, see Transfer specific files or objects using a manifest.
- Job information: You can specify --name, --description, and --source-creds-file.
- Schedule: Specify --schedule-starts, --schedule-repeats-every, and --schedule-repeats-until, or --do-not-run.
- Object conditions: Use conditions to determine which objects are transferred. These include --include-prefixes and --exclude-prefixes, and the time-based conditions in --include-modified-[before | after]-[absolute | relative]. If you specified a folder with your source, prefix filters are relative to that folder. See Filter source objects by prefix for more information.
  Object conditions aren't supported for transfers involving file systems.
- Transfer options: Specify whether to overwrite destination files (--overwrite-when=different or always) and whether to delete certain files during or after the transfer (--delete-from=destination-if-unique or source-after-transfer); specify which metadata values to preserve (--preserve-metadata); and optionally set a storage class on transferred objects (--custom-storage-class).
- Notifications: Configure Pub/Sub notifications for transfers with --notification-pubsub-topic, --notification-event-types, and --notification-payload-format.
- Cloud Logging: Enable Cloud Logging for agentless transfers, or transfers from S3-compatible sources, with --log-actions and --log-action-states. See Cloud Logging for Storage Transfer Service for details.
Transfers from S3-compatible sources also use the following options:
- --source-endpoint (required) specifies your storage system's endpoint. For example, s3.example.com. Check with your provider for the correct formatting. Do not specify the protocol (http:// or https://).
- --source-signing-region specifies a region for signing requests. Omit this flag if your storage provider doesn't require a signing region.
- --source-auth-method specifies the authentication method to use. Valid values are AWS_SIGNATURE_V2 or AWS_SIGNATURE_V4. Refer to Amazon's SigV4 and SigV2 documentation for more information.
- --source-request-model specifies the addressing style to use. Valid values are PATH_STYLE or VIRTUAL_HOSTED_STYLE. Path style uses the format https://s3.example.com/BUCKET_NAME/KEY_NAME. Virtual hosted style uses the format https://BUCKET_NAME.s3.example.com/KEY_NAME.
- --source-network-protocol specifies the network protocol that agents should use for this job. Valid values are HTTP or HTTPS.
- --source-list-api specifies the version of the S3 listing API for returning objects from the bucket. Valid values are LIST_OBJECTS or LIST_OBJECTS_V2. Refer to Amazon's ListObjectsV2 and ListObjects documentation for more information.
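The S3-compatible options and their valid values can be validated up front. This is a hedged sketch: the helper names are hypothetical, the flag names and values are the ones listed above, and object_url only illustrates the two addressing formats described for --source-request-model.

```python
from typing import List, Optional

VALID_VALUES = {
    "--source-auth-method": {"AWS_SIGNATURE_V2", "AWS_SIGNATURE_V4"},
    "--source-request-model": {"PATH_STYLE", "VIRTUAL_HOSTED_STYLE"},
    "--source-network-protocol": {"HTTP", "HTTPS"},
    "--source-list-api": {"LIST_OBJECTS", "LIST_OBJECTS_V2"},
}


def s3_compatible_flags(
    endpoint: str,
    signing_region: Optional[str] = None,
    auth_method: Optional[str] = None,
    request_model: Optional[str] = None,
    network_protocol: Optional[str] = None,
    list_api: Optional[str] = None,
) -> List[str]:
    """Build the S3-compatible flags, rejecting values not listed above."""
    if "://" in endpoint:
        raise ValueError("do not include http:// or https:// in the endpoint")
    flags = [f"--source-endpoint={endpoint}"]
    if signing_region:
        flags.append(f"--source-signing-region={signing_region}")
    chosen = {
        "--source-auth-method": auth_method,
        "--source-request-model": request_model,
        "--source-network-protocol": network_protocol,
        "--source-list-api": list_api,
    }
    for flag, value in chosen.items():
        if value is None:
            continue  # Omitted flags are left to the CLI.
        if value not in VALID_VALUES[flag]:
            raise ValueError(f"{value!r} is not a valid value for {flag}")
        flags.append(f"{flag}={value}")
    return flags


def object_url(endpoint: str, bucket: str, key: str, request_model: str = "PATH_STYLE") -> str:
    """Show where an object lives under each addressing style (HTTPS assumed)."""
    if request_model == "PATH_STYLE":
        return f"https://{endpoint}/{bucket}/{key}"
    if request_model == "VIRTUAL_HOSTED_STYLE":
        return f"https://{bucket}.{endpoint}/{key}"
    raise ValueError(f"unknown request model: {request_model!r}")


if __name__ == "__main__":
    print(s3_compatible_flags("s3.example.com", auth_method="AWS_SIGNATURE_V4"))
```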
To view all options, run gcloud transfer jobs create --help or refer to the gcloud reference documentation.
Examples
Amazon S3 to Cloud Storage
See Transfer from Amazon S3 to Cloud Storage.
S3-compatible storage to Cloud Storage
See Transfer from S3-compatible storage to Cloud Storage.
File system to Cloud Storage
See Transfer from a file system to Cloud Storage.
Cloud Storage to file system
To transfer from a Cloud Storage bucket to a file system, specify the following.
gcloud transfer jobs create \
gs://my-storage-bucket posix:///tmp/destination \
--destination-agent-pool=my-destination-agent-pool
File system to file system
To transfer between two file systems, you must specify a source agent pool, a destination agent pool, and an intermediate Cloud Storage bucket through which the data passes.
See Create a Cloud Storage bucket as an intermediary for details on the intermediate bucket.
Then, specify these three resources when calling transfer jobs create:
gcloud transfer jobs create \
posix:///tmp/source/on/systemA posix:///tmp/destination/on/systemB \
--source-agent-pool=source_agent_pool \
--destination-agent-pool=destination_agent_pool \
--intermediate-storage-path=gs://my-intermediary-bucket
REST
The following samples show you how to use Storage Transfer Service through the REST API.
When you configure or edit transfer jobs using the Storage Transfer Service API, the time must be in UTC. For more information on specifying the schedule of a transfer job, see Schedule.
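Because schedule times must be in UTC, a local start time has to be converted before it goes into the request. The sketch below does that conversion with the standard library and returns the scheduleStartDate and startTimeOfDay fields used in the requests on this page; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def schedule_from_local(start: datetime) -> dict:
    """Convert a timezone-aware start time to UTC schedule fields."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    utc = start.astimezone(timezone.utc)
    return {
        "scheduleStartDate": {"day": utc.day, "month": utc.month, "year": utc.year},
        "startTimeOfDay": {"hours": utc.hour, "minutes": utc.minute},
    }


if __name__ == "__main__":
    # 20:00 on 14 February 2020 at UTC-8 falls on 15 February in UTC.
    local = datetime(2020, 2, 14, 20, 0, tzinfo=timezone(timedelta(hours=-8)))
    print(schedule_from_local(local))
```

Note that the conversion can change the calendar date as well as the hour, so the date fields should be taken from the converted value rather than the local one.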
Transfer between Cloud Storage buckets
In this example, you'll learn how to move files from one Cloud Storage bucket to another. For example, you can move data to a bucket in another location.
Request using transferJobs create:
POST https://storagetransfer.googleapis.com/v1/transferJobs
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "schedule": {
    "scheduleStartDate": { "day": 1, "month": 1, "year": 2015 },
    "startTimeOfDay": { "hours": 1, "minutes": 1 }
  },
  "transferSpec": {
    "gcsDataSource": { "bucketName": "GCS_SOURCE_NAME" },
    "gcsDataSink": { "bucketName": "GCS_SINK_NAME" },
    "transferOptions": { "deleteObjectsFromSourceAfterTransfer": true }
  }
}
200 OK
{
  "transferJob": [
    {
      "creationTime": "2015-01-01T01:01:00.000000000Z",
      "description": "YOUR DESCRIPTION",
      "name": "transferJobs/JOB_ID",
      "status": "ENABLED",
      "lastModificationTime": "2015-01-01T01:01:00.000000000Z",
      "projectId": "PROJECT_ID",
      "schedule": {
        "scheduleStartDate": { "day": 1, "month": 1, "year": 2015 },
        "startTimeOfDay": { "hours": 1, "minutes": 1 }
      },
      "transferSpec": {
        "gcsDataSource": { "bucketName": "GCS_SOURCE_NAME" },
        "gcsDataSink": { "bucketName": "GCS_NEARLINE_SINK_NAME" },
        "objectConditions": { "minTimeElapsedSinceLastModification": "2592000.000s" },
        "transferOptions": { "deleteObjectsFromSourceAfterTransfer": true }
      }
    }
  ]
}
Transfer from Amazon S3 to Cloud Storage
See Transfer from Amazon S3 to Cloud Storage.
Transfer between Microsoft Azure Blob Storage and Cloud Storage
In this example, you'll learn how to move files from Microsoft Azure Storage to a Cloud Storage bucket, using a Microsoft Azure Storage shared access signature (SAS) token.
For more information on Microsoft Azure Storage SAS, see Grant limited access to Azure Storage resources using shared access signatures (SAS).
Before starting, review Configure access to Microsoft Azure Storage and Pricing to understand the implications of moving data from Microsoft Azure Storage to Cloud Storage.
Request using transferJobs create:
POST https://storagetransfer.googleapis.com/v1/transferJobs
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "schedule": {
    "scheduleStartDate": { "day": 14, "month": 2, "year": 2020 },
    "scheduleEndDate": { "day": 14, "month": 2, "year": 2020 },
    "startTimeOfDay": { "hours": 1, "minutes": 1 }
  },
  "transferSpec": {
    "azureBlobStorageDataSource": {
      "storageAccount": "AZURE_SOURCE_NAME",
      "azureCredentials": { "sasToken": "AZURE_SAS_TOKEN" },
      "container": "AZURE_CONTAINER"
    },
    "gcsDataSink": { "bucketName": "GCS_SINK_NAME" }
  }
}
200 OK
{
  "transferJob": [
    {
      "creationTime": "2020-02-14T01:01:00.000000000Z",
      "description": "YOUR DESCRIPTION",
      "name": "transferJobs/JOB_ID",
      "status": "ENABLED",
      "lastModificationTime": "2020-02-14T01:01:00.000000000Z",
      "projectId": "PROJECT_ID",
      "schedule": {
        "scheduleStartDate": { "day": 14, "month": 2, "year": 2020 },
        "scheduleEndDate": { "day": 14, "month": 2, "year": 2020 },
        "startTimeOfDay": { "hours": 1, "minutes": 1 }
      },
      "transferSpec": {
        "azureBlobStorageDataSource": {
          "storageAccount": "AZURE_SOURCE_NAME",
          "azureCredentials": { "sasToken": "AZURE_SAS_TOKEN" },
          "container": "AZURE_CONTAINER"
        },
        "objectConditions": {},
        "transferOptions": {}
      }
    }
  ]
}
Transfer from a file system
See Transfer from a file system to Cloud Storage.
Specifying source and destination paths
Source and destination paths enable you to specify source and destination directories when transferring data to your Cloud Storage bucket. For example, consider that you have files file1.txt and file2.txt and a Cloud Storage bucket named B. If you set a destination path named my-stuff, then after the transfer completes your files are located at gs://B/my-stuff/file1.txt and gs://B/my-stuff/file2.txt.
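The effect of a destination path can be sketched as a simple name mapping. The function below is illustrative only (its name and signature are assumptions); it reproduces the file1.txt and file2.txt example above, where the destination path is placed between the bucket and the file name.

```python
def destination_uri(bucket: str, destination_path: str, file_name: str) -> str:
    """Return the gs:// location of file_name after a transfer into bucket."""
    prefix = destination_path.strip("/")
    if prefix:
        return f"gs://{bucket}/{prefix}/{file_name}"
    # With no destination path, files land at the top level of the bucket.
    return f"gs://{bucket}/{file_name}"


if __name__ == "__main__":
    for name in ("file1.txt", "file2.txt"):
        print(destination_uri("B", "my-stuff", name))
```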
Specifying a source path
To specify a source path when creating a transfer job, add a path field to the gcsDataSource field in your TransferSpec specification:
{
  "gcsDataSource": {
    "bucketName": "SOURCE_BUCKET",
    "path": "SOURCE_PATH/"
  }
}
In this example:
- SOURCE_BUCKET: The source Cloud Storage bucket.
- SOURCE_PATH: The source Cloud Storage path.
Specifying a destination path
To specify a destination folder when you create a transfer job, add a path field to the gcsDataSink field in your TransferSpec specification:
{
  "gcsDataSink": {
    "bucketName": "DESTINATION_BUCKET",
    "path": "DESTINATION_PATH/"
  }
}
In this example:
- DESTINATION_BUCKET: The destination Cloud Storage bucket.
- DESTINATION_PATH: The destination Cloud Storage path.
Complete example request
The following is an example of a full request:
POST https://storagetransfer.googleapis.com/v1/transferJobs
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "schedule": {
    "scheduleStartDate": { "day": 1, "month": 1, "year": 2015 },
    "startTimeOfDay": { "hours": 1, "minutes": 1 }
  },
  "transferSpec": {
    "gcsDataSource": {
      "bucketName": "GCS_SOURCE_NAME",
      "path": "GCS_SOURCE_PATH"
    },
    "gcsDataSink": {
      "bucketName": "GCS_SINK_NAME",
      "path": "GCS_SINK_PATH"
    },
    "objectConditions": { "minTimeElapsedSinceLastModification": "2592000s" },
    "transferOptions": { "deleteObjectsFromSourceAfterTransfer": true }
  }
}
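The minTimeElapsedSinceLastModification value in the request is a duration written as a number of seconds followed by s; 2592000s is 30 days. A minimal sketch for producing such a string from a timedelta follows. The function name is hypothetical, and it emits whole seconds only.

```python
from datetime import timedelta


def duration_string(delta: timedelta) -> str:
    """Format a non-negative timedelta as whole seconds with an 's' suffix."""
    seconds = int(delta.total_seconds())
    if seconds < 0:
        raise ValueError("duration must not be negative")
    return f"{seconds}s"


if __name__ == "__main__":
    print(duration_string(timedelta(days=30)))
```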
Client libraries
The following samples show you how to use Storage Transfer Service programmatically with Go, Java, Node.js, and Python.
When you configure or edit transfer jobs programmatically, the time must be in UTC. For more information on specifying the schedule of a transfer job, see Schedule.
For more information about the Storage Transfer Service client libraries, see Getting started with Storage Transfer Service client libraries.
Transfer between Cloud Storage buckets
In this example, you'll learn how to move files from one Cloud Storage bucket to another. For example, you can move data to a bucket in another location.
Go
Java
Looking for older samples? See the Storage Transfer Service Migration Guide.
Node.js
Python
Looking for older samples? See the Storage Transfer Service Migration Guide.
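As a rough illustration of the Cloud Storage to Cloud Storage case in Python, the sketch below builds the job as a plain dictionary and then submits it with the google-cloud-storage-transfer client library. It is a sketch under stated assumptions, not the official sample: the helper names build_transfer_job and create_transfer_job are hypothetical, no schedule is set, and the library import is deferred so that the dictionary builder can be used without the library installed.

```python
def build_transfer_job(project_id: str, description: str,
                       source_bucket: str, sink_bucket: str) -> dict:
    """Build the transfer job fields for a bucket-to-bucket transfer."""
    return {
        "project_id": project_id,
        "description": description,
        "transfer_spec": {
            "gcs_data_source": {"bucket_name": source_bucket},
            "gcs_data_sink": {"bucket_name": sink_bucket},
        },
    }


def create_transfer_job(project_id: str, description: str,
                        source_bucket: str, sink_bucket: str):
    """Submit the job; requires the google-cloud-storage-transfer package
    and Application Default Credentials."""
    # Imported here so build_transfer_job works without the library.
    from google.cloud import storage_transfer

    job = build_transfer_job(project_id, description, source_bucket, sink_bucket)
    job["status"] = storage_transfer.TransferJob.Status.ENABLED

    client = storage_transfer.StorageTransferServiceClient()
    request = storage_transfer.CreateTransferJobRequest({"transfer_job": job})
    return client.create_transfer_job(request)
```

Calling create_transfer_job("PROJECT_ID", "YOUR DESCRIPTION", "GCS_SOURCE_NAME", "GCS_SINK_NAME") with real values would send the request; the placeholders are the same ones used in the REST example.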
Transfer from Amazon S3 to Cloud Storage
See Transfer from Amazon S3 to Cloud Storage.
Transfer between Microsoft Azure Blob Storage and Cloud Storage
In this example, you'll learn how to move files from Microsoft Azure Storage to a Cloud Storage bucket, using a Microsoft Azure Storage shared access signature (SAS) token.
For more information on Microsoft Azure Storage SAS, see Grant limited access to Azure Storage resources using shared access signatures (SAS).
Before starting, review Configure access to Microsoft Azure Storage and Pricing to understand the implications of moving data from Microsoft Azure Storage to Cloud Storage.
Go
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Go API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Java API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Node.js API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Python API reference documentation.
To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.