Storage Transfer Service supports the transfer of specific files or objects, which are specified using a manifest. A manifest is a CSV file, uploaded to Cloud Storage, that contains a list of files or objects for Storage Transfer Service to act upon.
A manifest can be used for the following transfers:
- From AWS S3, Azure Blobstore, or Cloud Storage to a Cloud Storage bucket.
- From a file system to a Cloud Storage bucket.
- From S3-compatible storage to a Cloud Storage bucket.
- From a Cloud Storage bucket to a file system.
- Between two file systems.
- From a publicly-accessible HTTP/HTTPS source to a Cloud Storage bucket. Follow the instructions in Create a URL list, because URL lists use a different manifest format.
Create a manifest
Manifests must be formatted as CSV and can contain any UTF-8 characters. The first column must be a file name or object name specified as a string.
Manifest files do not support wildcards. The value must be a specific file or object name. Folder names without a file or object name are not supported.
The maximum manifest file size is 1 GiB, which translates to approximately 1 million rows. If your manifest would be larger than 1 GiB, split it into multiple files and run a separate transfer job for each file.
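As an illustration of splitting a large manifest, the following Python sketch divides manifest rows into several smaller manifests. It assumes that a row count is an acceptable stand-in for the file size limit; the function name `split_manifest_rows` and the chunk size are hypothetical and not part of Storage Transfer Service.

```python
import csv
import io


def split_manifest_rows(rows, max_rows):
    """Split manifest rows into chunks of at most max_rows rows each.

    rows is a list of lists (one inner list per CSV row). Returns a list of
    CSV-formatted strings, one per chunk, each of which can be saved as its
    own manifest file. The row limit is an assumption chosen by the caller.
    """
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    chunks = []
    for start in range(0, len(rows), max_rows):
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerows(rows[start:start + max_rows])
        chunks.append(buffer.getvalue())
    return chunks


# Hypothetical example: five object names split into manifests of two rows.
example_rows = [[f"object{i}.pdf"] for i in range(1, 6)]
example_chunks = split_manifest_rows(example_rows, max_rows=2)
```

Each returned string would then be written to its own .csv file and used by a separate transfer job.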
If a file or object name contains commas, the name must be enclosed in double-quotes, according to CSV standards. For example, "object1,a.txt".
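One way to get the quoting right is to let a CSV library write the manifest. The following is a minimal sketch using Python's standard `csv` module; the helper name `manifest_text` and the second object name are hypothetical.

```python
import csv
import io


def manifest_text(names):
    """Return manifest CSV text with one file or object name per row."""
    buffer = io.StringIO()
    # QUOTE_MINIMAL (the csv module default) quotes only the fields that
    # need it, such as names containing commas or double quotes.
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for name in names:
        writer.writerow([name])
    return buffer.getvalue()


quoted_example = manifest_text(["object1,a.txt", "object2.txt"])
```

In this sketch, the name containing a comma is written as "object1,a.txt" and the plain name is left unquoted.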
We recommend testing your transfer with a small subset of files or objects to avoid unnecessary API calls due to configuration errors.
You can monitor the status of file transfers from the Transfer Jobs page. Files or objects that fail to transfer are listed in the transfer logs.
File system transfers
To create a manifest of files on a file system, create a CSV file with a single column containing the file paths relative to the root directory that you specify when you create the transfer job.
For example, you may wish to transfer the following file system files:
File path |
---|
rootdir/dir1/subdir1/file1.txt |
rootdir/file2.txt |
rootdir/dir2/subdir1/file3.txt |
Your manifest should look like the following example:
dir1/subdir1/file1.txt
file2.txt
dir2/subdir1/file3.txt
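The conversion from full paths to root-relative paths can be sketched in Python as follows. This is only an illustration: the helper name `relative_manifest` is hypothetical, and it assumes POSIX-style paths that all sit under the root directory.

```python
import csv
import io
from pathlib import PurePosixPath


def relative_manifest(root_dir, file_paths):
    """Return manifest CSV text with each path made relative to root_dir."""
    root = PurePosixPath(root_dir)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for path in file_paths:
        # relative_to raises ValueError if the path is not under root_dir.
        writer.writerow([str(PurePosixPath(path).relative_to(root))])
    return buffer.getvalue()


fs_manifest = relative_manifest(
    "rootdir",
    [
        "rootdir/dir1/subdir1/file1.txt",
        "rootdir/file2.txt",
        "rootdir/dir2/subdir1/file3.txt",
    ],
)
```

The resulting text contains the three relative paths from the example manifest, one per line.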
Object storage transfers
To create a manifest of objects, create a CSV file whose first column contains the object names relative to the bucket name and path that you specify when you create the transfer job. All objects must be in the same bucket.
You can also specify an optional second column with the Cloud Storage generation number of the specific version to transfer.
For example, you may wish to transfer the following objects:
Object path | Cloud Storage generation number |
---|---|
SOURCE_PATH/object1.pdf | 1664826685911832 |
SOURCE_PATH/object2.pdf | |
SOURCE_PATH/object3.pdf | 1664826610699837 |
Your manifest should look like the following example:
object1.pdf,1664826685911832
object2.pdf
object3.pdf,1664826610699837
Save the manifest file with any filename and a .csv extension.
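The object manifest above, including the optional generation column, could be produced with a short Python sketch like the one below. The helper name `object_manifest` and the convention of passing `None` for a missing generation number are assumptions made for this illustration.

```python
import csv
import io


def object_manifest(prefix, objects):
    """Return manifest CSV text for (object_path, generation) pairs.

    prefix is the source path to strip from each object path. A generation
    of None means the row has no second column.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for object_path, generation in objects:
        if not object_path.startswith(prefix):
            raise ValueError(f"{object_path!r} is not under {prefix!r}")
        name = object_path[len(prefix):]
        row = [name] if generation is None else [name, str(generation)]
        writer.writerow(row)
    return buffer.getvalue()


object_manifest_text = object_manifest(
    "SOURCE_PATH/",
    [
        ("SOURCE_PATH/object1.pdf", 1664826685911832),
        ("SOURCE_PATH/object2.pdf", None),
        ("SOURCE_PATH/object3.pdf", 1664826610699837),
    ],
)
```

Rows without a generation number are written with a single column, matching the second row of the example manifest.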
HTTP/HTTPS transfers
To transfer specific files from an HTTP or HTTPS source, refer to the instructions in Create a URL list.
Publish the manifest
Once you've created the manifest, you must make it available to Storage Transfer Service. Storage Transfer Service can access the file in a Cloud Storage bucket, or on your file system.
Upload the manifest to Cloud Storage
You can store the manifest file in any Cloud Storage bucket.
The service agent running the transfer must have storage.objects.get permission for the bucket containing the manifest. See Grant the required permissions for instructions on finding the service agent ID, and granting permissions to that service agent on a bucket.
For instructions on uploading the manifest to a bucket, see Upload objects in the Cloud Storage documentation.
For example, to use the gcloud CLI to upload a file to Cloud Storage, use the gcloud storage cp command:
gcloud storage cp MANIFEST.CSV gs://DESTINATION_BUCKET_NAME/
Where:
- MANIFEST.CSV is the local path to your manifest file. For example, Desktop/manifest01.csv.
- DESTINATION_BUCKET_NAME is the name of the bucket to which you are uploading your object. For example, my-bucket.
If successful, the response looks like the following example:
Completed files 1/1 | 164.3kiB/164.3kiB
You can encrypt a manifest using customer-managed Cloud KMS encryption keys. In this case, ensure that any service accounts accessing the manifest are assigned the applicable encryption keys. Customer-supplied keys are not supported.
Store the manifest on a file system
You can store the manifest file on your source or destination file system.
The location of the file must be accessible to the transfer agents. If you restrict directory access for your agents, make sure the manifest file is located within a mounted directory.
Start a transfer
Do not modify the manifest file until a transfer operation completes. We recommend that you lock the manifest file when a transfer is taking place.
Cloud console
To start a transfer with a manifest from the Cloud console:
Follow the instructions in Create transfers to select your source, destination, and options.
In the final step, Choose settings, select the checkbox named Provide list of files to transfer via manifest file.
Enter the manifest file location.
gcloud
To transfer the files or objects that are listed in the manifest, include the --manifest-file=MANIFEST_FILE flag with your gcloud transfer jobs create command.
gcloud transfer jobs create SOURCE DESTINATION \
--manifest-file=MANIFEST_FILE
MANIFEST_FILE can be any of the following values:
The path to the CSV file in a Cloud Storage bucket:
--manifest-file=gs://my_bucket/sample_manifest.csv
See Upload the manifest to Cloud Storage for details on required permissions, if the bucket or file is not public.
The relative path from the file system SOURCE, including any path that was specified:
--manifest-file=source://relative_path/sample_manifest.csv
The relative path from the file system DESTINATION, including any path that was specified:
--manifest-file=destination://relative_path/sample_manifest.csv
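If you generate the command from a script, the pieces can be assembled as in the following Python sketch. It only builds the command string shown above and does not run it; the helper name `build_create_command` and the bucket names are hypothetical.

```python
import shlex


def build_create_command(source, destination, manifest_file):
    """Return a gcloud transfer jobs create command line as a string."""
    args = [
        "gcloud", "transfer", "jobs", "create",
        source,
        destination,
        f"--manifest-file={manifest_file}",
    ]
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(args)


create_command = build_create_command(
    "gs://my-source-bucket",       # hypothetical source
    "gs://my-destination-bucket",  # hypothetical destination
    "gs://my_bucket/sample_manifest.csv",
)
```

Building the argument list first and quoting it in one place helps avoid quoting mistakes when a manifest path contains spaces.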
REST + Client libraries
REST
To transfer the files or objects that are listed in the manifest, make a createTransferJob API call that specifies a transferSpec with the transferManifest field added. For example:
POST https://storagetransfer.googleapis.com/v1/transferJobs
...
"transferSpec": {
  "posixDataSource": {
    "rootDirectory": "/home/"
  },
  "gcsDataSink": {
    "bucketName": "GCS_NEARLINE_SINK_NAME",
    "path": "GCS_SINK_PATH"
  },
  "transferManifest": {
    "location": "gs://my_bucket/sample_manifest.csv"
  }
}
The manifest file can be stored in a Cloud Storage bucket, or on the source or destination file system. Cloud Storage buckets must use the gs:// prefix and include the full path, including the bucket name. File system locations must use a source:// or destination:// prefix and are relative to the file system source or destination, and optional root directory.
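The prefix rules can be checked before a request is sent. The following Python sketch adds a transferManifest field to a transferSpec dictionary after checking the location prefix. It builds the JSON fragment only and makes no API call; the helper name `with_manifest` is hypothetical, and the placeholder values are taken from the example above.

```python
import json

# Prefixes that a manifest location may use, per the text above.
MANIFEST_PREFIXES = ("gs://", "source://", "destination://")


def with_manifest(transfer_spec, location):
    """Return a copy of transfer_spec with a transferManifest field added."""
    if not location.startswith(MANIFEST_PREFIXES):
        raise ValueError(
            f"manifest location must start with one of {MANIFEST_PREFIXES}"
        )
    spec = dict(transfer_spec)  # shallow copy; the input is left unchanged
    spec["transferManifest"] = {"location": location}
    return spec


base_spec = {
    "posixDataSource": {"rootDirectory": "/home/"},
    "gcsDataSink": {
        "bucketName": "GCS_NEARLINE_SINK_NAME",
        "path": "GCS_SINK_PATH",
    },
}
spec_with_manifest = with_manifest(base_spec, "gs://my_bucket/sample_manifest.csv")
spec_json = json.dumps({"transferSpec": spec_with_manifest}, indent=2)
```

The resulting `spec_json` string holds only the transferSpec portion of the request body; the other fields of the request are omitted here, as they are in the example above.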
The objects or files in the manifest aren't necessarily transferred in the listed order.
If the manifest includes files that already exist in the destination, those files are skipped unless the "overwrite objects already existing in sink" option is specified.
If the manifest includes objects that exist in a different version in the destination, the object in the destination is overwritten with the source version of the object. If the destination is a versioned bucket, a new version of the object is created.
What's next
- Learn how to filter objects from transfers.
- Learn how to schedule transfer jobs.