TransferSpec

Configuration for running a transfer.

JSON representation
{
  "objectConditions": {
    object (ObjectConditions)
  },
  "transferOptions": {
    object (TransferOptions)
  },

  // Union field data_source can be only one of the following:
  "gcsDataSource": {
    object (GcsData)
  },
  "awsS3DataSource": {
    object (AwsS3Data)
  },
  "httpDataSource": {
    object (HttpData)
  },
  "azureBlobStorageDataSource": {
    object (AzureBlobStorageData)
  }
  // End of list of possible types for union field data_source.
  "gcsDataSink": {
    object (GcsData)
  }
}
Fields
objectConditions

object (ObjectConditions)

Only objects that satisfy these object conditions are included in the set of data source and data sink objects. Object conditions based on objects' "last modification time" do not exclude objects in a data sink.

transferOptions

object (TransferOptions)

If the option deleteObjectsUniqueInSink is true, object conditions based on objects' "last modification time" are ignored and do not exclude objects in a data source or a data sink.

Union field data_source. Required. The read source of the data. data_source can be only one of the following:
gcsDataSource

object (GcsData)

A Cloud Storage data source.

awsS3DataSource

object (AwsS3Data)

An AWS S3 data source.

httpDataSource

object (HttpData)

An HTTP URL data source.

azureBlobStorageDataSource

object (AzureBlobStorageData)

An Azure Blob Storage data source.

gcsDataSink

object (GcsData)

A Cloud Storage data sink.
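
As a rough illustration of the union rule above, the following sketch builds a TransferSpec as a plain Python dictionary and checks that exactly one data_source field is set. The field names come from the JSON representation; the helper name check_data_source and the bucket names are hypothetical, and the check is a client-side convenience, not part of the API.

```python
import json

# Field names of the data_source union, as listed in the JSON representation.
DATA_SOURCE_FIELDS = (
    "gcsDataSource",
    "awsS3DataSource",
    "httpDataSource",
    "azureBlobStorageDataSource",
)


def check_data_source(transfer_spec: dict) -> str:
    """Return the name of the single data_source field that is set.

    Hypothetical helper: raises ValueError when zero or several are present.
    """
    present = [name for name in DATA_SOURCE_FIELDS if name in transfer_spec]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one data source, found {len(present)}: {present}"
        )
    return present[0]


# Bucket names below are placeholders, not real buckets.
transfer_spec = {
    "gcsDataSource": {"bucketName": "example-source-bucket"},
    "gcsDataSink": {"bucketName": "example-sink-bucket"},
    "transferOptions": {"overwriteObjectsAlreadyExistingInSink": True},
}

print(check_data_source(transfer_spec))
print(json.dumps(transfer_spec, indent=2))
```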

GcsData

In a GcsData resource, an object's name is the Cloud Storage object's name, and its "last modification time" is the Cloud Storage object's updated property, which changes when the object's content or metadata is updated.

JSON representation
{
  "bucketName": string
}
Fields
bucketName

string

Required. Cloud Storage bucket name (see Bucket Name Requirements).

AwsS3Data

An AwsS3Data resource can be a data source, but not a data sink. In an AwsS3Data resource, an object's name is the S3 object's key name.

JSON representation
{
  "bucketName": string,
  "awsAccessKey": {
    object (AwsAccessKey)
  }
}
Fields
bucketName

string

Required. S3 Bucket name (see Creating a bucket).

awsAccessKey

object (AwsAccessKey)

Required. AWS access key used to sign the API requests to the AWS S3 bucket. Permissions on the bucket must be granted to the access ID of the AWS access key.

AwsAccessKey

AWS access key (see AWS Security Credentials).

JSON representation
{
  "accessKeyId": string,
  "secretAccessKey": string
}
Fields
accessKeyId

string

Required. AWS access key ID.

secretAccessKey

string

Required. AWS secret access key. This field is not returned in RPC responses.
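
One way to assemble an AwsS3Data value without hard-coding the key pair is sketched below. It assumes the credentials are available under the environment variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; the helper names, the bucket name, and the key values are placeholders. The redacted helper is only a logging convenience for the caller's own code.

```python
import os
from typing import Mapping


def aws_s3_data_source(bucket_name: str, env: Mapping[str, str] = os.environ) -> dict:
    """Build an awsS3DataSource value, reading the key pair from env."""
    return {
        "bucketName": bucket_name,
        "awsAccessKey": {
            "accessKeyId": env["AWS_ACCESS_KEY_ID"],
            "secretAccessKey": env["AWS_SECRET_ACCESS_KEY"],
        },
    }


def redacted(source: dict) -> dict:
    """Return a copy with the secret masked, suitable for logging."""
    key = dict(source["awsAccessKey"], secretAccessKey="<redacted>")
    return dict(source, awsAccessKey=key)


# Placeholder environment for illustration; real code would use os.environ.
fake_env = {
    "AWS_ACCESS_KEY_ID": "EXAMPLE-KEY-ID",
    "AWS_SECRET_ACCESS_KEY": "EXAMPLE-SECRET",
}
source = aws_s3_data_source("example-s3-bucket", fake_env)
print(redacted(source))
```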

HttpData

An HttpData resource specifies a list of objects on the web to be transferred over HTTP. Information about the objects to transfer is contained in a file referenced by a URL. The first line of the file must be "TsvHttpData-1.0", which specifies the format of the file. Each subsequent line describes one object. Each entry has the following tab-delimited fields:

  • HTTP URL — The location of the object.

  • Length — The size of the object in bytes.

  • MD5 — The base64-encoded MD5 hash of the object.

For an example of a valid TSV file, see Transferring data from URLs.

When transferring data based on a URL list, keep the following in mind:

  • When an object located at http(s)://hostname:port/<URL-path> is transferred to a data sink, the name of the object at the data sink is <hostname>/<URL-path>.

  • If the specified size of an object does not match the actual size of the object fetched, the object will not be transferred.

  • If the specified MD5 does not match the MD5 computed from the transferred bytes, the object transfer will fail. For more information, see Generating MD5 hashes.

  • Ensure that each URL you specify is publicly accessible. For example, in Cloud Storage you can share an object publicly and get a link to it.

  • Storage Transfer Service obeys robots.txt rules and requires the source HTTP server to support Range requests and to return a Content-Length header in each response.

  • ObjectConditions have no effect when filtering objects to transfer.
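
The file format and the sink naming rule described above can be sketched as follows. The helper names and the example URL are hypothetical. The sketch assumes the object content is available locally so that its length and MD5 can be computed, that entries are separated by newlines, and that URLs carry no query string.

```python
import base64
import hashlib
from urllib.parse import urlsplit

HEADER = "TsvHttpData-1.0"  # required first line of the list file


def tsv_entry(url: str, content: bytes) -> str:
    """Build one tab-delimited entry: URL, length in bytes, base64 MD5."""
    md5_b64 = base64.b64encode(hashlib.md5(content).digest()).decode("ascii")
    return f"{url}\t{len(content)}\t{md5_b64}"


def build_url_list(objects: dict) -> str:
    """Return the list file text: the header line, then one entry per object."""
    lines = [HEADER]
    lines += [tsv_entry(url, content) for url, content in objects.items()]
    return "\n".join(lines) + "\n"


def sink_object_name(url: str) -> str:
    """Name at the sink: <hostname>/<URL-path>, with scheme and port dropped."""
    parts = urlsplit(url)
    return f"{parts.hostname}{parts.path}"


# Hypothetical object used only for illustration.
example_url = "https://example.com:8080/data/a.txt"
listing = build_url_list({example_url: b"hello"})
print(listing, end="")
print(sink_object_name(example_url))
```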

JSON representation
{
  "listUrl": string
}
Fields
listUrl

string

Required. The URL that points to the file that stores the object list entries. This file must allow public access. Currently, only URLs with HTTP and HTTPS schemes are supported.

AzureBlobStorageData

An AzureBlobStorageData resource can be a data source, but not a data sink. An AzureBlobStorageData resource represents one Azure container. The storage account determines the Azure endpoint. In an AzureBlobStorageData resource, a blob's name is the Azure Blob Storage blob's key name.

JSON representation
{
  "storageAccount": string,
  "azureCredentials": {
    object (AzureCredentials)
  },
  "container": string
}
Fields
storageAccount

string

Required. The name of the Azure Storage account.

azureCredentials

object (AzureCredentials)

Required. Credentials used to authenticate API requests to Azure.

container

string

Required. The container to transfer from the Azure Storage account.

AzureCredentials

Azure credentials.

JSON representation
{
  "sasToken": string
}
Fields
sasToken

string

Required. Azure shared access signature (see Grant limited access to Azure Storage resources using shared access signatures (SAS)).

ObjectConditions

Conditions that determine which objects will be transferred. Applies only to S3 and Cloud Storage objects.

The "last modification time" refers to the time of the last change to the object's content or metadata — specifically, this is the updated property of Cloud Storage objects and the LastModified field of S3 objects.

JSON representation
{
  "minTimeElapsedSinceLastModification": string,
  "maxTimeElapsedSinceLastModification": string,
  "includePrefixes": [
    string
  ],
  "excludePrefixes": [
    string
  ],
  "lastModifiedSince": string,
  "lastModifiedBefore": string
}
Fields
minTimeElapsedSinceLastModification

string (Duration format)

If specified, only objects with a "last modification time" before NOW - minTimeElapsedSinceLastModification and objects that don't have a "last modification time" are transferred.

For each TransferOperation started by this TransferJob, NOW refers to the startTime of the TransferOperation.

A duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".

maxTimeElapsedSinceLastModification

string (Duration format)

If specified, only objects with a "last modification time" on or after NOW - maxTimeElapsedSinceLastModification and objects that don't have a "last modification time" are transferred.

For each TransferOperation started by this TransferJob, NOW refers to the startTime of the TransferOperation.

A duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".
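
The two elapsed-time conditions can be read as a window relative to NOW. The sketch below evaluates them locally for a single object; it is an illustration of the rules as stated here, not the service's implementation. The function names are hypothetical, and because timedelta has microsecond resolution, fractional digits beyond six are rounded.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_duration(value: str) -> timedelta:
    """Parse a Duration string such as "3.5s" into a timedelta."""
    if not value.endswith("s"):
        raise ValueError(f"duration must end with 's': {value!r}")
    return timedelta(seconds=float(value[:-1]))


def passes_elapsed_conditions(
    last_modified: Optional[datetime],
    now: datetime,
    min_elapsed: Optional[str] = None,
    max_elapsed: Optional[str] = None,
) -> bool:
    """Apply minTimeElapsedSinceLastModification and its max counterpart."""
    if last_modified is None:
        return True  # objects without a last modification time are transferred
    if min_elapsed is not None:
        # Must be strictly before NOW - min.
        if not last_modified < now - parse_duration(min_elapsed):
            return False
    if max_elapsed is not None:
        # Must be on or after NOW - max.
        if not last_modified >= now - parse_duration(max_elapsed):
            return False
    return True


# NOW stands for the startTime of the TransferOperation (example value).
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
two_days_old = now - timedelta(days=2)
# One day = "86400s", seven days = "604800s".
print(passes_elapsed_conditions(two_days_old, now, "86400s", "604800s"))
```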

includePrefixes[]

string

If includePrefixes is specified, an object satisfies the prefix conditions only if its name starts with one of the includePrefixes and does not start with any of the excludePrefixes. If includePrefixes is not specified, an object satisfies the prefix conditions unless its name starts with one of the excludePrefixes.

Requirements:

  • Each include-prefix and exclude-prefix can contain any sequence of Unicode characters, up to a maximum length of 1024 bytes when UTF-8 encoded, and must not contain Carriage Return or Line Feed characters. Wildcard matching and regular expression matching are not supported.

  • Each include-prefix and exclude-prefix must omit the leading slash. For example, to include the requests.gz object in a transfer from s3://my-aws-bucket/logs/y=2015/requests.gz, specify the include prefix as logs/y=2015/requests.gz.

  • If specified, no include-prefix or exclude-prefix value can be empty.

  • Each include-prefix must include a distinct portion of the object namespace. No include-prefix may be a prefix of another include-prefix.

  • Each exclude-prefix must exclude a distinct portion of the object namespace. No exclude-prefix may be a prefix of another exclude-prefix.

  • If includePrefixes is specified, then each exclude-prefix must start with the value of a path explicitly included by includePrefixes.

The max size of includePrefixes is 1000.
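
The requirements above lend themselves to a client-side check before a job is submitted. The following is a minimal sketch under that assumption; validate_prefixes and name_matches are hypothetical helpers, and the service performs its own validation regardless.

```python
from typing import List


def validate_prefixes(include: List[str], exclude: List[str]) -> List[str]:
    """Return a list of human-readable violations (empty if none found)."""
    errors = []
    for kind, prefixes in (("include", include), ("exclude", exclude)):
        if len(prefixes) > 1000:
            errors.append(f"more than 1000 {kind}-prefixes")
        for p in prefixes:
            if not p:
                errors.append(f"empty {kind}-prefix")
                continue
            if p.startswith("/"):
                errors.append(f"{kind}-prefix {p!r} has a leading slash")
            if "\r" in p or "\n" in p:
                errors.append(f"{kind}-prefix {p!r} contains CR or LF")
            if len(p.encode("utf-8")) > 1024:
                errors.append(f"{kind}-prefix {p!r} exceeds 1024 bytes")
        # No prefix may be a prefix of another one in the same list.
        for i, a in enumerate(prefixes):
            for j, b in enumerate(prefixes):
                if i != j and a and b.startswith(a):
                    errors.append(f"{kind}-prefix {a!r} is a prefix of {b!r}")
    if include:
        # Each exclude-prefix must fall under an explicitly included path.
        for e in exclude:
            if e and not any(e.startswith(i) for i in include if i):
                errors.append(f"exclude-prefix {e!r} is not under any include-prefix")
    return errors


def name_matches(name: str, include: List[str], exclude: List[str]) -> bool:
    """Plain string-prefix matching; no wildcards or regular expressions."""
    if include and not any(name.startswith(p) for p in include):
        return False
    return not any(name.startswith(p) for p in exclude)


include = ["logs/y=2015/"]
exclude = ["logs/y=2015/tmp/"]
print(validate_prefixes(include, exclude))
print(name_matches("logs/y=2015/requests.gz", include, exclude))
```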

excludePrefixes[]

string

excludePrefixes must follow the requirements described for includePrefixes.

The max size of excludePrefixes is 1000.

lastModifiedSince

string (Timestamp format)

If specified, only objects with a "last modification time" on or after this timestamp and objects that don't have a "last modification time" are transferred.

The lastModifiedSince and lastModifiedBefore fields can be used together for chunked data processing. For example, consider a script that processes each day's worth of data at a time. For that you'd set each of the fields as follows:

  • lastModifiedSince to the start of the day

  • lastModifiedBefore to the end of the day

A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z".

lastModifiedBefore

string (Timestamp format)

If specified, only objects with a "last modification time" before this timestamp and objects that don't have a "last modification time" are transferred.

A timestamp in RFC3339 UTC "Zulu" format, accurate to nanoseconds. Example: "2014-10-02T15:01:23.045123456Z".
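
For the day-at-a-time scenario, one possible way to compute the two timestamps is sketched below. Because lastModifiedSince is inclusive and lastModifiedBefore is exclusive, the sketch interprets "the end of the day" as the start of the next day, so that consecutive daily windows neither overlap nor leave gaps. That interpretation, the UTC day boundary, and the helper name day_window are assumptions.

```python
from datetime import date, datetime, time, timedelta, timezone

RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"  # "Zulu" format with whole seconds


def day_window(day: date) -> dict:
    """Return the two ObjectConditions fields that cover one UTC day."""
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    end = start + timedelta(days=1)  # exclusive upper bound
    return {
        "lastModifiedSince": start.strftime(RFC3339_UTC),
        "lastModifiedBefore": end.strftime(RFC3339_UTC),
    }


print(day_window(date(2014, 10, 2)))
# {'lastModifiedSince': '2014-10-02T00:00:00Z', 'lastModifiedBefore': '2014-10-03T00:00:00Z'}
```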

TransferOptions

TransferOptions uses three boolean parameters to define the actions to be performed on objects in a transfer.

JSON representation
{
  "overwriteObjectsAlreadyExistingInSink": boolean,
  "deleteObjectsUniqueInSink": boolean,
  "deleteObjectsFromSourceAfterTransfer": boolean
}
Fields
overwriteObjectsAlreadyExistingInSink

boolean

Whether overwriting objects that already exist in the sink is allowed.

deleteObjectsUniqueInSink

boolean

Whether objects that exist only in the sink should be deleted.

Note: This option and deleteObjectsFromSourceAfterTransfer are mutually exclusive.

deleteObjectsFromSourceAfterTransfer

boolean

Whether objects should be deleted from the source after they are transferred to the sink.

Note: This option and deleteObjectsUniqueInSink are mutually exclusive.
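
The mutual-exclusion note can be enforced before a request is sent. A minimal sketch, with a hypothetical helper name, might look like this:

```python
def validate_transfer_options(options: dict) -> None:
    """Raise ValueError if both mutually exclusive delete options are true."""
    if options.get("deleteObjectsUniqueInSink") and options.get(
        "deleteObjectsFromSourceAfterTransfer"
    ):
        raise ValueError(
            "deleteObjectsUniqueInSink and deleteObjectsFromSourceAfterTransfer "
            "are mutually exclusive"
        )


# Accepted: only one of the two delete options is set.
validate_transfer_options(
    {
        "overwriteObjectsAlreadyExistingInSink": True,
        "deleteObjectsUniqueInSink": True,
    }
)
```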