TransferSpec

Configuration for running a transfer.

JSON representation
{
  "objectConditions": {
    object (ObjectConditions)
  },
  "transferOptions": {
    object (TransferOptions)
  },
  "transferManifest": {
    object (TransferManifest)
  },
  "sourceAgentPoolName": string,
  "sinkAgentPoolName": string,

  // Union field data_sink can be only one of the following:
  "gcsDataSink": {
    object (GcsData)
  },
  "posixDataSink": {
    object (PosixFilesystem)
  }
  // End of list of possible types for union field data_sink.

  // Union field data_source can be only one of the following:
  "gcsDataSource": {
    object (GcsData)
  },
  "awsS3DataSource": {
    object (AwsS3Data)
  },
  "httpDataSource": {
    object (HttpData)
  },
  "posixDataSource": {
    object (PosixFilesystem)
  },
  "azureBlobStorageDataSource": {
    object (AzureBlobStorageData)
  }
  // End of list of possible types for union field data_source.
  "gcsIntermediateDataLocation": {
    object (GcsData)
  }
}
Fields
objectConditions

object (ObjectConditions)

Only objects that satisfy these object conditions are included in the set of data source and data sink objects. Object conditions based on objects' "last modification time" do not exclude objects in a data sink.

transferOptions

object (TransferOptions)

If the option deleteObjectsUniqueInSink is true and time-based object conditions such as 'last modification time' are specified, the request fails with an INVALID_ARGUMENT error.

transferManifest

object (TransferManifest)

A manifest file provides a list of objects to be transferred from the data source. This field points to the location of the manifest file. If this field is not set, the entire source bucket is used. ObjectConditions still apply.

sourceAgentPoolName

string

Specifies the agent pool name associated with the POSIX data source. When unspecified, the default name is used.

sinkAgentPoolName

string

Specifies the agent pool name associated with the POSIX data sink. When unspecified, the default name is used.

Union field data_sink. The write sink for the data. data_sink can be only one of the following:
gcsDataSink

object (GcsData)

A Cloud Storage data sink.

posixDataSink

object (PosixFilesystem)

A POSIX Filesystem data sink.

Union field data_source. The read source of the data. data_source can be only one of the following:
gcsDataSource

object (GcsData)

A Cloud Storage data source.

awsS3DataSource

object (AwsS3Data)

An AWS S3 data source.

httpDataSource

object (HttpData)

An HTTP URL data source.

posixDataSource

object (PosixFilesystem)

A POSIX Filesystem data source.

azureBlobStorageDataSource

object (AzureBlobStorageData)

An Azure Blob Storage data source.

gcsIntermediateDataLocation

object (GcsData)

Cloud Storage intermediate data location.
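
The following is a minimal sketch, not part of any client library, of assembling a TransferSpec request body as a Python dictionary and checking the union-field rule described above (no more than one data_sink member and no more than one data_source member). The helper name check_union_fields and the bucket names are hypothetical, and the service performs its own validation regardless of any client-side check.

```python
import json

# Union members as listed in the JSON representation above.
DATA_SINK_FIELDS = {"gcsDataSink", "posixDataSink"}
DATA_SOURCE_FIELDS = {
    "gcsDataSource",
    "awsS3DataSource",
    "httpDataSource",
    "posixDataSource",
    "azureBlobStorageDataSource",
}


def check_union_fields(spec: dict) -> None:
    """Raise ValueError if more than one member of a union field is set.

    Hypothetical client-side check; it does not replace server validation.
    """
    for union_name, members in (
        ("data_sink", DATA_SINK_FIELDS),
        ("data_source", DATA_SOURCE_FIELDS),
    ):
        present = sorted(members.intersection(spec))
        if len(present) > 1:
            raise ValueError(f"{union_name} can be only one of: {present}")


# Example: Cloud Storage to Cloud Storage, with placeholder bucket names.
transfer_spec = {
    "gcsDataSource": {"bucketName": "example-source-bucket", "path": "logs/"},
    "gcsDataSink": {"bucketName": "example-sink-bucket", "path": ""},
    "transferOptions": {"overwriteWhen": "DIFFERENT"},
}

check_union_fields(transfer_spec)
print(json.dumps(transfer_spec, indent=2))
```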

GcsData

In a GcsData resource, an object's name is the Cloud Storage object's name and its "last modification time" refers to the Cloud Storage object's updated property, which changes when the content or the metadata of the object is updated.

JSON representation
{
  "bucketName": string,
  "path": string
}
Fields
bucketName

string

Required. Cloud Storage bucket name. Must meet Bucket Name Requirements.

path

string

Root path to transfer objects.

Must be an empty string or full path name that ends with a '/'. This field is treated as an object prefix. As such, it should generally not begin with a '/'.

The root path value must meet Object Name Requirements.
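
As a rough illustration of the path rules above, the sketch below checks a candidate root path before it is placed in a request. The function name check_root_path is hypothetical, and the check covers only the two rules stated in this section; it does not verify the Object Name Requirements.

```python
def check_root_path(path: str) -> list:
    """Return a list of problems with a root path; an empty list means none found."""
    problems = []
    # The path must be an empty string or end with '/'.
    if path and not path.endswith("/"):
        problems.append("must be an empty string or end with '/'")
    # The path is treated as an object prefix, so a leading '/' is usually a mistake.
    if path.startswith("/"):
        problems.append("should generally not begin with '/'")
    return problems


for candidate in ("", "logs/2015/", "logs/2015", "/logs/2015/"):
    print(repr(candidate), check_root_path(candidate))
```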

PosixFilesystem

A POSIX filesystem resource.

JSON representation
{
  "rootDirectory": string
}
Fields
rootDirectory

string

Root directory path to the filesystem.

AwsS3Data

An AwsS3Data resource can be a data source, but not a data sink. In an AwsS3Data resource, an object's name is the S3 object's key name.

JSON representation
{
  "bucketName": string,
  "awsAccessKey": {
    object (AwsAccessKey)
  },
  "path": string,
  "roleArn": string
}
Fields
bucketName

string

Required. S3 Bucket name (see Creating a bucket).

awsAccessKey

object (AwsAccessKey)

Input only. AWS access key used to sign the API requests to the AWS S3 bucket. Permissions on the bucket must be granted to the access ID of the AWS access key.

For information on our data retention policy for user credentials, see User credentials.

path

string

Root path to transfer objects.

Must be an empty string or full path name that ends with a '/'. This field is treated as an object prefix. As such, it should generally not begin with a '/'.

roleArn

string

The Amazon Resource Name (ARN) of the role to support temporary credentials via AssumeRoleWithWebIdentity. For more information about ARNs, see IAM ARNs.

When a role ARN is provided, Transfer Service fetches temporary credentials for the session using an AssumeRoleWithWebIdentity call for the provided role using the GoogleServiceAccount for this project.

AwsAccessKey

AWS access key (see AWS Security Credentials).

For information on our data retention policy for user credentials, see User credentials.

JSON representation
{
  "accessKeyId": string,
  "secretAccessKey": string
}
Fields
accessKeyId

string

Required. AWS access key ID.

secretAccessKey

string

Required. AWS secret access key. This field is not returned in RPC responses.

HttpData

An HttpData resource specifies a list of objects on the web to be transferred over HTTP. The information of the objects to be transferred is contained in a file referenced by a URL. The first line in the file must be "TsvHttpData-1.0", which specifies the format of the file. Subsequent lines specify the information of the list of objects, one object per list entry. Each entry has the following tab-delimited fields:

  • HTTP URL — The location of the object.

  • Length — The size of the object in bytes.

  • MD5 — The base64-encoded MD5 hash of the object.

For an example of a valid TSV file, see Transferring data from URLs.

When transferring data based on a URL list, keep the following in mind:

  • When an object located at http(s)://hostname:port/<URL-path> is transferred to a data sink, the name of the object at the data sink is <hostname>/<URL-path>.

  • If the specified size of an object does not match the actual size of the object fetched, the object is not transferred.

  • If the specified MD5 does not match the MD5 computed from the transferred bytes, the object transfer fails.

  • Ensure that each URL you specify is publicly accessible. For example, in Cloud Storage you can share an object publicly and get a link to it.

  • Storage Transfer Service obeys robots.txt rules and requires the source HTTP server to support Range requests and to return a Content-Length header in each response.

  • ObjectConditions have no effect when filtering objects to transfer.
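
The list format and the sink naming rule above can be sketched as follows. This is an illustrative example under stated assumptions, not the service's implementation: the function names and the example.com URLs are hypothetical, and the object content is passed in as bytes only so that the length and MD5 can be computed locally.

```python
import base64
import hashlib
from urllib.parse import urlsplit


def tsv_entry(url: str, content: bytes) -> str:
    """Build one tab-delimited entry: URL, length in bytes, base64-encoded MD5."""
    md5_b64 = base64.b64encode(hashlib.md5(content).digest()).decode("ascii")
    return f"{url}\t{len(content)}\t{md5_b64}"


def build_url_list(entries: list) -> str:
    """Prefix the entries with the required first line, one entry per line."""
    return "\n".join(["TsvHttpData-1.0", *entries]) + "\n"


def sink_object_name(url: str) -> str:
    """Derive <hostname>/<URL-path> from a URL; the scheme and port are dropped.

    Note that urlsplit lowercases the hostname.
    """
    parts = urlsplit(url)
    return f"{parts.hostname}{parts.path}"


listing = build_url_list(
    [tsv_entry("https://example.com/files/a.txt", b"hello")]
)
print(listing)
print(sink_object_name("https://example.com:8443/files/a.txt"))
```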

JSON representation
{
  "listUrl": string
}
Fields
listUrl

string

Required. The URL that points to the file that stores the object list entries. This file must allow public access. Currently, only URLs with HTTP and HTTPS schemes are supported.

AzureBlobStorageData

An AzureBlobStorageData resource can be a data source, but not a data sink. An AzureBlobStorageData resource represents one Azure container. The storage account determines the Azure endpoint. In an AzureBlobStorageData resource, a blob's name is the Azure Blob Storage blob's key name.

JSON representation
{
  "storageAccount": string,
  "azureCredentials": {
    object (AzureCredentials)
  },
  "container": string,
  "path": string
}
Fields
storageAccount

string

Required. The name of the Azure Storage account.

azureCredentials

object (AzureCredentials)

Required. Input only. Credentials used to authenticate API requests to Azure.

For information on our data retention policy for user credentials, see User credentials.

container

string

Required. The container to transfer from the Azure Storage account.

path

string

Root path to transfer objects.

Must be an empty string or full path name that ends with a '/'. This field is treated as an object prefix. As such, it should generally not begin with a '/'.

AzureCredentials

Azure credentials.

For information on our data retention policy for user credentials, see User credentials.

JSON representation
{
  "sasToken": string
}
Fields
sasToken

string

Required. Azure shared access signature (SAS).

For more information about SAS, see Grant limited access to Azure Storage resources using shared access signatures (SAS).

ObjectConditions

Conditions that determine which objects are transferred. Applies only to Cloud Data Sources such as S3, Azure, and Cloud Storage.

The "last modification time" refers to the time of the last change to the object's content or metadata — specifically, this is the updated property of Cloud Storage objects, the LastModified field of S3 objects, and the Last-Modified header of Azure blobs.

Transfers with a PosixFilesystem source or destination don't support ObjectConditions.

JSON representation
{
  "minTimeElapsedSinceLastModification": string,
  "maxTimeElapsedSinceLastModification": string,
  "includePrefixes": [
    string
  ],
  "excludePrefixes": [
    string
  ],
  "lastModifiedSince": string,
  "lastModifiedBefore": string
}
Fields
minTimeElapsedSinceLastModification

string (Duration format)

Ensures that objects are not transferred until a specific minimum time has elapsed after the "last modification time". When a TransferOperation begins, objects with a "last modification time" are transferred only if the elapsed time between the startTime of the TransferOperation and the "last modification time" of the object is equal to or greater than the value of minTimeElapsedSinceLastModification. Objects that do not have a "last modification time" are also transferred.

A duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".

maxTimeElapsedSinceLastModification

string (Duration format)

Ensures that objects are not transferred if a specific maximum time has elapsed since the "last modification time". When a TransferOperation begins, objects with a "last modification time" are transferred only if the elapsed time between the startTime of the TransferOperation and the "last modification time" of the object is less than the value of maxTimeElapsedSinceLastModification. Objects that do not have a "last modification time" are also transferred.

A duration in seconds with up to nine fractional digits, terminated by 's'. Example: "3.5s".
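
The two elapsed-time conditions can be read as a half-open interval: an object passes when min <= elapsed < max. The sketch below restates that rule in Python for illustration only. The function names are hypothetical, and parse_duration relies on timedelta, which keeps microsecond precision rather than the nine fractional digits the Duration format allows.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_duration(value: str) -> timedelta:
    """Parse a Duration string such as "3.5s" (precision limited to microseconds)."""
    if not value.endswith("s"):
        raise ValueError("duration must be terminated by 's'")
    return timedelta(seconds=float(value[:-1]))


def passes_elapsed_conditions(
    start_time: datetime,
    last_modified: Optional[datetime],
    min_elapsed: Optional[str] = None,
    max_elapsed: Optional[str] = None,
) -> bool:
    """Apply the min/max elapsed-time rules relative to the operation's startTime."""
    # Objects without a "last modification time" are also transferred.
    if last_modified is None:
        return True
    elapsed = start_time - last_modified
    # Minimum: elapsed time must be equal to or greater than the value.
    if min_elapsed is not None and elapsed < parse_duration(min_elapsed):
        return False
    # Maximum: elapsed time must be strictly less than the value.
    if max_elapsed is not None and elapsed >= parse_duration(max_elapsed):
        return False
    return True


start = datetime(2024, 1, 2, tzinfo=timezone.utc)
modified = start - timedelta(hours=1)
print(passes_elapsed_conditions(start, modified, min_elapsed="1800s"))
print(passes_elapsed_conditions(start, modified, max_elapsed="1800s"))
```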

includePrefixes[]

string

If you specify includePrefixes, Storage Transfer Service uses the items in the includePrefixes array to determine which objects to include in a transfer. Objects must start with one of the matching includePrefixes for inclusion in the transfer. If excludePrefixes is specified, objects must not start with any of the excludePrefixes specified for inclusion in the transfer.

The following are requirements of includePrefixes:

  • Each include-prefix can contain any sequence of Unicode characters, to a max length of 1024 bytes when UTF8-encoded, and must not contain Carriage Return or Line Feed characters. Wildcard matching and regular expression matching are not supported.

  • Each include-prefix must omit the leading slash. For example, to include the object s3://my-aws-bucket/logs/y=2015/requests.gz, specify the include-prefix as logs/y=2015/requests.gz.

  • None of the include-prefix values can be empty, if specified.

  • Each include-prefix must include a distinct portion of the object namespace. No include-prefix may be a prefix of another include-prefix.

The max size of includePrefixes is 1000.

For more information, see Filtering objects from transfers.

excludePrefixes[]

string

If you specify excludePrefixes, Storage Transfer Service uses the items in the excludePrefixes array to determine which objects to exclude from a transfer. Objects must not start with one of the matching excludePrefixes for inclusion in a transfer.

The following are requirements of excludePrefixes:

  • Each exclude-prefix can contain any sequence of Unicode characters, to a max length of 1024 bytes when UTF8-encoded, and must not contain Carriage Return or Line Feed characters. Wildcard matching and regular expression matching are not supported.

  • Each exclude-prefix must omit the leading slash. For example, to exclude the object s3://my-aws-bucket/logs/y=2015/requests.gz, specify the exclude-prefix as logs/y=2015/requests.gz.

  • None of the exclude-prefix values can be empty, if specified.

  • Each exclude-prefix must exclude a distinct portion of the object namespace. No exclude-prefix may be a prefix of another exclude-prefix.

  • If includePrefixes is specified, then each exclude-prefix must start with the value of a path explicitly included by includePrefixes.

The max size of excludePrefixes is 1000.

For more information, see Filtering objects from transfers.
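
To make the prefix requirements concrete, here is a minimal client-side sketch that validates includePrefixes and excludePrefixes against the rules listed above and then applies them to an object name. All function names are hypothetical, and the matching logic is an interpretation of this section's wording rather than the service's actual code.

```python
def validate_prefixes(prefixes: list) -> None:
    """Check one prefix list against the stated requirements; raise ValueError on failure."""
    if len(prefixes) > 1000:
        raise ValueError("at most 1000 prefixes are allowed")
    for prefix in prefixes:
        if prefix == "":
            raise ValueError("prefix values must not be empty")
        if len(prefix.encode("utf-8")) > 1024:
            raise ValueError("prefix exceeds 1024 bytes when UTF-8 encoded")
        if "\r" in prefix or "\n" in prefix:
            raise ValueError("prefix must not contain CR or LF characters")
        if prefix.startswith("/"):
            raise ValueError("prefix must omit the leading slash")
    # After sorting, a prefix of any other entry is also a prefix of its
    # immediate successor, so comparing neighbours is sufficient.
    ordered = sorted(prefixes)
    for shorter, longer in zip(ordered, ordered[1:]):
        if longer.startswith(shorter):
            raise ValueError(f"{shorter!r} is a prefix of {longer!r}")


def validate_excludes_within_includes(include: list, exclude: list) -> None:
    """If includes are given, every exclude must start with one of them."""
    if include:
        for prefix in exclude:
            if not any(prefix.startswith(inc) for inc in include):
                raise ValueError(f"{prefix!r} is outside every include-prefix")


def is_included(name: str, include: list, exclude: list) -> bool:
    """Return True if an object name passes both prefix filters."""
    if include and not any(name.startswith(p) for p in include):
        return False
    return not any(name.startswith(p) for p in exclude)


include = ["logs/y=2015/"]
exclude = ["logs/y=2015/tmp/"]
validate_prefixes(include)
validate_prefixes(exclude)
validate_excludes_within_includes(include, exclude)
print(is_included("logs/y=2015/requests.gz", include, exclude))
print(is_included("logs/y=2015/tmp/scratch.gz", include, exclude))
```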

lastModifiedSince

string (Timestamp format)

If specified, only objects with a "last modification time" on or after this timestamp and objects that don't have a "last modification time" are transferred.

The lastModifiedSince and lastModifiedBefore fields can be used together for chunked data processing. For example, consider a script that processes each day's worth of data at a time. For that you'd set each of the fields as follows:

  • lastModifiedSince to the start of the day

  • lastModifiedBefore to the end of the day

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

lastModifiedBefore

string (Timestamp format)

If specified, only objects with a "last modification time" before this timestamp and objects that don't have a "last modification time" are transferred.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
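
For the day-by-day processing scenario described under lastModifiedSince, one possible way to compute the two timestamps is sketched below. It assumes that "the end of the day" means the start of the following UTC day, which works because lastModifiedSince is inclusive ("on or after") and lastModifiedBefore is exclusive ("before"), so consecutive windows neither overlap nor leave gaps. The function names are hypothetical, and timestamps are emitted at whole-second resolution.

```python
from datetime import date, datetime, time, timedelta, timezone


def to_rfc3339_zulu(moment: datetime) -> str:
    """Format a timezone-aware datetime as RFC3339 UTC "Zulu" with whole seconds."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def day_window(day: date) -> dict:
    """Return ObjectConditions fields that select objects modified on one UTC day."""
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    end = start + timedelta(days=1)  # assumption: end of day = start of next day
    return {
        "lastModifiedSince": to_rfc3339_zulu(start),
        "lastModifiedBefore": to_rfc3339_zulu(end),
    }


print(day_window(date(2014, 10, 2)))
```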

TransferOptions

TransferOptions define the actions to be performed on objects in a transfer.

JSON representation
{
  "overwriteObjectsAlreadyExistingInSink": boolean,
  "deleteObjectsUniqueInSink": boolean,
  "deleteObjectsFromSourceAfterTransfer": boolean,
  "overwriteWhen": enum (OverwriteWhen),
  "metadataOptions": {
    object (MetadataOptions)
  }
}
Fields
overwriteObjectsAlreadyExistingInSink

boolean

When to overwrite objects that already exist in the sink. The default is that only objects that are different from the source are overwritten. If true, all objects in the sink whose name matches an object in the source are overwritten with the source object.

deleteObjectsUniqueInSink

boolean

Whether objects that exist only in the sink should be deleted.

Note: This option and deleteObjectsFromSourceAfterTransfer are mutually exclusive.

deleteObjectsFromSourceAfterTransfer

boolean

Whether objects should be deleted from the source after they are transferred to the sink.

Note: This option and deleteObjectsUniqueInSink are mutually exclusive.

overwriteWhen

enum (OverwriteWhen)

When to overwrite objects that already exist in the sink. If not set, overwrite behavior is determined by overwriteObjectsAlreadyExistingInSink.

metadataOptions

object (MetadataOptions)

Represents the selected metadata options for a transfer job. This feature is in Preview.
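
The constraints on the delete options can be checked before a request is sent. The sketch below is a hypothetical pre-flight check, not the service's validation logic. It treats all four time-related ObjectConditions fields as "time-based", which is an assumption; this reference only says "time-based object conditions such as 'last modification time'".

```python
# Assumption: these four fields are the "time-based object conditions".
TIME_BASED_CONDITIONS = (
    "minTimeElapsedSinceLastModification",
    "maxTimeElapsedSinceLastModification",
    "lastModifiedSince",
    "lastModifiedBefore",
)


def check_transfer_options(options: dict, object_conditions: dict = None) -> list:
    """Return a list of detected conflicts; an empty list means none were found."""
    problems = []
    delete_in_sink = bool(options.get("deleteObjectsUniqueInSink"))
    delete_from_source = bool(options.get("deleteObjectsFromSourceAfterTransfer"))
    # The two delete options are mutually exclusive.
    if delete_in_sink and delete_from_source:
        problems.append(
            "deleteObjectsUniqueInSink and deleteObjectsFromSourceAfterTransfer "
            "are mutually exclusive"
        )
    # deleteObjectsUniqueInSink with time-based conditions fails with INVALID_ARGUMENT.
    conditions = object_conditions or {}
    if delete_in_sink and any(key in conditions for key in TIME_BASED_CONDITIONS):
        problems.append(
            "deleteObjectsUniqueInSink cannot be combined with time-based "
            "object conditions"
        )
    return problems


print(check_transfer_options(
    {"deleteObjectsUniqueInSink": True},
    {"lastModifiedSince": "2014-10-02T00:00:00Z"},
))
```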

OverwriteWhen

Specifies when to overwrite an object in the sink when an object with matching name is found in the source.

Enums
OVERWRITE_WHEN_UNSPECIFIED Do not use. Indicates that the option is not set.
DIFFERENT Overwrites destination objects with the source objects, only if the objects have the same name but different HTTP ETags or checksum values.
NEVER Never overwrites a destination object if a source object has the same name. In this case, the source object is not transferred.
ALWAYS Always overwrites destination objects with source objects having the same name, even if the HTTP ETags or checksum values are the same.
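
The enum values above amount to a small decision rule for a source object whose name already exists in the sink. The following sketch states that rule in Python for illustration. The function name should_overwrite and its boolean parameter are hypothetical, and how the service actually compares ETags or checksums is not specified here.

```python
def should_overwrite(overwrite_when: str, same_etag_or_checksum: bool) -> bool:
    """Decide whether a same-named sink object is overwritten.

    same_etag_or_checksum is True when source and sink objects are
    considered identical by ETag or checksum (hypothetical input).
    """
    if overwrite_when == "DIFFERENT":
        # Overwrite only when the objects differ.
        return not same_etag_or_checksum
    if overwrite_when == "NEVER":
        # The source object is not transferred.
        return False
    if overwrite_when == "ALWAYS":
        # Overwrite regardless of whether the objects match.
        return True
    raise ValueError(f"unsupported OverwriteWhen value: {overwrite_when!r}")


for mode in ("DIFFERENT", "NEVER", "ALWAYS"):
    print(mode, should_overwrite(mode, True), should_overwrite(mode, False))
```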

MetadataOptions

Specifies the metadata options for running a transfer.

JSON representation
{
  "symlink": enum (Symlink),
  "mode": enum (Mode),
  "gid": enum (GID),
  "uid": enum (UID),
  "acl": enum (Acl),
  "storageClass": enum (StorageClass),
  "temporaryHold": enum (TemporaryHold),
  "kmsKey": enum (KmsKey),
  "timeCreated": enum (TimeCreated)
}
Fields
mode

enum (Mode)

Specifies how each file's mode attribute should be handled by the transfer. By default, mode is not preserved. Only applicable to transfers involving POSIX file systems, and ignored for other transfers.

gid

enum (GID)

Specifies how each file's POSIX group ID (GID) attribute should be handled by the transfer. By default, GID is not preserved. Only applicable to transfers involving POSIX file systems, and ignored for other transfers.

uid

enum (UID)

Specifies how each file's POSIX user ID (UID) attribute should be handled by the transfer. By default, UID is not preserved. Only applicable to transfers involving POSIX file systems, and ignored for other transfers.

acl

enum (Acl)

Specifies how each object's ACLs should be preserved for transfers between Google Cloud Storage buckets. If unspecified, the default behavior is the same as ACL_DESTINATION_BUCKET_DEFAULT.

storageClass

enum (StorageClass)

Specifies the storage class to set on objects being transferred to Google Cloud Storage buckets. If unspecified, the default behavior is the same as STORAGE_CLASS_DESTINATION_BUCKET_DEFAULT.

temporaryHold

enum (TemporaryHold)

Specifies how each object's temporary hold status should be preserved for transfers between Google Cloud Storage buckets. If unspecified, the default behavior is the same as TEMPORARY_HOLD_PRESERVE.

kmsKey

enum (KmsKey)

Specifies how each object's Cloud KMS customer-managed encryption key (CMEK) is preserved for transfers between Google Cloud Storage buckets. If unspecified, the default behavior is the same as KMS_KEY_DESTINATION_BUCKET_DEFAULT.

timeCreated

enum (TimeCreated)

Specifies how each object's timeCreated metadata is preserved for transfers between Google Cloud Storage buckets. If unspecified, the default behavior is the same as TIME_CREATED_SKIP.

Mode

Options for handling file mode attribute.

Enums
MODE_UNSPECIFIED Mode behavior is unspecified.
MODE_SKIP Do not preserve mode during a transfer job.
MODE_PRESERVE Preserve mode during a transfer job.

GID

Options for handling file GID attribute.

Enums
GID_UNSPECIFIED GID behavior is unspecified.
GID_SKIP Do not preserve GID during a transfer job.
GID_NUMBER Preserve GID during a transfer job.

UID

Options for handling file UID attribute.

Enums
UID_UNSPECIFIED UID behavior is unspecified.
UID_SKIP Do not preserve UID during a transfer job.
UID_NUMBER Preserve UID during a transfer job.

Acl

Options for handling Cloud Storage object ACLs.

Enums
ACL_UNSPECIFIED ACL behavior is unspecified.
ACL_DESTINATION_BUCKET_DEFAULT Use the destination bucket's default object ACLs, if applicable.
ACL_PRESERVE Preserve the object's original ACLs. This requires the service account to have storage.objects.getIamPolicy permission for the source object. Uniform bucket-level access must not be enabled on either the source or destination buckets.

StorageClass

Options for handling Google Cloud Storage object storage class.

Enums
STORAGE_CLASS_UNSPECIFIED Storage class behavior is unspecified.
STORAGE_CLASS_DESTINATION_BUCKET_DEFAULT Use the destination bucket's default storage class.
STORAGE_CLASS_PRESERVE Preserve the object's original storage class. This is only supported for transfers from Google Cloud Storage buckets.
STORAGE_CLASS_STANDARD Set the storage class to STANDARD.
STORAGE_CLASS_NEARLINE Set the storage class to NEARLINE.
STORAGE_CLASS_COLDLINE Set the storage class to COLDLINE.
STORAGE_CLASS_ARCHIVE Set the storage class to ARCHIVE.

TemporaryHold

Options for handling temporary holds for Google Cloud Storage objects.

Enums
TEMPORARY_HOLD_UNSPECIFIED Temporary hold behavior is unspecified.
TEMPORARY_HOLD_SKIP Do not set a temporary hold on the destination object.
TEMPORARY_HOLD_PRESERVE Preserve the object's original temporary hold status.

KmsKey

Options for handling the KmsKey setting for Google Cloud Storage objects.

Enums
KMS_KEY_UNSPECIFIED KmsKey behavior is unspecified.
KMS_KEY_DESTINATION_BUCKET_DEFAULT Use the destination bucket's default encryption settings.
KMS_KEY_PRESERVE Preserve the object's original Cloud KMS customer-managed encryption key (CMEK) if present. Objects that do not use a Cloud KMS encryption key will be encrypted using the destination bucket's encryption settings.

TimeCreated

Options for handling timeCreated metadata for Google Cloud Storage objects.

Enums
TIME_CREATED_UNSPECIFIED TimeCreated behavior is unspecified.
TIME_CREATED_SKIP Do not preserve the timeCreated metadata from the source object.
TIME_CREATED_PRESERVE_AS_CUSTOM_TIME Preserves the source object's timeCreated metadata in the customTime field in the destination object. Note that any value stored in the source object's customTime field will not be propagated to the destination object.

TransferManifest

Specifies where the manifest is located.

JSON representation
{
  "location": string
}
Fields
location

string

Specifies the path to the manifest in Cloud Storage. The Google-managed service account for the transfer must have storage.objects.get permission for this object. An example path is gs://bucketName/path/manifest.csv.