Configuring access to data sources and sinks

This page explains how to set up access to the data source and data sink for a data transfer using Storage Transfer Service.

Storage Transfer Service uses a Google-managed service account to move your data. If you create a transfer from the Google Cloud Console and have permission to update IAM policies for the Cloud Storage resources involved, the console automatically grants that service account the permissions the transfer requires.

Transfers from non-Google Cloud data sources, and transfers created using the Storage Transfer Service API, require additional setup.

Prerequisites

Service account permissions are granted at the bucket level. You must be able to grant these permissions, for example by holding the Storage Admin role (roles/storage.admin). For more information, see Identity and Access Management.

If you plan to use Pub/Sub for transfers, then ensure that you grant the service account the IAM role Pub/Sub Publisher (roles/pubsub.publisher) for the desired Pub/Sub topic. There may be a delay of several seconds between assigning the role and having it applied to your service account. If you grant this permission programmatically, wait 30 seconds before configuring Storage Transfer Service.
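
If it helps, the following is a minimal Python sketch of granting that role programmatically, assuming the google-cloud-pubsub client library; the project ID, topic name, and service account email are placeholders to replace with your own values:

  from google.cloud import pubsub_v1

  publisher = pubsub_v1.PublisherClient()
  topic_path = publisher.topic_path("my-project", "transfer-notifications")

  # Read-modify-write the topic's IAM policy to add the Pub/Sub Publisher role.
  policy = publisher.get_iam_policy(request={"resource": topic_path})
  policy.bindings.add(
      role="roles/pubsub.publisher",
      members=[
          "serviceAccount:project-123456789@storage-transfer-service.iam.gserviceaccount.com"
      ],
  )
  publisher.set_iam_policy(request={"resource": topic_path, "policy": policy})
  # Per the note above, wait ~30 seconds before configuring the transfer.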

If Cloud Key Management Service is enabled on your Cloud Storage source or destination buckets, check that the quotas listed for Cloud KMS in your project's Quotas page are compatible with Storage Transfer Service's Read quotas and Write quotas. If they aren't, request a quota increase from your project's Quotas page.

Setting up access to the data source

Cloud Storage

Storage Transfer Service uses a Google-managed service account to move your data from a Cloud Storage source bucket. The service account is created the first time you call googleServiceAccounts.get.

The service account's format is typically project-PROJECT_NUMBER@storage-transfer-service.iam.gserviceaccount.com. To find your service account, use the googleServiceAccounts.get API call.
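
For example, here is a minimal sketch using the google-cloud-storage-transfer Python client library; "my-project" is a placeholder project ID:

  from google.cloud import storage_transfer_v1

  client = storage_transfer_v1.StorageTransferServiceClient()
  # googleServiceAccounts.get also creates the account on first call.
  account = client.get_google_service_account(request={"project_id": "my-project"})
  print(account.account_email)  # project-PROJECT_NUMBER@storage-transfer-service.iam.gserviceaccount.com
  print(account.subject_id)     # needed later for federated identity setups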

To set up Storage Transfer Service to use Cloud Storage as a data source, assign the following roles, or equivalent permissions, to the Google-managed service account returned by the googleServiceAccounts.get API call:

Storage Object Viewer (roles/storage.objectViewer)
  What it does: Enables the service account to read the bucket's contents, and to read object data and metadata.

Storage Legacy Bucket Reader (roles/storage.legacyBucketReader)
  What it does: Enables the service account to read a bucket's contents and its metadata, and to read object metadata.
  Notes: Assign this role if you don't intend to delete source objects from Cloud Storage.

Storage Legacy Bucket Writer (roles/storage.legacyBucketWriter)
  What it does: Enables the service account to create, overwrite, and delete objects; list objects in a bucket; read object metadata when listing; and read bucket metadata, excluding IAM policies.
  Notes: Assign this role if you intend to delete source objects from Cloud Storage.
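
As an illustration, the following Python sketch grants these roles on the source bucket with the google-cloud-storage client library. The bucket name, service account email, and the choice between the two legacy roles are placeholders to adapt:

  from google.cloud import storage

  def grant_bucket_roles(bucket_name, member, roles):
      """Append IAM role bindings for `member` to a bucket's policy."""
      client = storage.Client()
      bucket = client.bucket(bucket_name)
      policy = bucket.get_iam_policy(requested_policy_version=3)
      for role in roles:
          policy.bindings.append({"role": role, "members": {member}})
      bucket.set_iam_policy(policy)

  grant_bucket_roles(
      "my-source-bucket",
      "serviceAccount:project-123456789@storage-transfer-service.iam.gserviceaccount.com",
      # Swap in roles/storage.legacyBucketWriter if you delete source objects.
      ["roles/storage.objectViewer", "roles/storage.legacyBucketReader"],
  )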

For advanced data transfers, see IAM permissions for Storage Transfer Service.

Amazon S3

Follow these steps to set up access to an Amazon S3 bucket:

  1. Set up access to the Amazon S3 bucket using one of the following methods:

    Access credentials

    1. Create an AWS Identity and Access Management (AWS IAM) user with a name that you can easily recognize, such as transfer-user. Ensure the name follows the AWS IAM user name guidelines (see Limitations on IAM Entities and Objects).
    2. Give the AWS IAM user the ability to do the following:
      • List the Amazon S3 bucket.
      • Get the location of the bucket.
      • Read the objects in the bucket.
      • If you plan to delete objects from the source after they are transferred, also grant the user permission to delete objects.
    3. Create at least one access/secret key pair for the transfer job that you plan to set up. You can also create a separate access/secret key pair for each transfer job. A boto3 sketch of these steps follows.
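
    As an illustration, here is a minimal boto3 sketch of these three steps. The user name, policy name, and bucket ARN are placeholders, and s3:DeleteObject is only needed if you delete source objects after transfer:

      import json
      import boto3

      iam = boto3.client("iam")
      iam.create_user(UserName="transfer-user")

      policy = {
          "Version": "2012-10-17",
          "Statement": [{
              "Effect": "Allow",
              "Action": [
                  "s3:ListBucket",         # list the bucket
                  "s3:GetBucketLocation",  # get the bucket's location
                  "s3:GetObject",          # read objects
                  "s3:DeleteObject",       # optional: delete after transfer
              ],
              "Resource": [
                  "arn:aws:s3:::my-source-bucket",
                  "arn:aws:s3:::my-source-bucket/*",
              ],
          }],
      }
      iam.put_user_policy(
          UserName="transfer-user",
          PolicyName="storage-transfer-access",
          PolicyDocument=json.dumps(policy),
      )

      # Create the access/secret key pair for the transfer job.
      key = iam.create_access_key(UserName="transfer-user")["AccessKey"]
      print(key["AccessKeyId"], key["SecretAccessKey"])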

    Federated identity

    1. Storage Transfer Service uses a Google-managed service account to move your data from an Amazon S3 source bucket. The service account's format is typically project-PROJECT_NUMBER@storage-transfer-service.iam.gserviceaccount.com. To find your service account, and to create it if it doesn't already exist, use the googleServiceAccounts.get API call. Note the service account for the following steps.
    2. To allow the service to authenticate outbound requests with the service account, grant the Service Account Token Creator role (roles/iam.serviceAccountTokenCreator) to the service account you noted.
    3. Create an AWS IAM role with the following trust policy, which allows the service account to call AssumeRoleWithWebIdentity:
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "Federated": "accounts.google.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
              "StringEquals": {
                "accounts.google.com:sub": "Service_account_subject_identifier"
              }
            }
          }
        ]
      }
      For more information about ARNs, see IAM ARNs. Replace Service_account_subject_identifier with your service account's subjectId, which the googleServiceAccounts.get call returns.
    4. Add permissions that allow Storage Transfer Service to access Amazon S3 resources by attaching the following policy to the IAM role, for example through the AWS IAM console (a sketch showing how a transfer job references this role appears after these steps):
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*",
                "s3:Delete*",
             ],
            "Resource": "*"
          }
        ]
      }
  2. Restore any objects that are archived to Amazon Glacier. Objects in Amazon S3 that are archived to Amazon Glacier are not accessible until they are restored. For more information, see the Migrating to Cloud Storage From Amazon Glacier White Paper.
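
With the federated-identity role from step 1 in place, a transfer job can reference it by ARN instead of access keys. The following is a minimal sketch assuming the google-cloud-storage-transfer Python client library; the project, bucket names, and role ARN are placeholders:

  from google.cloud import storage_transfer_v1

  client = storage_transfer_v1.StorageTransferServiceClient()
  job = client.create_transfer_job(
      request={
          "transfer_job": {
              "project_id": "my-project",
              "status": storage_transfer_v1.TransferJob.Status.ENABLED,
              "transfer_spec": {
                  "aws_s3_data_source": {
                      "bucket_name": "my-source-bucket",
                      # The role created above; Storage Transfer Service
                      # assumes it with AssumeRoleWithWebIdentity.
                      "role_arn": "arn:aws:iam::111122223333:role/storage-transfer-role",
                  },
                  "gcs_data_sink": {"bucket_name": "my-destination-bucket"},
              },
          }
      }
  )
  print(job.name)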

Microsoft Azure Blob Storage

Follow these steps to configure access to a Microsoft Azure Storage container:

  1. Create or use an existing Microsoft Azure Storage user to access the storage account for your Microsoft Azure Blob Storage container.
  2. Create a SAS token at the container level. See Grant limited access to Azure Storage resources using shared access signatures.

    The default expiration time for SAS tokens is 8 hours. When you create your SAS token, set an expiration time that gives your transfer enough time to complete (see the sketch below).
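
    As an illustration, a container-level SAS token can be generated with the azure-storage-blob Python library. The account name, key, container, and the one-week expiry below are placeholders to adapt:

      import datetime
      from azure.storage.blob import ContainerSasPermissions, generate_container_sas

      sas_token = generate_container_sas(
          account_name="mystorageaccount",
          container_name="my-container",
          account_key="ACCOUNT_KEY",
          # Read and list are sufficient for a transfer source.
          permission=ContainerSasPermissions(read=True, list=True),
          # Leave enough lifetime for the whole transfer to finish.
          expiry=datetime.datetime.utcnow() + datetime.timedelta(days=7),
      )
      print(sas_token)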

URL list

If your data source is a URL list, ensure that each object on the URL list is publicly accessible.

Setting up access to the data sink

Storage Transfer Service uses a Google-managed service account to move your data into the Cloud Storage data sink. The service account is created the first time you call googleServiceAccounts.get.

The service account's format is typically project-PROJECT_NUMBER@storage-transfer-service.iam.gserviceaccount.com. To find your service account, use the googleServiceAccounts.get API call.

The data sink for your data transfer is always a Cloud Storage bucket.

To set up Storage Transfer Service to use Cloud Storage as a data sink, assign the following roles, or equivalent permissions, to the Google-managed service account returned by the googleServiceAccounts.get API call:

Storage Legacy Bucket Writer (roles/storage.legacyBucketWriter)
  What it does: Enables the Google-managed service account to create, overwrite, and delete objects; list objects in the destination bucket; and read bucket metadata.

Storage Object Viewer (roles/storage.objectViewer)
  What it does: Enables the Google-managed service account to list and get objects in the destination bucket.
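
The pattern mirrors the source-bucket setup; for example, reusing the hypothetical grant_bucket_roles helper sketched earlier (bucket and service account names are placeholders):

  grant_bucket_roles(
      "my-destination-bucket",
      "serviceAccount:project-123456789@storage-transfer-service.iam.gserviceaccount.com",
      ["roles/storage.legacyBucketWriter", "roles/storage.objectViewer"],
  )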

For more information about required permissions, see IAM permissions for Storage Transfer Service.
