Transferring data from Amazon S3 to Cloud Storage using VPC Service Controls and Storage Transfer Service


This tutorial describes how to harden data transfers from Amazon Simple Storage Service (Amazon S3) to Cloud Storage using Storage Transfer Service with a VPC Service Controls perimeter. This tutorial is intended for data owners who have data that resides in Amazon S3, and who want to process or migrate that data securely to Google Cloud.

This tutorial assumes that you're familiar with Amazon Web Services (AWS) and the fundamentals of working with data in object stores. This tutorial applies a service account-based method of controlling access by using Access Context Manager. For more advanced access levels beyond the service account-based method, see Creating an access level.

Architecture

The following diagram outlines the VPC Service Controls architecture.

Architecture of VPC Service Controls where communication between Google Cloud services is denied outside of the controlled perimeter.

In the preceding diagram, VPC Service Controls explicitly denies communication between Google Cloud services unless both projects are in the controlled perimeter.

Objectives

  • Configure AWS access.
  • Create a VPC Service Controls perimeter.
  • Create an access policy and access level by using Access Context Manager.
  • Use Storage Transfer Service to move data between Amazon S3 and Cloud Storage.
  • Configure Storage Transfer Service to retrieve data on a schedule.

Costs

This document uses billable components of Google Cloud. There are no extra costs to use Storage Transfer Service; however, Cloud Storage pricing and external provider costs apply when you use Storage Transfer Service.

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

In addition to Google Cloud resources, this tutorial uses Amazon Web Services (AWS) resources, such as Amazon S3, which might have costs.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the Access Context Manager, Cloud Storage, and Storage Transfer Service APIs.

    Enable the APIs

  5. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  6. In the Google Cloud console, go to the IAM and Admin page and grant your account the Storage Admin and Access Context Manager Admin roles.

    Go to the IAM and Admin page

    The Storage Admin role has the following permissions:

    • firebase.projects.get
    • resourcemanager.projects.get
    • resourcemanager.projects.list
    • storage.buckets.*
    • storage.objects.*

    The Access Context Manager Admin role has the following permissions:

    • accesscontextmanager.accessLevels.*
    • accesscontextmanager.accessPolicies.*
    • accesscontextmanager.accessPolicies.setIamPolicy
    • accesscontextmanager.accessPolicies.update
    • accesscontextmanager.accessZones.*
    • accesscontextmanager.policies.*
    • accesscontextmanager.servicePerimeters.*
    • resourcemanager.organizations.get
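
The console steps for enabling the APIs and granting roles can also be sketched with the gcloud CLI. The following is a minimal sketch, not a definitive procedure. The function names are hypothetical. It assumes that accesscontextmanager.googleapis.com, storage.googleapis.com, and storagetransfer.googleapis.com are the service names for the three APIs, that roles/storage.admin and roles/accesscontextmanager.policyAdmin are the role IDs for the two roles named above, and that the Access Context Manager role is granted at the organization level.

```shell
# Hypothetical helper: enable the three APIs used in this tutorial.
# $1 = Google Cloud project ID
enable_transfer_apis() {
  gcloud services enable \
    accesscontextmanager.googleapis.com \
    storage.googleapis.com \
    storagetransfer.googleapis.com \
    --project="$1"
}

# Hypothetical helper: grant the Storage Admin role on the project.
# $1 = project ID, $2 = member, for example user:sysadmin@example.com
grant_storage_admin() {
  gcloud projects add-iam-policy-binding "$1" \
    --member="$2" \
    --role="roles/storage.admin"
}

# Hypothetical helper: grant the Access Context Manager Admin role
# on the organization (assumed to be where this role is granted).
# $1 = numeric organization ID, $2 = member
grant_acm_admin() {
  gcloud organizations add-iam-policy-binding "$1" \
    --member="$2" \
    --role="roles/accesscontextmanager.policyAdmin"
}
```

For example, in Cloud Shell you might run enable_transfer_apis with your project ID, and then call the two grant functions with your own account as the member.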

Configuring AWS access

In this tutorial, you work with existing AWS Identity and Access Management (AWS IAM) users to create an AWS IAM policy to interface with Storage Transfer Service. These policies and users are needed to authenticate your connection to Google Cloud and to help secure your data in transit. This tutorial requires that you have an Amazon S3 bucket to transfer data from; you can use an existing Amazon S3 bucket or you can create a new bucket. You can use a test or sandbox AWS account to avoid affecting production resources in the same account.

Create an AWS IAM policy for Storage Transfer Service and apply it to your bucket

  1. In the AWS Management Console, go to the IAM page.
  2. Click Policies, and then click Create Policy.
  3. In the visual editor, click IAM Policy.
  4. Click S3.
  5. Select the following Access Level checkboxes:

    • List
    • Read
    • Write
  6. In the Resources pane, click Specific.

  7. In the Bucket pane, click Add ARN.

  8. In the Bucket Name field, enter the name of the bucket where you're transferring data from.

  9. Click Review Policy and enter a name such as transfer-user-policy.

  10. Click Create Policy.
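
As an alternative to the visual editor, the policy can be written as a JSON document and created with the AWS CLI. The following is a minimal sketch under stated assumptions. The bucket name and the function name are hypothetical. The sketch grants only list and read actions (s3:ListBucket, s3:GetBucketLocation, and s3:GetObject), which is narrower than the List, Read, and Write checkboxes in the console steps; whether that is enough for your transfer is an assumption to verify.

```shell
# Hypothetical bucket name; replace with your source Amazon S3 bucket.
SOURCE_BUCKET="example-source-bucket"

# Write a policy document that allows listing the bucket and reading its objects.
cat > transfer-user-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::${SOURCE_BUCKET}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::${SOURCE_BUCKET}/*"
    }
  ]
}
EOF

# Hypothetical helper: create the IAM policy from the document above.
create_transfer_policy() {
  aws iam create-policy \
    --policy-name transfer-user-policy \
    --policy-document file://transfer-user-policy.json
}
```

The bucket-level actions apply to the bucket ARN, and the object-level action applies to the ARN with the /* suffix, which is why the document uses two statements.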

Add AWS IAM users to your AWS IAM policy

  1. In the AWS Management Console, go to the IAM page.
  2. Click Users, and then click Add User.
  3. In the Name field, enter transfer-user.
  4. For Access Type, click Programmatic Access and attach the transfer-user-policy that you created for that user.
  5. After you create the user, make a note of the access key ID and secret access key because the pair is used later in the tutorial.
  6. Click Save.
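
The same user setup can be sketched with the AWS CLI. This is an illustrative sketch only. The function name is hypothetical, and it assumes that the policy was created in your own AWS account under the name transfer-user-policy, so that its ARN has the form arn:aws:iam::ACCOUNT_ID:policy/transfer-user-policy.

```shell
# Hypothetical helper: create the user, attach the policy, and issue an access key.
# The last command returns the access key ID and the secret access key.
# $1 = 12-digit AWS account ID that owns transfer-user-policy
create_transfer_user() {
  aws iam create-user --user-name transfer-user &&
  aws iam attach-user-policy \
    --user-name transfer-user \
    --policy-arn "arn:aws:iam::$1:policy/transfer-user-policy" &&
  aws iam create-access-key --user-name transfer-user
}
```

The commands are chained with && so that an access key is issued only if the user was created and the policy was attached.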

Creating a Cloud Storage bucket

Before you can enable your VPC Service Controls perimeter, you need to create a Cloud Storage bucket.

  1. In the Google Cloud console, go to the Cloud Storage Browser.

    Go to the Cloud Storage Browser page

  2. Click Create bucket.

  3. In the Name field, enter a name, such as project-id-destination-bucket, where project-id represents your Google Cloud project ID.

  4. For the Default storage class for the bucket, click Regional storage.

  5. In the Location drop-down list, click a region where the bucket data is stored.

  6. Click Create.
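
The bucket can also be created from Cloud Shell. The following sketch uses a hypothetical function name and assumes that the Standard storage class in a single region is the current equivalent of the Regional storage option in the console steps.

```shell
# Hypothetical helper: create the destination bucket in one region.
# $1 = project ID, $2 = region, for example us-central1
create_destination_bucket() {
  gcloud storage buckets create "gs://$1-destination-bucket" \
    --project="$1" \
    --location="$2" \
    --default-storage-class=STANDARD
}
```

The bucket name follows the project-id-destination-bucket pattern suggested in the steps above.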

Finding the name of your transfer operations service account

Storage Transfer Service uses a Google-managed service account to communicate with Cloud Storage and Pub/Sub resources within your project. You need to determine the name of your service account because it is used later in this tutorial. If you haven't used Storage Transfer Service before, the following steps create the Storage Transfer Service service account for you. For more information about Google-managed service accounts, see Service accounts.

  1. To determine the name of your service account, go to the Storage Transfer Service API page.
  2. In the String field, enter your Google Cloud project ID.

    The name of the service account is typically in the following format: project-PROJECT_NUMBER@storage-transfer-service.iam.gserviceaccount.com
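
Because the name is typically derived from the project number, it can be formatted in Cloud Shell for later use. The sketch below uses hypothetical function names and relies on the typical format shown above. It only formats the name; it does not create the service account, so the API page steps are still needed for a project that has never used Storage Transfer Service.

```shell
# Format the service account name for a given project number.
# $1 = project number
sts_service_account() {
  printf 'project-%s@storage-transfer-service.iam.gserviceaccount.com\n' "$1"
}

# Hypothetical helper: look up the project number, then format the name.
# $1 = project ID
lookup_sts_service_account() {
  sts_service_account "$(gcloud projects describe "$1" --format='value(projectNumber)')"
}
```

Compare the result with the name that the API page returns before you use it in an access level.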

Creating your access policy in Access Context Manager

An access policy collects the service perimeters and access levels you create for your organization. An organization can only have one access policy.

  1. In the Google Cloud console, go to the Settings page.

    Go to the Settings page

  2. Make a note of your Google Cloud project ID and your organization ID.

  3. In Cloud Shell, create a policy:

    gcloud access-context-manager policies create \
        --organization organization-id --title policy-title
    
    • organization-id is the organization ID that you found earlier.
    • policy-title is the title of the policy. For example, Example-Company-Access-Policy.

    The output is as follows:

    Create request issued
    Waiting for operation [accessPolicies/policy-title/create/policy-number] to complete...done.
    Created.
    

    policy-number represents a unique ID assigned to the policy title.
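
The organization ID and the policy number can also be looked up from Cloud Shell. This is a small sketch with hypothetical function names; the exact columns in the output depend on your gcloud version.

```shell
# Hypothetical helper: list the organizations that your account can access.
list_organizations() {
  gcloud organizations list
}

# Hypothetical helper: list the access policies for an organization,
# for example to confirm that the policy was created.
# $1 = numeric organization ID
list_access_policies() {
  gcloud access-context-manager policies list --organization="$1"
}
```

The policy number from this listing is the value that later commands take in the --policy flag.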

Creating your VPC Service Controls perimeter

When you create the VPC Service Controls perimeter, you start with no traffic allowed in. Then, you create an explicit access level to allow the transfer operation to send data into the controlled perimeter.

  1. In the Google Cloud console, go to the VPC Service Controls page.

    Go to the VPC Service Controls page

  2. Click New Perimeter.

  3. In the Name field, enter a name for the perimeter, such as data-transfer-perimeter.

  4. Leave Regular Perimeter selected.

  5. Click Add project and add the project that you created through this tutorial to the list of projects to protect.

  6. Click Cloud Storage API.

  7. Leave Access Levels at the default value.

  8. Click Save.
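
The perimeter can also be created with the gcloud CLI. The following is a sketch under stated assumptions. The function name and the perimeter ID data_transfer_perimeter are hypothetical, and the sketch assumes that storage.googleapis.com is the service name that corresponds to the Cloud Storage API selection in the console.

```shell
# Hypothetical helper: create a regular perimeter that restricts Cloud Storage.
# $1 = project number (not the project ID), $2 = access policy number
create_transfer_perimeter() {
  gcloud access-context-manager perimeters create data_transfer_perimeter \
    --title="data-transfer-perimeter" \
    --perimeter-type=regular \
    --resources="projects/$1" \
    --restricted-services=storage.googleapis.com \
    --policy="$2"
}
```

No access level is attached at this point, which matches the console steps: the perimeter starts with no traffic allowed in.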

Creating an access level in the access policy

In this section, you limit access into the VPC Service Controls perimeter to the principals that you list, such as the Storage Transfer Service service account.

  1. In Cloud Shell, create a YAML file called conditions.yaml that lists the principals that you want to provide access to:

     - members:
         - serviceAccount:project-project-number@storage-transfer-service.iam.gserviceaccount.com
         - user:sysadmin@example.com
     

  2. Create the access level:

    gcloud access-context-manager levels create name \
        --title title \
        --basic-level-spec ./conditions.yaml \
        --combine-function=OR \
        --policy=policy-id
    
    • name is the unique name for the access level. It must begin with a letter and include only letters, numbers, and underscores.
    • title is a title that is unique to the policy, such as trusted-identity-ingest.
    • policy-id is the ID (number) of your organization's access policy.
    • combine-function is set to OR. The default value AND requires that all conditions be met before an access level is granted. The OR value gives the principals access even if other conditions, such as IP address or those inherited from other required access levels, aren't met.

    The output is similar to the following:

    Create request issued for: name
    Waiting for operation [accessPolicies/policy-id/accessLevels/name/create/access-level-number] to complete...done.
    Created level name.
    

    access-level-number represents a unique ID assigned to the access level.
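
The conditions.yaml file from step 1 can be generated in Cloud Shell so that the project number is substituted for you. This is a minimal sketch: the project number is a hypothetical placeholder, and the user entry is the example principal from step 1.

```shell
# Hypothetical project number; replace with your own.
PROJECT_NUMBER="123456789012"

# Write the access level conditions with the project number filled in.
cat > conditions.yaml <<EOF
- members:
    - serviceAccount:project-${PROJECT_NUMBER}@storage-transfer-service.iam.gserviceaccount.com
    - user:sysadmin@example.com
EOF
```

The file is written to the current directory, so run the levels create command from the same directory or adjust the path that you pass to --basic-level-spec.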

Binding the access level to VPC Service Controls

  1. In the Google Cloud console, go to VPC Service Controls.

    Go to the VPC Service Controls page

  2. Click Edit for the VPC Service Controls perimeter that you created.

  3. Click Access Level and select the trusted-identity-ingest access level.

  4. Click Save.

Now the only operations that are allowed in the controlled perimeter are from the service account that you defined.
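
The binding can also be sketched with the gcloud CLI. The function name is hypothetical, and the sketch assumes a perimeter ID of data_transfer_perimeter and an access level name of trusted_identity_ingest; substitute the names that you used.

```shell
# Hypothetical helper: add the access level to the existing perimeter.
# $1 = access policy number
bind_access_level() {
  gcloud access-context-manager perimeters update data_transfer_perimeter \
    --add-access-levels=trusted_identity_ingest \
    --policy="$1"
}
```

The --add-access-levels flag appends to the levels that are already bound to the perimeter instead of replacing them.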

Initiating the transfer

  1. In the Google Cloud console, go to the Transfer page.

    Go to the Transfer page

  2. Click Create transfer.

  3. Click Amazon S3 bucket.

  4. In the Amazon S3 bucket field, enter the source Amazon S3 bucket name as it appears in the AWS Management Console.

  5. Enter the Access key ID and Secret key for the transfer-user AWS IAM user. You noted these values earlier in this tutorial.

  6. In Select destination, enter the name of the bucket that you created in your perimeter, such as project-id-destination-bucket.

  7. For Configure transfer, schedule your transfer job to Run now.

  8. Optional: Edit the transfer job name.

  9. For Description, use a unique, descriptive name to help you identify your transfer job later.

  10. Click Create.
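
A transfer that repeats on a schedule can also be sketched with the gcloud CLI. The following is illustrative only. The function names, the aws-creds.json file name, and the daily interval are assumptions, and the sketch assumes that the credentials file uses the accessKeyId and secretAccessKey field names.

```shell
# Hypothetical helper: write the AWS credentials file, readable only by you.
# $1 = access key ID, $2 = secret access key
write_aws_creds() {
  (
    umask 077
    printf '{\n  "accessKeyId": "%s",\n  "secretAccessKey": "%s"\n}\n' "$1" "$2" \
      > aws-creds.json
  )
}

# Hypothetical helper: create a transfer job that repeats every day.
# $1 = source Amazon S3 bucket, $2 = destination Cloud Storage bucket
create_daily_transfer() {
  gcloud transfer jobs create "s3://$1" "gs://$2" \
    --source-creds-file=aws-creds.json \
    --description="Daily S3 to Cloud Storage transfer" \
    --schedule-repeats-every=1d
}
```

The umask call runs in a subshell so that the restrictive file mode applies only to the credentials file and not to the rest of your session. Delete the file after the job is created.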

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

What's next