Create transfer jobs

This page shows you how to create and start transfer jobs.

To see whether your source and destination (also known as a sink) are supported by Storage Transfer Service, refer to Supported sources and sinks.

Agents and agent pools

Depending on your source and destination, you may need to create and configure an agent pool and install agents on a machine with access to your source or destination.

  • Transfers from Amazon S3, Microsoft Azure, URL lists, or Cloud Storage to Cloud Storage don't require agents and agent pools.

  • Transfers whose source and/or destination is a file system, or whose source is S3-compatible storage, do require agents and agent pools. See Manage agent pools for instructions, and the sketch below for the commands involved.
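
As a minimal sketch of that setup using the gcloud CLI (the pool name and agent count here are illustrative, not prescribed by this page), you would create a pool and then install agents into it:

gcloud transfer agent-pools create my-agent-pool

gcloud transfer agents install --pool=my-agent-pool --count=3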

Before you begin

Before configuring your transfer, make sure you've configured the following access:

If you're using gcloud commands, install the gcloud CLI.

Create a transfer

Don't include sensitive information such as personally identifiable information (PII) or security data in your transfer job name. Resource names may be propagated to the names of other Google Cloud resources and may be exposed to Google-internal systems outside of your project.

Google Cloud console

  1. Go to the Storage Transfer Service page in the Google Cloud console.

    Go to Storage Transfer Service

  2. Click Create transfer job. The Create a transfer job page is displayed.

  3. Choose a source:

    Cloud Storage

    Your user account must have the storage.buckets.get permission to select source and destination buckets. Alternatively, you can type the name of the bucket directly. For more information, see Troubleshooting access issues.

    1. Under Source type, select Cloud Storage.

    2. Select your Destination type.

    3. If your destination is Cloud Storage, select your Scheduling mode. Batch transfers execute on a one-time or scheduled basis. Event-driven transfers continuously monitor the source and transfer data when it's added or modified.

      To configure an event-driven transfer, follow the instructions at Event-driven transfers.

    4. Click Next step.

    5. Select a bucket and (optionally) a folder in that bucket by doing one of the following:

      • Enter an existing Cloud Storage bucket name and path in the Bucket or folder field without the prefix gs://. For example, my-test-bucket/path/to/files. To specify a Cloud Storage bucket from another project, type the name exactly into the Bucket name field.

      • Select a list of existing buckets in your projects by clicking Browse, then selecting a bucket.

        When you click Browse, you can select buckets in other projects by clicking the Project ID, then selecting the new Project ID and bucket.

      • To create a new bucket, click Create new bucket.

    6. If this is an event-driven transfer, enter the Pub/Sub subscription name, which takes the following format:

      projects/PROJECT_NAME/subscriptions/SUBSCRIPTION_ID
      
    7. Optionally, choose to filter objects by prefix or by last modified date. If you specified a folder as your source location, prefix filters are relative to that folder. For example, if your source is my-test-bucket/path/, an include filter of file includes all files starting with my-test-bucket/path/file.
    8. Click Next step.

    Amazon S3

    1. Under Source type, select Amazon S3.

    2. For Destination type, select Google Cloud Storage.

    3. Select your Scheduling mode. Batch transfers execute on a one-time or scheduled basis. Event-driven transfers continuously monitor the source and transfer data when it's added or modified.

      To configure an event-driven transfer, follow the instructions at Event-driven transfers.

    4. Click Next step.

    5. In the Bucket or folder name field, enter the source bucket name.

      The bucket name is the name as it appears in the AWS Management Console.

    6. If you're using a CloudFront distribution to transfer from S3, enter the distribution domain name in the CloudFront domain field. For example, https://dy1h2n3l4ob56.cloudfront.net. See Transfer from S3 via CloudFront to configure a CloudFront distribution.

    7. Select your Amazon Web Services (AWS) authentication method. You can provide an AWS access key, or an Amazon Resource Name (ARN) for identity federation:

      • Access key: Enter your access key in the Access key ID field and the secret associated with your access key in the Secret access key field.

      • ARN: Enter your ARN in the AWS IAM role ARN field, with the following syntax:

        arn:aws:iam::ACCOUNT:role/ROLE-NAME-WITH-PATH
        

        Where:

        • ACCOUNT: The AWS account ID with no hyphens.
        • ROLE-NAME-WITH-PATH: The AWS role name, including its path.

        For more information on ARNs, see IAM ARNs.

    8. If this is an event-driven transfer, enter the Amazon SQS queue ARN, which takes the following format:

      arn:aws:sqs:us-east-1:1234567890:event-queue
      
    9. Optionally, choose to filter objects by prefix or by last modified date. If you specified a folder as your source location, prefix filters are relative to that folder. For example, if your source is my-test-bucket/path/, an include filter of file includes all files starting with my-test-bucket/path/file.
    10. Click Next step.

    S3-compatible storage

    1. Under Source type, select S3-compatible object storage.

    2. Click Next step.

    3. Specify the information required for this transfer:

      1. Select the agent pool you configured for this transfer.

      2. Enter the bucket name relative to the endpoint. For example, if your data resides at:

        https://us-east-1.example.com/folder1/bucket_a

        Enter: folder1/bucket_a

      3. Enter the endpoint. Don't include the protocol (http:// or https://). Following the example from the previous step, you would enter:

        us-east-1.example.com

    4. Specify any optional attributes for this transfer:

      1. Enter the signing region to use for signing requests. Leave this field empty if your source doesn't require a signing region.

      2. Select the signing process for this request.

      3. Select the addressing style. This determines whether the bucket name is provided in path style (for example, https://s3.region.example.com/bucket-name/key-name) or virtual hosted style (for example, https://bucket-name.s3.region.example.com/key-name). Read Virtual hosting of buckets in the Amazon documentation for more information.

      4. Select the network protocol.

      5. Select the listing API version to use. Refer to the ListObjectsV2 and ListObjects documentation for more information.

    5. Click Next step.

    Microsoft Azure Blob Storage

    1. Under Source type, select Azure Blob Storage or Data Lake Storage Gen2.

    2. Click Next step.

    3. Specify the following:

      1. Storage account name — the source Microsoft Azure Storage account name.

        The storage account name is displayed in the Microsoft Azure Storage portal under All services > Storage > Storage accounts.

      2. Container name — the Microsoft Azure Storage container name.

        The container name is displayed in the Microsoft Azure Storage portal under Storage explorer > Blob containers.

      3. Shared access signature (SAS) — the Microsoft Azure Storage SAS token created from a stored access policy. For more information, see Grant limited access to Azure Storage resources using shared access signatures (SAS).

        The default expiration time for SAS tokens is 8 hours. When you create your SAS token, be sure to set a reasonable expiration time that enables you to successfully complete your transfer.
    4. Optionally, choose to filter objects by prefix or by last modified date. If you specified a folder as your source location, prefix filters are relative to that folder. For example, if your source is my-test-bucket/path/, an include filter of file includes all files starting with my-test-bucket/path/file.
    5. Click Next step.

    File system

    1. Under Source type, select POSIX file system.

    2. Select your Destination type, and click Next step.

    3. Select an existing agent pool, or select Create agent pool and follow the instructions to create a new pool.

    4. Specify the fully qualified path of the file system directory.

    5. Click Next step.

    HDFS

    See Transfer from HDFS to Cloud Storage.

    URL list

    1. Under Source type, select URL list and click Next step.

    2. Under URL of TSV file, provide the URL to a tab-separated values (TSV) file. See Create a URL list for details about how to create the TSV file.

    3. Optionally, choose to filter objects by prefix or by last modified date. If you specified a folder as your source location, prefix filters are relative to that folder. For example, if your source is my-test-bucket/path/, an include filter of file includes all files starting with my-test-bucket/path/file.
    4. Click Next step.

  4. Choose a destination:

    Cloud Storage

    1. In the Bucket or folder field, enter the destination bucket and (optionally) folder name, or click Browse to select a bucket from a list of existing buckets in your current project. To create a new bucket, click Create new bucket.

    2. Click Next step.

    3. Choose settings for the transfer job. Some options are only available for certain source/sink combinations.

      1. In the Description field, enter a description of the transfer. As a best practice, enter a description that is meaningful and unique so that you can tell jobs apart.

      2. Under Metadata options, choose to use the default options, or click View and select options to specify values for all supported metadata. See Metadata preservation for details.

      3. Under When to overwrite, select one of the following:

        • If different: Overwrites destination files if the source file with the same name has a different ETag or checksum value.

        • Always: Always overwrites destination files when the source file has the same name, even if they're identical.

      4. Under When to delete, select one of the following:

        • Never: Never delete files from either the source or destination.

        • Delete files from source after they're transferred: Delete files from the source after they're transferred to the destination.

        • Delete files from destination if they're not also at source: If files in the destination Cloud Storage bucket aren't also in the source, delete them from the Cloud Storage bucket.

          This option ensures that the destination Cloud Storage bucket exactly matches your source.

      5. Under Notification options, select your Pub/Sub topic and which events to notify for. See Pub/Sub notifications for more details.

    4. Click Next step.

    File system

    1. Select an existing agent pool, or select Create agent pool and follow the instructions to create a new pool.

    2. Specify the fully qualified path of the destination directory.

    3. Click Next step.

  5. Choose your scheduling options:

    1. From the Run once drop-down list, select one of the following:

      • Run once: Runs a single transfer, starting at a time that you select.

      • Run every day: Runs a transfer daily, starting at a time that you select.

        You can enter an optional End date, or leave End date blank to run the transfer continually.

      • Run every week: Runs a transfer weekly, starting at a time that you select.

      • Run with custom frequency: Runs a transfer at a frequency that you select. You can choose to repeat the transfer at a regular interval of hours.

        You can enter an optional End date, or leave End date blank to run the transfer continually.

    2. From the Starting now drop-down list, select one of the following:

      • Starting now: Starts the transfer after you click Create.

      • Starting on: Starts the transfer on the date and time that you select. Click Calendar to display a calendar and choose the start date.

    3. To create your transfer job, click Create.

gcloud CLI

To create a new transfer job, use the gcloud transfer jobs create command. Creating a new job initiates the specified transfer, unless a schedule or --do-not-run is specified.

gcloud transfer jobs create \
  SOURCE DESTINATION

Where:

  • SOURCE is the data source for this transfer. The format for each source is:

    • Cloud Storage: gs://BUCKET_NAME. To transfer from a specific folder, specify gs://BUCKET_NAME/FOLDER_PATH/, including the trailing slash.
    • Amazon S3: s3://BUCKET_NAME/FOLDER_PATH
    • S3-compatible storage: s3://BUCKET_NAME. The bucket name is relative to the endpoint. For example, if your data resides at https://us-east-1.example.com/folder1/bucket_a, enter s3://folder1/bucket_a.
    • Microsoft Azure Storage: https://myaccount.blob.core.windows.net/CONTAINER_NAME
    • URL list: https://PATH_TO_URL_LIST or http://PATH_TO_URL_LIST
    • POSIX file system: posix:///PATH. This must be an absolute path from the root of the agent host machine.
    • HDFS: hdfs:///PATH
  • DESTINATION is one of:

    • Cloud Storage: gs://BUCKET_NAME. To transfer into a specific directory, specify gs://BUCKET_NAME/FOLDER_PATH/, including the trailing slash.
    • POSIX file system: posix:///PATH. This must be an absolute path from the root of the agent host machine.
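
For example, a minimal sketch of a bucket-to-bucket job (the bucket names here are placeholders) looks like this:

gcloud transfer jobs create \
  gs://my-source-bucket gs://my-destination-bucket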

If your transfer requires transfer agents, the following options are available:

  • --source-agent-pool specifies the source agent pool to use for this transfer. Required for transfers originating from a file system.

  • --destination-agent-pool specifies the destination agent pool to use for this transfer. Required for transfers going to a file system.

  • --intermediate-storage-path is the path to a Cloud Storage bucket, in the form gs://my-intermediary-bucket. Required for transfers between two file systems. See Create a Cloud Storage bucket as an intermediary for details on creating the intermediate bucket.

Additional options include (see the sketch after this list for how several combine):

  • --source-creds-file specifies the relative path to a local file on your machine that includes AWS or Azure credentials for the transfer source. For credential file formatting information, see the TransferSpec reference.

  • --do-not-run prevents Storage Transfer Service from running the job upon submission of the command. To run the job, update the job to add a schedule, or use jobs run to start it manually.

  • --manifest-file specifies the path to a CSV file in Cloud Storage containing a list of files to transfer from your source. For manifest file formatting, see Transfer specific files or objects using a manifest.

  • Job information: You can specify --name, --description, and --source-creds-file.

  • Schedule: Specify --schedule-starts, --schedule-repeats-every, and --schedule-repeats-until, or --do-not-run.

  • Object conditions: Use conditions to determine which objects are transferred. These include --include-prefixes and --exclude-prefixes, and the time-based conditions in --include-modified-[before | after]-[absolute | relative]. If you specified a folder with your source, prefix filters are relative to that folder. See Filter source objects by prefix for more information.

    Object conditions aren't supported by transfers involving file systems.

  • Transfer options: Specify whether to overwrite destination files (--overwrite-when=different or always) and whether to delete certain files during or after the transfer (--delete-from=destination-if-unique or source-after-transfer); specify which metadata values to preserve (--preserve-metadata); and optionally set a storage class on transferred objects (--custom-storage-class).

  • Notifications: Configure Pub/Sub notifications for transfers with --notification-pubsub-topic, --notification-event-types, and --notification-payload-format.

  • Cloud Logging: Enable Cloud Logging for agentless transfers, or transfers from S3-compatible sources, with --log-actions and --log-action-states. See Cloud Logging for Storage Transfer Service for details.
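
As a sketch of how these options combine (the job name, prefix, and bucket names are illustrative), the following creates a prefix-filtered job without running it, then starts it manually:

gcloud transfer jobs create \
  gs://my-source-bucket gs://my-destination-bucket \
  --name=my-prefix-job \
  --include-prefixes=logs/ \
  --overwrite-when=different \
  --do-not-run

gcloud transfer jobs run my-prefix-job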

When transferring from S3-compatible storage, the following options are also available:

  • --source-endpoint (required) specifies your storage system's endpoint. For example, s3.example.com. Check with your provider for the correct formatting. Don't specify the protocol (http:// or https://).
  • --source-signing-region specifies a region for signing requests. Omit this flag if your storage provider doesn't require a signing region.
  • --source-auth-method specifies the authentication method to use. Valid values are AWS_SIGNATURE_V2 or AWS_SIGNATURE_V4. Refer to Amazon's SigV4 and SigV2 documentation for more information.
  • --source-request-model specifies the addressing style to use. Valid values are PATH_STYLE or VIRTUAL_HOSTED_STYLE. Path style uses the format https://s3.example.com/BUCKET_NAME/KEY_NAME. Virtual hosted style uses the format https://BUCKET_NAME.s3.example.com/KEY_NAME.
  • --source-network-protocol specifies the network protocol that agents should use for this job. Valid values are HTTP or HTTPS.
  • --source-list-api specifies the version of the S3 listing API for returning objects from the bucket. Valid values are LIST_OBJECTS or LIST_OBJECTS_V2. Refer to Amazon's ListObjectsV2 and ListObjects documentation for more information.

To see all options, run gcloud transfer jobs create --help or refer to the gcloud reference documentation.

Examples

Amazon S3 to Cloud Storage

Run the following command to create a daily transfer job to move data modified in the last 1 day from s3://my-s3-bucket to a Cloud Storage bucket named my-storage-bucket:

gcloud transfer jobs create \
  s3://my-s3-bucket gs://my-storage-bucket \
  --source-creds-file=/examplefolder/creds.txt \
  --include-modified-after-relative=1d \
  --schedule-repeats-every=1d

S3-compatible storage to Cloud Storage

To transfer from an S3-compatible source to a Cloud Storage bucket:

gcloud transfer jobs create s3://my-s3-compat-bucket gs://my-storage-bucket \
  --source-agent-pool=my-source-agent-pool \
  --source-endpoint=us-east-1.example.com \
  --source-auth-method=AWS_SIGNATURE_V2 \
  --source-request-model=PATH_STYLE \
  --source-network-protocol=HTTPS \
  --source-list-api=LIST_OBJECTS_V2

File system to Cloud Storage

To transfer from a file system to a Cloud Storage bucket, specify the following.

gcloud transfer jobs create \
  posix:///tmp/source gs://my-storage-bucket \
  --source-agent-pool=my-source-agent-pool

Cloud Storage to file system

To transfer from a Cloud Storage bucket to a file system, specify the following.

gcloud transfer jobs create \
  gs://my-storage-bucket posix:///tmp/destination \
  --destination-agent-pool=my-destination-agent-pool

File system to file system

To transfer between two file systems, you must specify a source agent pool, a destination agent pool, and an intermediate Cloud Storage bucket through which the data passes.

See Create a Cloud Storage bucket as an intermediary for details on the intermediate bucket.

Then, specify these 3 resources when calling transfer jobs create:

gcloud transfer jobs create \
  posix:///tmp/source/on/systemA posix:///tmp/destination/on/systemB \
  --source-agent-pool=source_agent_pool \
  --destination-agent-pool=destination_agent_pool \
  --intermediate-storage-path=gs://my-intermediary-bucket

REST

The following samples show you how to use Storage Transfer Service through the REST API.

When you configure or edit transfer jobs using the Storage Transfer Service API, the time must be in UTC. For more information on specifying the schedule of a transfer job, see Schedule.

Transfer between Cloud Storage buckets

This sample shows how to move files from one Cloud Storage bucket to another. For example, you can move data to a bucket in another location.

Make a request using transferJobs create:

POST https://storagetransfer.googleapis.com/v1/transferJobs
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "schedule": {
      "scheduleStartDate": {
          "day": 1,
          "month": 1,
          "year": 2015
      },
      "startTimeOfDay": {
          "hours": 1,
          "minutes": 1
      }
  },
  "transferSpec": {
      "gcsDataSource": {
          "bucketName": "GCS_SOURCE_NAME"
      },
      "gcsDataSink": {
          "bucketName": "GCS_SINK_NAME"
      },
      "transferOptions": {
          "deleteObjectsFromSourceAfterTransfer": true
      }
  }
}
Response:
200 OK
{
  "transferJob": [
      {
          "creationTime": "2015-01-01T01:01:00.000000000Z",
          "description": "YOUR DESCRIPTION",
          "name": "transferJobs/JOB_ID",
          "status": "ENABLED",
          "lastModificationTime": "2015-01-01T01:01:00.000000000Z",
          "projectId": "PROJECT_ID",
          "schedule": {
              "scheduleStartDate": {
                  "day": 1,
                  "month": 1,
                  "year": 2015
              },
              "startTimeOfDay": {
                  "hours": 1,
                  "minutes": 1
              }
          },
          "transferSpec": {
              "gcsDataSource": {
                  "bucketName": "GCS_SOURCE_NAME",
              },
              "gcsDataSink": {
                  "bucketName": "GCS_NEARLINE_SINK_NAME"
              },
              "objectConditions": {
                  "minTimeElapsedSinceLastModification": "2592000.000s"
              },
              "transferOptions": {
                  "deleteObjectsFromSourceAfterTransfer": true
              }
          }
      }
  ]
}

Transfer from Amazon S3 to Cloud Storage

This sample shows how to move files from Amazon S3 to a Cloud Storage bucket. Be sure to review Configure access to Amazon S3 and Pricing to understand the implications of moving data from Amazon S3 to Cloud Storage.

When creating transfer jobs, don't include the s3:// prefix for bucketName in Amazon S3 bucket source names.

Make a request using transferJobs create:

POST https://storagetransfer.googleapis.com/v1/transferJobs
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "schedule": {
      "scheduleStartDate": {
          "day": 1,
          "month": 1,
          "year": 2015
      },
      "scheduleEndDate": {
          "day": 1,
          "month": 1,
          "year": 2015
      },
      "startTimeOfDay": {
          "hours": 1,
          "minutes": 1
      }
  },
  "transferSpec": {
      "awsS3DataSource": {
          "bucketName": "AWS_SOURCE_NAME",
          "awsAccessKey": {
              "accessKeyId": "AWS_ACCESS_KEY_ID",
              "secretAccessKey": "AWS_SECRET_ACCESS_KEY"
          }
      },
      "gcsDataSink": {
          "bucketName": "GCS_SINK_NAME"
      }
  }
}
Response:
200 OK
{
  "transferJob": [
      {
          "creationTime": "2015-01-01T01:01:00.000000000Z",
          "description": "YOUR DESCRIPTION",
          "name": "transferJobs/JOB_ID",
          "status": "ENABLED",
          "lastModificationTime": "2015-01-01T01:01:00.000000000Z",
          "projectId": "PROJECT_ID",
          "schedule": {
              "scheduleStartDate": {
                  "day": 1,
                  "month": 1,
                  "year": 2015
              },
              "scheduleEndDate": {
                  "day": 1,
                  "month": 1,
                  "year": 2015
              },
              "startTimeOfDay": {
                  "hours": 1,
                  "minutes": 1
              }
          },
          "transferSpec": {
              "awsS3DataSource": {
                  "bucketName": "AWS_SOURCE_NAME"
              },
              "gcsDataSink": {
                  "bucketName": "GCS_SINK_NAME"
              },
              "objectConditions": {},
              "transferOptions": {}
          }
      }
  ]
}

If you're transferring from S3 using a CloudFront distribution, specify the distribution domain name as the value of the transferSpec.awsS3DataSource.cloudfrontDomain field:

"transferSpec": {
  "awsS3DataSource": {
    "bucketName": "AWS_SOURCE_NAME",
    "cloudfrontDomain": "https://dy1h2n3l4ob56.cloudfront.net",
    "awsAccessKey": {
      "accessKeyId": "AWS_ACCESS_KEY_ID",
      "secretAccessKey": "AWS_SECRET_ACCESS_KEY"
    }
  },
  ...
}

Transfer between Microsoft Azure Blob Storage and Cloud Storage

This sample shows how to move files from Microsoft Azure Storage to a Cloud Storage bucket, using a Microsoft Azure Storage shared access signature (SAS) token.

For more information on Microsoft Azure Storage SAS, see Grant limited access to Azure Storage resources using shared access signatures (SAS).

Before starting, review Configure access to Microsoft Azure Storage and Pricing to understand the implications of moving data from Microsoft Azure Storage to Cloud Storage.

Make a request using transferJobs create:

POST https://storagetransfer.googleapis.com/v1/transferJobs
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "schedule": {
      "scheduleStartDate": {
          "day": 14,
          "month": 2,
          "year": 2020
      },
      "scheduleEndDate": {
          "day": 14
          "month": 2,
          "year": 2020
      },
      "startTimeOfDay": {
          "hours": 1,
          "minutes": 1
      }
  },
  "transferSpec": {
      "azureBlobStorageDataSource": {
          "storageAccount": "AZURE_SOURCE_NAME",
          "azureCredentials": {
              "sasToken": "AZURE_SAS_TOKEN",
          },
          "container": "AZURE_CONTAINER",
      },
      "gcsDataSink": {
          "bucketName": "GCS_SINK_NAME"
      }
  }
}
Response:
200 OK
{
  "transferJob": [
      {
          "creationTime": "2020-02-14T01:01:00.000000000Z",
          "description": "YOUR DESCRIPTION",
          "name": "transferJobs/JOB_ID",
          "status": "ENABLED",
          "lastModificationTime": "2020-02-14T01:01:00.000000000Z",
          "projectId": "PROJECT_ID",
          "schedule": {
              "scheduleStartDate": {
                  "day": 14
                  "month": 2,
                  "year": 2020
              },
              "scheduleEndDate": {
                  "day": 14,
                  "month": 2,
                  "year": 2020
              },
              "startTimeOfDay": {
                  "hours": 1,
                  "minutes": 1
              }
          },
          "transferSpec": {
              "azureBlobStorageDataSource": {
                  "storageAccount": "AZURE_SOURCE_NAME",
                  "azureCredentials": {
                      "sasToken": "AZURE_SAS_TOKEN",
                  },
                  "container": "AZURE_CONTAINER",
              },
              "objectConditions": {},
              "transferOptions": {}
          }
      }
  ]
}

Transfer from a file system

This sample shows how to move files from a POSIX file system to a Cloud Storage bucket.

Use transferJobs.create with a posixDataSource:

POST https://storagetransfer.googleapis.com/v1/transferJobs
{
 "name":"transferJobs/sample_transfer",
 "description": "My First Transfer",
 "status": "ENABLED",
 "projectId": "my_transfer_project_id",
 "schedule": {
     "scheduleStartDate": {
         "year": 2022,
         "month": 5,
         "day": 2
     },
     "startTimeOfDay": {
         "hours": 22,
         "minutes": 30,
         "seconds": 0,
         "nanos": 0
     }
     "scheduleEndDate": {
         "year": 2022,
         "month": 12,
         "day": 31
     },
     "repeatInterval": {
         "259200s"
     },
 },
 "transferSpec": {
     "posixDataSource": {
          "rootDirectory": "/bar/",
     },
     "sourceAgentPoolName": "my_example_pool",
     "gcsDataSink": {
          "bucketName": "destination_bucket"
          "path": "foo/bar/"
     },
  }
}

The schedule field is optional; if it's not included, the transfer job must be started with a transferJobs.run request.
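
For example, to start the job above manually, a transferJobs.run request looks like the following sketch (reusing the job name and project ID from the create request):

POST https://storagetransfer.googleapis.com/v1/transferJobs/sample_transfer:run
{
  "projectId": "my_transfer_project_id"
}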

To check your transfer's status after creating a job, use transferJobs.get:

GET https://storagetransfer.googleapis.com/v1/transferJobs/sample_transfer?project_id=my_transfer_project_id

Specify source and destination paths

Source and destination paths enable you to specify source and destination directories when transferring data to your Cloud Storage bucket. For example, say you have files file1.txt and file2.txt and a Cloud Storage bucket named B. If you set a destination path named my-stuff, then after the transfer completes, your files are located at gs://B/my-stuff/file1.txt and gs://B/my-stuff/file2.txt.

Specify a source path

To specify a source path when creating a transfer job, add a path field to the gcsDataSource field in your TransferSpec specification:

{
  "gcsDataSource": {
    "bucketName": "SOURCE_BUCKET",
    "path": "SOURCE_PATH/"
  }
}

In this example:

  • SOURCE_BUCKET: The source Cloud Storage bucket.
  • SOURCE_PATH: The source Cloud Storage path.

Specify a destination path

To specify a destination folder when you create a transfer job, add a path field to the gcsDataSink field in your TransferSpec specification:

{
  "gcsDataSink": {
    "bucketName": "DESTINATION_BUCKET",
    "path": "DESTINATION_PATH/"
  }
}

In this example:

  • DESTINATION_BUCKET: The destination Cloud Storage bucket.
  • DESTINATION_PATH: The destination Cloud Storage path.

Complete example request

The following is an example of a complete request:

POST https://storagetransfer.googleapis.com/v1/transferJobs
{
  "description": "YOUR DESCRIPTION",
  "status": "ENABLED",
  "projectId": "PROJECT_ID",
  "schedule": {
      "scheduleStartDate": {
          "day": 1,
          "month": 1,
          "year": 2015
      },
      "startTimeOfDay": {
          "hours": 1,
          "minutes": 1
      }
  },
  "transferSpec": {
      "gcsDataSource": {
          "bucketName": "GCS_SOURCE_NAME",
          "path": "GCS_SOURCE_PATH",
      },
      "gcsDataSink": {
          "bucketName": "GCS_SINK_NAME",
          "path": "GCS_SINK_PATH",
      },
      "objectConditions": {
          "minTimeElapsedSinceLastModification": "2592000s"
      },
      "transferOptions": {
          "deleteObjectsFromSourceAfterTransfer": true
      }
  }

}

Client libraries

The following samples show you how to use Storage Transfer Service programmatically with Go, Java, Node.js, and Python.

When you configure or edit transfer jobs programmatically, the time must be in UTC. For more information on specifying the schedule of a transfer job, see Schedule.

For more information about the Storage Transfer Service client libraries, see Getting started with Storage Transfer Service client libraries.

Transfer between Cloud Storage buckets

This sample shows how to move files from one Cloud Storage bucket to another. For example, you can move data to a bucket in another location.

Go

import (
	"context"
	"fmt"
	"io"
	"time"

	"google.golang.org/genproto/googleapis/type/date"
	"google.golang.org/genproto/googleapis/type/timeofday"
	"google.golang.org/protobuf/types/known/durationpb"

	storagetransfer "cloud.google.com/go/storagetransfer/apiv1"
	"cloud.google.com/go/storagetransfer/apiv1/storagetransferpb"
)

func transferToNearline(w io.Writer, projectID string, gcsSourceBucket string, gcsNearlineSinkBucket string) (*storagetransferpb.TransferJob, error) {
	// Your Google Cloud Project ID
	// projectID := "my-project-id"

	// The name of the GCS bucket to transfer objects from
	// gcsSourceBucket := "my-source-bucket"

	// The name of the Nearline GCS bucket to transfer objects to
	// gcsNearlineSinkBucket := "my-sink-bucket"

	ctx := context.Background()
	client, err := storagetransfer.NewClient(ctx)
	if err != nil {
		return nil, fmt.Errorf("storagetransfer.NewClient: %w", err)
	}
	defer client.Close()

	// A description of this job
	jobDescription := "Transfers objects that haven't been modified in 30 days to a Nearline bucket"

	// The time to start the transfer
	startTime := time.Now().UTC()

	req := &storagetransferpb.CreateTransferJobRequest{
		TransferJob: &storagetransferpb.TransferJob{
			ProjectId:   projectID,
			Description: jobDescription,
			TransferSpec: &storagetransferpb.TransferSpec{
				DataSink: &storagetransferpb.TransferSpec_GcsDataSink{
					GcsDataSink: &storagetransferpb.GcsData{BucketName: gcsNearlineSinkBucket}},
				DataSource: &storagetransferpb.TransferSpec_GcsDataSource{
					GcsDataSource: &storagetransferpb.GcsData{BucketName: gcsSourceBucket},
				},
				ObjectConditions: &storagetransferpb.ObjectConditions{
					MinTimeElapsedSinceLastModification: &durationpb.Duration{Seconds: 2592000 /*30 days */},
				},
				TransferOptions: &storagetransferpb.TransferOptions{DeleteObjectsFromSourceAfterTransfer: true},
			},
			Schedule: &storagetransferpb.Schedule{
				ScheduleStartDate: &date.Date{
					Year:  int32(startTime.Year()),
					Month: int32(startTime.Month()),
					Day:   int32(startTime.Day()),
				},
				ScheduleEndDate: &date.Date{
					Year:  int32(startTime.Year()),
					Month: int32(startTime.Month()),
					Day:   int32(startTime.Day()),
				},
				StartTimeOfDay: &timeofday.TimeOfDay{
					Hours:   int32(startTime.Hour()),
					Minutes: int32(startTime.Minute()),
					Seconds: int32(startTime.Second()),
				},
			},
			Status: storagetransferpb.TransferJob_ENABLED,
		},
	}
	resp, err := client.CreateTransferJob(ctx, req)
	if err != nil {
		return nil, fmt.Errorf("failed to create transfer job: %w", err)
	}
	if _, err = client.RunTransferJob(ctx, &storagetransferpb.RunTransferJobRequest{
		ProjectId: projectID,
		JobName:   resp.Name,
	}); err != nil {
		return nil, fmt.Errorf("failed to run transfer job: %w", err)
	}
	fmt.Fprintf(w, "Created and ran transfer job from %v to %v with name %v", gcsSourceBucket, gcsNearlineSinkBucket, resp.Name)
	return resp, nil
}

Java

Looking for older samples? See the Storage Transfer Service migration guide.

import com.google.protobuf.Duration;
import com.google.storagetransfer.v1.proto.StorageTransferServiceClient;
import com.google.storagetransfer.v1.proto.TransferProto.CreateTransferJobRequest;
import com.google.storagetransfer.v1.proto.TransferTypes.GcsData;
import com.google.storagetransfer.v1.proto.TransferTypes.ObjectConditions;
import com.google.storagetransfer.v1.proto.TransferTypes.Schedule;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferJob;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferJob.Status;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferOptions;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferSpec;
import com.google.type.Date;
import com.google.type.TimeOfDay;
import java.io.IOException;
import java.util.Calendar;

public class TransferToNearline {
  /**
   * Creates a one-off transfer job that transfers objects in a standard GCS bucket that are more
   * than 30 days old to a Nearline GCS bucket.
   */
  public static void transferToNearline(
      String projectId,
      String jobDescription,
      String gcsSourceBucket,
      String gcsNearlineSinkBucket,
      long startDateTime)
      throws IOException {

    // Your Google Cloud Project ID
    // String projectId = "your-project-id";

    // A short description of this job
    // String jobDescription = "Sample transfer job of old objects to a Nearline GCS bucket.";

    // The name of the source GCS bucket to transfer data from
    // String gcsSourceBucket = "your-gcs-source-bucket";

    // The name of the Nearline GCS bucket to transfer old objects to
    // String gcsSinkBucket = "your-nearline-gcs-bucket";

    // What day and time in UTC to start the transfer, expressed as an epoch date timestamp.
    // If this is in the past relative to when the job is created, it will run the next day.
    // long startDateTime =
    //     new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").parse("2000-01-01 00:00:00").getTime();

    // Parse epoch timestamp into the model classes
    Calendar startCalendar = Calendar.getInstance();
    startCalendar.setTimeInMillis(startDateTime);
    // Note that this is a Date from the model class package, not a java.util.Date
    Date date =
        Date.newBuilder()
            .setYear(startCalendar.get(Calendar.YEAR))
            .setMonth(startCalendar.get(Calendar.MONTH) + 1)
            .setDay(startCalendar.get(Calendar.DAY_OF_MONTH))
            .build();
    TimeOfDay time =
        TimeOfDay.newBuilder()
            .setHours(startCalendar.get(Calendar.HOUR_OF_DAY))
            .setMinutes(startCalendar.get(Calendar.MINUTE))
            .setSeconds(startCalendar.get(Calendar.SECOND))
            .build();

    TransferJob transferJob =
        TransferJob.newBuilder()
            .setDescription(jobDescription)
            .setProjectId(projectId)
            .setTransferSpec(
                TransferSpec.newBuilder()
                    .setGcsDataSource(GcsData.newBuilder().setBucketName(gcsSourceBucket))
                    .setGcsDataSink(GcsData.newBuilder().setBucketName(gcsNearlineSinkBucket))
                    .setObjectConditions(
                        ObjectConditions.newBuilder()
                            .setMinTimeElapsedSinceLastModification(
                                Duration.newBuilder().setSeconds(2592000 /* 30 days */)))
                    .setTransferOptions(
                        TransferOptions.newBuilder().setDeleteObjectsFromSourceAfterTransfer(true)))
            .setSchedule(Schedule.newBuilder().setScheduleStartDate(date).setStartTimeOfDay(time))
            .setStatus(Status.ENABLED)
            .build();

    // Create a Transfer Service client
    StorageTransferServiceClient storageTransfer = StorageTransferServiceClient.create();

    // Create the transfer job
    TransferJob response =
        storageTransfer.createTransferJob(
            CreateTransferJobRequest.newBuilder().setTransferJob(transferJob).build());

    System.out.println("Created transfer job from standard bucket to Nearline bucket:");
    System.out.println(response.toString());
  }
}

Node.js


// Imports the Google Cloud client library
const {
  StorageTransferServiceClient,
} = require('@google-cloud/storage-transfer');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// The ID of the Google Cloud Platform Project that owns the job
// projectId = 'my-project-id'

// A useful description for your transfer job
// description = 'My transfer job'

// Google Cloud Storage source bucket name
// gcsSourceBucket = 'my-gcs-source-bucket'

// Google Cloud Storage destination bucket name
// gcsSinkBucket = 'my-gcs-destination-bucket'

// Date to start daily migration
// startDate = new Date()

// Creates a client
const client = new StorageTransferServiceClient();

/**
 * Create a daily migration from a GCS bucket to another GCS bucket for
 * objects untouched for 30+ days.
 */
async function createDailyNearline30DayMigration() {
  // Runs the request and creates the job
  const [transferJob] = await client.createTransferJob({
    transferJob: {
      projectId,
      description,
      status: 'ENABLED',
      schedule: {
        scheduleStartDate: {
          day: startDate.getDate(),
          month: startDate.getMonth() + 1,
          year: startDate.getFullYear(),
        },
      },
      transferSpec: {
        gcsDataSource: {
          bucketName: gcsSourceBucket,
        },
        gcsDataSink: {
          bucketName: gcsSinkBucket,
        },
        objectConditions: {
          minTimeElapsedSinceLastModification: {
            seconds: 2592000, // 30 days
          },
        },
        transferOptions: {
          deleteObjectsFromSourceAfterTransfer: true,
        },
      },
    },
  });

  console.log(`Created transferJob: ${transferJob.name}`);
}

createDailyNearline30DayMigration();

Python

Looking for older samples? See the Storage Transfer Service migration guide.

from datetime import datetime

from google.cloud import storage_transfer
from google.protobuf.duration_pb2 import Duration

def create_daily_nearline_30_day_migration(
    project_id: str,
    description: str,
    source_bucket: str,
    sink_bucket: str,
    start_date: datetime,
):
    """Create a daily migration from a GCS bucket to a Nearline GCS bucket
    for objects untouched for 30 days."""

    client = storage_transfer.StorageTransferServiceClient()

    # The ID of the Google Cloud Platform Project that owns the job
    # project_id = 'my-project-id'

    # A useful description for your transfer job
    # description = 'My transfer job'

    # Google Cloud Storage source bucket name
    # source_bucket = 'my-gcs-source-bucket'

    # Google Cloud Storage destination bucket name
    # sink_bucket = 'my-gcs-destination-bucket'

    transfer_job_request = storage_transfer.CreateTransferJobRequest(
        {
            "transfer_job": {
                "project_id": project_id,
                "description": description,
                "status": storage_transfer.TransferJob.Status.ENABLED,
                "schedule": {
                    "schedule_start_date": {
                        "day": start_date.day,
                        "month": start_date.month,
                        "year": start_date.year,
                    }
                },
                "transfer_spec": {
                    "gcs_data_source": {
                        "bucket_name": source_bucket,
                    },
                    "gcs_data_sink": {
                        "bucket_name": sink_bucket,
                    },
                    "object_conditions": {
                        "min_time_elapsed_since_last_modification": Duration(
                            seconds=2592000  # 30 days
                        )
                    },
                    "transfer_options": {
                        "delete_objects_from_source_after_transfer": True
                    },
                },
            }
        }
    )

    result = client.create_transfer_job(transfer_job_request)
    print(f"Created transferJob: {result.name}")

Transfer from Amazon S3 to Cloud Storage

This sample shows how to move files from Amazon S3 to a Cloud Storage bucket. Be sure to review Configure access to Amazon S3 and Pricing to understand the implications of moving data from Amazon S3 to Cloud Storage.

When creating transfer jobs, don't include the s3:// prefix for bucketName in Amazon S3 bucket source names.

Go

import (
	"context"
	"fmt"
	"io"
	"os"
	"time"

	storagetransfer "cloud.google.com/go/storagetransfer/apiv1"
	"cloud.google.com/go/storagetransfer/apiv1/storagetransferpb"
	"google.golang.org/genproto/googleapis/type/date"
	"google.golang.org/genproto/googleapis/type/timeofday"
)

func transferFromAws(w io.Writer, projectID string, awsSourceBucket string, gcsSinkBucket string) (*storagetransferpb.TransferJob, error) {
	// Your Google Cloud Project ID
	// projectID := "my-project-id"

	// The name of the Aws bucket to transfer objects from
	// awsSourceBucket := "my-source-bucket"

	// The name of the GCS bucket to transfer objects to
	// gcsSinkBucket := "my-sink-bucket"

	ctx := context.Background()
	client, err := storagetransfer.NewClient(ctx)
	if err != nil {
		return nil, fmt.Errorf("storagetransfer.NewClient: %w", err)
	}
	defer client.Close()

	// A description of this job
	jobDescription := "Transfers objects from an AWS bucket to a GCS bucket"

	// The time to start the transfer
	startTime := time.Now().UTC()

	// The AWS access key credential, should be accessed via environment variable for security
	awsAccessKeyID := os.Getenv("AWS_ACCESS_KEY_ID")

	// The AWS secret key credential, should be accessed via environment variable for security
	awsSecretKey := os.Getenv("AWS_SECRET_ACCESS_KEY")

	req := &storagetransferpb.CreateTransferJobRequest{
		TransferJob: &storagetransferpb.TransferJob{
			ProjectId:   projectID,
			Description: jobDescription,
			TransferSpec: &storagetransferpb.TransferSpec{
				DataSource: &storagetransferpb.TransferSpec_AwsS3DataSource{
					AwsS3DataSource: &storagetransferpb.AwsS3Data{
						BucketName: awsSourceBucket,
						AwsAccessKey: &storagetransferpb.AwsAccessKey{
							AccessKeyId:     awsAccessKeyID,
							SecretAccessKey: awsSecretKey,
						}},
				},
				DataSink: &storagetransferpb.TransferSpec_GcsDataSink{
					GcsDataSink: &storagetransferpb.GcsData{BucketName: gcsSinkBucket}},
			},
			Schedule: &storagetransferpb.Schedule{
				ScheduleStartDate: &date.Date{
					Year:  int32(startTime.Year()),
					Month: int32(startTime.Month()),
					Day:   int32(startTime.Day()),
				},
				ScheduleEndDate: &date.Date{
					Year:  int32(startTime.Year()),
					Month: int32(startTime.Month()),
					Day:   int32(startTime.Day()),
				},
				StartTimeOfDay: &timeofday.TimeOfDay{
					Hours:   int32(startTime.Hour()),
					Minutes: int32(startTime.Minute()),
					Seconds: int32(startTime.Second()),
				},
			},
			Status: storagetransferpb.TransferJob_ENABLED,
		},
	}
	resp, err := client.CreateTransferJob(ctx, req)
	if err != nil {
		return nil, fmt.Errorf("failed to create transfer job: %w", err)
	}
	if _, err = client.RunTransferJob(ctx, &storagetransferpb.RunTransferJobRequest{
		ProjectId: projectID,
		JobName:   resp.Name,
	}); err != nil {
		return nil, fmt.Errorf("failed to run transfer job: %w", err)
	}
	fmt.Fprintf(w, "Created and ran transfer job from %v to %v with name %v", awsSourceBucket, gcsSinkBucket, resp.Name)
	return resp, nil
}

Java

Looking for older samples? See the Storage Transfer Service migration guide.


import com.google.storagetransfer.v1.proto.StorageTransferServiceClient;
import com.google.storagetransfer.v1.proto.TransferProto.CreateTransferJobRequest;
import com.google.storagetransfer.v1.proto.TransferTypes.AwsAccessKey;
import com.google.storagetransfer.v1.proto.TransferTypes.AwsS3Data;
import com.google.storagetransfer.v1.proto.TransferTypes.GcsData;
import com.google.storagetransfer.v1.proto.TransferTypes.Schedule;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferJob;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferJob.Status;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferSpec;
import com.google.type.Date;
import com.google.type.TimeOfDay;
import java.io.IOException;
import java.util.Calendar;

public class TransferFromAws {

  // Creates a one-off transfer job from Amazon S3 to Google Cloud Storage.
  public static void transferFromAws(
      String projectId,
      String jobDescription,
      String awsSourceBucket,
      String gcsSinkBucket,
      long startDateTime)
      throws IOException {

    // Your Google Cloud Project ID
    // String projectId = "your-project-id";

    // A short description of this job
    // String jobDescription = "Sample transfer job from S3 to GCS.";

    // The name of the source AWS bucket to transfer data from
    // String awsSourceBucket = "yourAwsSourceBucket";

    // The name of the GCS bucket to transfer data to
    // String gcsSinkBucket = "your-gcs-bucket";

    // What day and time in UTC to start the transfer, expressed as an epoch date timestamp.
    // If this is in the past relative to when the job is created, it will run the next day.
    // long startDateTime =
    //     new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").parse("2000-01-01 00:00:00").getTime();

    // The ID used to access your AWS account. Should be accessed via environment variable.
    String awsAccessKeyId = System.getenv("AWS_ACCESS_KEY_ID");

    // The Secret Key used to access your AWS account. Should be accessed via environment variable.
    String awsSecretAccessKey = System.getenv("AWS_SECRET_ACCESS_KEY");

    // Set up source and sink
    TransferSpec transferSpec =
        TransferSpec.newBuilder()
            .setAwsS3DataSource(
                AwsS3Data.newBuilder()
                    .setBucketName(awsSourceBucket)
                    .setAwsAccessKey(
                        AwsAccessKey.newBuilder()
                            .setAccessKeyId(awsAccessKeyId)
                            .setSecretAccessKey(awsSecretAccessKey)))
            .setGcsDataSink(GcsData.newBuilder().setBucketName(gcsSinkBucket))
            .build();

    // Parse epoch timestamp into the model classes
    Calendar startCalendar = Calendar.getInstance();
    startCalendar.setTimeInMillis(startDateTime);
    // Note that this is a Date from the model class package, not a java.util.Date
    Date startDate =
        Date.newBuilder()
            .setYear(startCalendar.get(Calendar.YEAR))
            .setMonth(startCalendar.get(Calendar.MONTH) + 1)
            .setDay(startCalendar.get(Calendar.DAY_OF_MONTH))
            .build();
    TimeOfDay startTime =
        TimeOfDay.newBuilder()
            .setHours(startCalendar.get(Calendar.HOUR_OF_DAY))
            .setMinutes(startCalendar.get(Calendar.MINUTE))
            .setSeconds(startCalendar.get(Calendar.SECOND))
            .build();
    Schedule schedule =
        Schedule.newBuilder()
            .setScheduleStartDate(startDate)
            .setScheduleEndDate(startDate)
            .setStartTimeOfDay(startTime)
            .build();

    // Set up the transfer job
    TransferJob transferJob =
        TransferJob.newBuilder()
            .setDescription(jobDescription)
            .setProjectId(projectId)
            .setTransferSpec(transferSpec)
            .setSchedule(schedule)
            .setStatus(Status.ENABLED)
            .build();

    // Create a Transfer Service client
    StorageTransferServiceClient storageTransfer = StorageTransferServiceClient.create();

    // Create the transfer job
    TransferJob response =
        storageTransfer.createTransferJob(
            CreateTransferJobRequest.newBuilder().setTransferJob(transferJob).build());

    System.out.println("Created transfer job from AWS to GCS:");
    System.out.println(response.toString());
  }
}

Node.js


// Imports the Google Cloud client library
const {
  StorageTransferServiceClient,
} = require('@google-cloud/storage-transfer');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// The ID of the Google Cloud Platform Project that owns the job
// projectId = 'my-project-id'

// A useful description for your transfer job
// description = 'My transfer job'

// AWS S3 source bucket name
// awsSourceBucket = 'my-s3-source-bucket'

// AWS Access Key ID
// awsAccessKeyId = 'AKIA...'

// AWS Secret Access Key
// awsSecretAccessKey = 'HEAoMK2.../...ku8'

// Google Cloud Storage destination bucket name
// gcsSinkBucket = 'my-gcs-destination-bucket'

// Creates a client
const client = new StorageTransferServiceClient();

/**
 * Creates a one-time transfer job from Amazon S3 to Google Cloud Storage.
 */
async function transferFromS3() {
  // Setting the start date and the end date as the same time creates a
  // one-time transfer
  const now = new Date();
  const oneTimeSchedule = {
    day: now.getDate(),
    month: now.getMonth() + 1,
    year: now.getFullYear(),
  };

  // Runs the request and creates the job
  const [transferJob] = await client.createTransferJob({
    transferJob: {
      projectId,
      description,
      status: 'ENABLED',
      schedule: {
        scheduleStartDate: oneTimeSchedule,
        scheduleEndDate: oneTimeSchedule,
      },
      transferSpec: {
        awsS3DataSource: {
          bucketName: awsSourceBucket,
          awsAccessKey: {
            accessKeyId: awsAccessKeyId,
            secretAccessKey: awsSecretAccessKey,
          },
        },
        gcsDataSink: {
          bucketName: gcsSinkBucket,
        },
      },
    },
  });

  console.log(
    `Created and ran a transfer job from '${awsSourceBucket}' to '${gcsSinkBucket}' with name ${transferJob.name}`
  );
}

transferFromS3();

Python

Looking for older samples? See the Storage Transfer Service migration guide.

from datetime import datetime

from google.cloud import storage_transfer

def create_one_time_aws_transfer(
    project_id: str,
    description: str,
    source_bucket: str,
    aws_access_key_id: str,
    aws_secret_access_key: str,
    sink_bucket: str,
):
    """Creates a one-time transfer job from Amazon S3 to Google Cloud
    Storage."""

    client = storage_transfer.StorageTransferServiceClient()

    # The ID of the Google Cloud Platform Project that owns the job
    # project_id = 'my-project-id'

    # A useful description for your transfer job
    # description = 'My transfer job'

    # AWS S3 source bucket name
    # source_bucket = 'my-s3-source-bucket'

    # AWS Access Key ID
    # aws_access_key_id = 'AKIA...'

    # AWS Secret Access Key
    # aws_secret_access_key = 'HEAoMK2.../...ku8'

    # Google Cloud Storage destination bucket name
    # sink_bucket = 'my-gcs-destination-bucket'

    now = datetime.utcnow()
    # Setting the start date and the end date as
    # the same time creates a one-time transfer
    one_time_schedule = {"day": now.day, "month": now.month, "year": now.year}

    transfer_job_request = storage_transfer.CreateTransferJobRequest(
        {
            "transfer_job": {
                "project_id": project_id,
                "description": description,
                "status": storage_transfer.TransferJob.Status.ENABLED,
                "schedule": {
                    "schedule_start_date": one_time_schedule,
                    "schedule_end_date": one_time_schedule,
                },
                "transfer_spec": {
                    "aws_s3_data_source": {
                        "bucket_name": source_bucket,
                        "aws_access_key": {
                            "access_key_id": aws_access_key_id,
                            "secret_access_key": aws_secret_access_key,
                        },
                    },
                    "gcs_data_sink": {
                        "bucket_name": sink_bucket,
                    },
                },
            }
        }
    )

    result = client.create_transfer_job(transfer_job_request)
    print(f"Created transferJob: {result.name}")

Transfer between Microsoft Azure Blob Storage and Cloud Storage

This sample shows how to move files from Microsoft Azure Storage to a Cloud Storage bucket, using a Microsoft Azure Storage shared access signature (SAS) token.

For more information on Microsoft Azure Storage SAS, see Grant limited access to Azure Storage resources using shared access signatures (SAS).

Before starting, review Configure access to Microsoft Azure Storage and Pricing to understand the implications of moving data from Microsoft Azure Storage to Cloud Storage.

Go

To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Go API reference documentation.

To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"
	"os"

	storagetransfer "cloud.google.com/go/storagetransfer/apiv1"
	"cloud.google.com/go/storagetransfer/apiv1/storagetransferpb"
)

func transferFromAzure(w io.Writer, projectID string, azureStorageAccountName string, azureSourceContainer string, gcsSinkBucket string) (*storagetransferpb.TransferJob, error) {
	// Your Google Cloud Project ID.
	// projectID := "my-project-id"

	// The name of your Azure Storage account.
	// azureStorageAccountName := "my-azure-storage-acc"

	// The name of the Azure container to transfer objects from.
	// azureSourceContainer := "my-source-container"

	// The name of the GCS bucket to transfer objects to.
	// gcsSinkBucket := "my-sink-bucket"

	ctx := context.Background()
	client, err := storagetransfer.NewClient(ctx)
	if err != nil {
		return nil, fmt.Errorf("storagetransfer.NewClient: %w", err)
	}
	defer client.Close()

	// The Azure SAS token, should be accessed via environment variable for security
	azureSasToken := os.Getenv("AZURE_SAS_TOKEN")

	req := &storagetransferpb.CreateTransferJobRequest{
		TransferJob: &storagetransferpb.TransferJob{
			ProjectId: projectID,
			TransferSpec: &storagetransferpb.TransferSpec{
				DataSource: &storagetransferpb.TransferSpec_AzureBlobStorageDataSource{
					AzureBlobStorageDataSource: &storagetransferpb.AzureBlobStorageData{
						StorageAccount: azureStorageAccountName,
						AzureCredentials: &storagetransferpb.AzureCredentials{
							SasToken: azureSasToken,
						},
						Container: azureSourceContainer,
					},
				},
				DataSink: &storagetransferpb.TransferSpec_GcsDataSink{
					GcsDataSink: &storagetransferpb.GcsData{BucketName: gcsSinkBucket}},
			},
			Status: storagetransferpb.TransferJob_ENABLED,
		},
	}
	resp, err := client.CreateTransferJob(ctx, req)
	if err != nil {
		return nil, fmt.Errorf("failed to create transfer job: %w", err)
	}
	if _, err = client.RunTransferJob(ctx, &storagetransferpb.RunTransferJobRequest{
		ProjectId: projectID,
		JobName:   resp.Name,
	}); err != nil {
		return nil, fmt.Errorf("failed to run transfer job: %w", err)
	}
	fmt.Fprintf(w, "Created and ran transfer job from %v to %v with name %v", azureSourceContainer, gcsSinkBucket, resp.Name)
	return resp, nil
}

Java

To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Java API reference documentation.

To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.storagetransfer.v1.proto.StorageTransferServiceClient;
import com.google.storagetransfer.v1.proto.TransferProto;
import com.google.storagetransfer.v1.proto.TransferProto.RunTransferJobRequest;
import com.google.storagetransfer.v1.proto.TransferTypes.AzureBlobStorageData;
import com.google.storagetransfer.v1.proto.TransferTypes.AzureCredentials;
import com.google.storagetransfer.v1.proto.TransferTypes.GcsData;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferJob;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferJob.Status;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferSpec;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class TransferFromAzure {
  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    // Your Google Cloud Project ID
    String projectId = "my-project-id";

    // Your Azure Storage Account name
    String azureStorageAccount = "my-azure-account";

    // The Azure source container to transfer data from
    String azureSourceContainer = "my-source-container";

    // The GCS bucket to transfer data to
    String gcsSinkBucket = "my-sink-bucket";

    transferFromAzureBlobStorage(
        projectId, azureStorageAccount, azureSourceContainer, gcsSinkBucket);
  }

  /**
   * Creates and runs a transfer job to transfer all data from an Azure container to a GCS bucket.
   */
  public static void transferFromAzureBlobStorage(
      String projectId,
      String azureStorageAccount,
      String azureSourceContainer,
      String gcsSinkBucket)
      throws IOException, ExecutionException, InterruptedException {

    // Your Azure SAS token, should be accessed via environment variable
    String azureSasToken = System.getenv("AZURE_SAS_TOKEN");

    TransferSpec transferSpec =
        TransferSpec.newBuilder()
            .setAzureBlobStorageDataSource(
                AzureBlobStorageData.newBuilder()
                    .setAzureCredentials(
                        AzureCredentials.newBuilder().setSasToken(azureSasToken).build())
                    .setContainer(azureSourceContainer)
                    .setStorageAccount(azureStorageAccount))
            .setGcsDataSink(GcsData.newBuilder().setBucketName(gcsSinkBucket).build())
            .build();

    TransferJob transferJob =
        TransferJob.newBuilder()
            .setProjectId(projectId)
            .setStatus(Status.ENABLED)
            .setTransferSpec(transferSpec)
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources,
    // or use "try-with-close" statement to do this automatically.
    try (StorageTransferServiceClient storageTransfer = StorageTransferServiceClient.create()) {
      // Create the transfer job
      TransferJob response =
          storageTransfer.createTransferJob(
              TransferProto.CreateTransferJobRequest.newBuilder()
                  .setTransferJob(transferJob)
                  .build());

      // Run the created job
      storageTransfer
          .runTransferJobAsync(
              RunTransferJobRequest.newBuilder()
                  .setProjectId(projectId)
                  .setJobName(response.getName())
                  .build())
          .get();

      System.out.println(
          "Created and ran a transfer job from "
              + azureSourceContainer
              + " to "
              + gcsSinkBucket
              + " with "
              + "name "
              + response.getName());
    }
  }
}

Node.js

To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Node.js API reference documentation.

To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


// Imports the Google Cloud client library
const {
  StorageTransferServiceClient,
} = require('@google-cloud/storage-transfer');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// The ID of the Google Cloud Platform Project that owns the job
// projectId = 'my-project-id'

// A useful description for your transfer job
// description = 'My transfer job'

// Azure Storage Account name
// azureStorageAccount = 'accountname'

// Azure source container name
// azureSourceContainer = 'my-azure-source-bucket'

// Azure Shared Access Signature token
// azureSASToken = '?sv=...'

// Google Cloud Storage destination bucket name
// gcsSinkBucket = 'my-gcs-destination-bucket'

// Creates a client
const client = new StorageTransferServiceClient();

/**
 * Creates a one-time transfer job from Azure Blob Storage to Google Cloud Storage.
 */
async function transferFromBlobStorage() {
  // Setting the start date and the end date as the same time creates a
  // one-time transfer
  const now = new Date();
  const oneTimeSchedule = {
    day: now.getDate(),
    month: now.getMonth() + 1,
    year: now.getFullYear(),
  };

  // Runs the request and creates the job
  const [transferJob] = await client.createTransferJob({
    transferJob: {
      projectId,
      description,
      status: 'ENABLED',
      schedule: {
        scheduleStartDate: oneTimeSchedule,
        scheduleEndDate: oneTimeSchedule,
      },
      transferSpec: {
        azureBlobStorageDataSource: {
          azureCredentials: {
            sasToken: azureSASToken,
          },
          container: azureSourceContainer,
          storageAccount: azureStorageAccount,
        },
        gcsDataSink: {
          bucketName: gcsSinkBucket,
        },
      },
    },
  });

  console.log(
    `Created and ran a transfer job from '${azureSourceContainer}' to '${gcsSinkBucket}' with name ${transferJob.name}`
  );
}

transferFromBlobStorage();

Python

To learn how to install and use the client library for Storage Transfer Service, see Storage Transfer Service client libraries. For more information, see the Storage Transfer Service Python API reference documentation.

To authenticate to Storage Transfer Service, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from datetime import datetime

from google.cloud import storage_transfer

def create_one_time_azure_transfer(
    project_id: str,
    description: str,
    azure_storage_account: str,
    azure_sas_token: str,
    source_container: str,
    sink_bucket: str,
):
    """Creates a one-time transfer job from Azure Blob Storage to Google Cloud
    Storage."""

    # Initialize client that will be used to create storage transfer requests.
    # This client only needs to be created once, and can be reused for
    # multiple requests.
    client = storage_transfer.StorageTransferServiceClient()

    # The ID of the Google Cloud Platform Project that owns the job
    # project_id = 'my-project-id'

    # A useful description for your transfer job
    # description = 'My transfer job'

    # Azure Storage Account name
    # azure_storage_account = 'accountname'

    # Azure Shared Access Signature token
    # azure_sas_token = '?sv=...'

    # Azure Blob source container name
    # source_container = 'my-azure-source-bucket'

    # Google Cloud Storage destination bucket name
    # sink_bucket = 'my-gcs-destination-bucket'

    now = datetime.utcnow()
    # Setting the start date and the end date as
    # the same time creates a one-time transfer
    one_time_schedule = {"day": now.day, "month": now.month, "year": now.year}

    transfer_job_request = storage_transfer.CreateTransferJobRequest(
        {
            "transfer_job": {
                "project_id": project_id,
                "description": description,
                "status": storage_transfer.TransferJob.Status.ENABLED,
                "schedule": {
                    "schedule_start_date": one_time_schedule,
                    "schedule_end_date": one_time_schedule,
                },
                "transfer_spec": {
                    "azure_blob_storage_data_source": {
                        "storage_account": azure_storage_account,
                        "azure_credentials": {"sas_token": azure_sas_token},
                        "container": source_container,
                    },
                    "gcs_data_sink": {
                        "bucket_name": sink_bucket,
                    },
                },
            }
        }
    )

    result = client.create_transfer_job(transfer_job_request)
    print(f"Created transferJob: {result.name}")

Transfer from a file system

This sample shows how to move files from a POSIX file system to a Cloud Storage bucket.

Go


import (
	"context"
	"fmt"
	"io"

	storagetransfer "cloud.google.com/go/storagetransfer/apiv1"
	"cloud.google.com/go/storagetransfer/apiv1/storagetransferpb"
)

func transferFromPosix(w io.Writer, projectID string, sourceAgentPoolName string, rootDirectory string, gcsSinkBucket string) (*storagetransferpb.TransferJob, error) {
	// Your project id
	// projectId := "myproject-id"

	// The agent pool associated with the POSIX data source. If not provided, defaults to the default agent
	// sourceAgentPoolName := "projects/my-project/agentPools/transfer_service_default"

	// The root directory path on the source filesystem
	// rootDirectory := "/directory/to/transfer/source"

	// The ID of the GCS bucket to transfer data to
	// gcsSinkBucket := "my-sink-bucket"

	ctx := context.Background()
	client, err := storagetransfer.NewClient(ctx)
	if err != nil {
		return nil, fmt.Errorf("storagetransfer.NewClient: %w", err)
	}
	defer client.Close()

	req := &storagetransferpb.CreateTransferJobRequest{
		TransferJob: &storagetransferpb.TransferJob{
			ProjectId: projectID,
			TransferSpec: &storagetransferpb.TransferSpec{
				SourceAgentPoolName: sourceAgentPoolName,
				DataSource: &storagetransferpb.TransferSpec_PosixDataSource{
					PosixDataSource: &storagetransferpb.PosixFilesystem{RootDirectory: rootDirectory},
				},
				DataSink: &storagetransferpb.TransferSpec_GcsDataSink{
					GcsDataSink: &storagetransferpb.GcsData{BucketName: gcsSinkBucket},
				},
			},
			Status: storagetransferpb.TransferJob_ENABLED,
		},
	}

	resp, err := client.CreateTransferJob(ctx, req)
	if err != nil {
		return nil, fmt.Errorf("failed to create transfer job: %w", err)
	}
	if _, err = client.RunTransferJob(ctx, &storagetransferpb.RunTransferJobRequest{
		ProjectId: projectID,
		JobName:   resp.Name,
	}); err != nil {
		return nil, fmt.Errorf("failed to run transfer job: %w", err)
	}
	fmt.Fprintf(w, "Created and ran transfer job from %v to %v with name %v", rootDirectory, gcsSinkBucket, resp.Name)
	return resp, nil
}

Java

import com.google.storagetransfer.v1.proto.StorageTransferServiceClient;
import com.google.storagetransfer.v1.proto.TransferProto;
import com.google.storagetransfer.v1.proto.TransferTypes.GcsData;
import com.google.storagetransfer.v1.proto.TransferTypes.PosixFilesystem;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferJob;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferSpec;
import java.io.IOException;

public class TransferFromPosix {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.

    // Your project id
    String projectId = "my-project-id";

    // The agent pool associated with the POSIX data source. If not provided, defaults to the
    // default agent
    String sourceAgentPoolName = "projects/my-project-id/agentPools/transfer_service_default";

    // The root directory path on the source filesystem
    String rootDirectory = "/directory/to/transfer/source";

    // The ID of the GCS bucket to transfer data to
    String gcsSinkBucket = "my-sink-bucket";

    transferFromPosix(projectId, sourceAgentPoolName, rootDirectory, gcsSinkBucket);
  }

  public static void transferFromPosix(
      String projectId, String sourceAgentPoolName, String rootDirectory, String gcsSinkBucket)
      throws IOException {
    TransferJob transferJob =
        TransferJob.newBuilder()
            .setProjectId(projectId)
            .setTransferSpec(
                TransferSpec.newBuilder()
                    .setSourceAgentPoolName(sourceAgentPoolName)
                    .setPosixDataSource(
                        PosixFilesystem.newBuilder().setRootDirectory(rootDirectory).build())
                    .setGcsDataSink(GcsData.newBuilder().setBucketName(gcsSinkBucket).build()))
            .setStatus(TransferJob.Status.ENABLED)
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources,
    // or use "try-with-close" statement to do this automatically.
    try (StorageTransferServiceClient storageTransfer = StorageTransferServiceClient.create()) {

      // Create the transfer job
      TransferJob response =
          storageTransfer.createTransferJob(
              TransferProto.CreateTransferJobRequest.newBuilder()
                  .setTransferJob(transferJob)
                  .build());

      System.out.println(
          "Created a transfer job from "
              + rootDirectory
              + " to "
              + gcsSinkBucket
              + " with "
              + "name "
              + response.getName());
    }
  }
}

Node.js


// Imports the Google Cloud client library
const {
  StorageTransferServiceClient,
} = require('@google-cloud/storage-transfer');

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// Your project id
// const projectId = 'my-project'

// The agent pool associated with the POSIX data source. Defaults to the default agent
// const sourceAgentPoolName = 'projects/my-project/agentPools/transfer_service_default'

// The root directory path on the source filesystem
// const rootDirectory = '/directory/to/transfer/source'

// The ID of the GCS bucket to transfer data to
// const gcsSinkBucket = 'my-sink-bucket'

// Creates a client
const client = new StorageTransferServiceClient();

/**
 * Creates a request to transfer from the local file system to the sink bucket
 */
async function transferDirectory() {
  const createRequest = {
    transferJob: {
      projectId,
      transferSpec: {
        sourceAgentPoolName,
        posixDataSource: {
          rootDirectory,
        },
        gcsDataSink: {bucketName: gcsSinkBucket},
      },
      status: 'ENABLED',
    },
  };

  // Runs the request and creates the job
  const [transferJob] = await client.createTransferJob(createRequest);

  const runRequest = {
    jobName: transferJob.name,
    projectId: projectId,
  };

  await client.runTransferJob(runRequest);

  console.log(
    `Created and ran a transfer job from '${rootDirectory}' to '${gcsSinkBucket}' with name ${transferJob.name}`
  );
}

transferDirectory();

Python

from google.cloud import storage_transfer

def transfer_from_posix_to_gcs(
    project_id: str,
    description: str,
    source_agent_pool_name: str,
    root_directory: str,
    sink_bucket: str,
):
    """Create a transfer from a POSIX file system to a GCS bucket."""

    client = storage_transfer.StorageTransferServiceClient()

    # The ID of the Google Cloud Platform Project that owns the job
    # project_id = 'my-project-id'

    # A useful description for your transfer job
    # description = 'My transfer job'

    # The agent pool associated with the POSIX data source.
    # Defaults to 'projects/{project_id}/agentPools/transfer_service_default'
    # source_agent_pool_name = 'projects/my-project/agentPools/my-agent'

    # The root directory path on the source filesystem
    # root_directory = '/directory/to/transfer/source'

    # Google Cloud Storage sink bucket name
    # sink_bucket = 'my-gcs-sink-bucket'

    transfer_job_request = storage_transfer.CreateTransferJobRequest(
        {
            "transfer_job": {
                "project_id": project_id,
                "description": description,
                "status": storage_transfer.TransferJob.Status.ENABLED,
                "transfer_spec": {
                    "source_agent_pool_name": source_agent_pool_name,
                    "posix_data_source": {
                        "root_directory": root_directory,
                    },
                    "gcs_data_sink": {"bucket_name": sink_bucket},
                },
            }
        }
    )

    result = client.create_transfer_job(transfer_job_request)
    print(f"Created transferJob: {result.name}")