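Reading and writing to Cloud Storage

This document describes how to store and retrieve data using the Cloud Storage client library. It assumes that you have completed the tasks described in Setting up for Cloud Storage to activate a Cloud Storage bucket and download the client libraries. It also assumes that you know how to build an App Engine application.

For additional code samples, see Cloud Storage client libraries.

Required imports

Use the following snippet to import the client library and access Cloud Storage. The snippet also creates a new bucket: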

# Imports the Google Cloud client library
from google.cloud import storage

# Instantiates a client
storage_client = storage.Client()

# The name for the new bucket
bucket_name = "my-new-bucket"

# Creates the new bucket
bucket = storage_client.create_bucket(bucket_name)

print(f"Bucket {bucket.name} created.")

Specifying the Cloud Storage bucket

Before you perform any operation in Cloud Storage, you need to supply the bucket name.

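The easiest way to specify a bucket name is to use your project's default bucket. The call to get_default_gcs_bucket_name succeeds only if you have created the default bucket for your project.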
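When get_default_gcs_bucket_name is not available, the default bucket name can often be derived from the project ID instead. The following is a minimal sketch, not an official API: the helper default_bucket_name is hypothetical, and it assumes the default bucket follows the conventional PROJECT_ID.appspot.com naming and that the project ID is available in the GOOGLE_CLOUD_PROJECT environment variable, which the App Engine runtime sets.

```python
import os


def default_bucket_name(project_id=None):
    """Return the conventional App Engine default bucket name for a project.

    Assumption: the default bucket is named PROJECT_ID.appspot.com.
    """
    # GOOGLE_CLOUD_PROJECT is set by the App Engine runtime. When running
    # locally, export it yourself or pass the project ID explicitly.
    project_id = project_id or os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not project_id:
        raise ValueError("No project ID given and GOOGLE_CLOUD_PROJECT is not set")
    return f"{project_id}.appspot.com"
```

The returned name can be passed as bucket_name to the samples in this document. The bucket must already exist; this helper only builds the name and does not create anything.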

Writing to Cloud Storage

The following sample shows how to write to the bucket:

from google.cloud import storage


def write_read(bucket_name, blob_name):
    """Write and read a blob from GCS using file-like IO"""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"

    # The ID of your new GCS object
    # blob_name = "storage-object-name"

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(blob_name)

    # Mode can be specified as wb/rb for bytes mode.
    # See: https://docs.python.org/3/library/io.html
    with blob.open("w") as f:
        f.write("Hello world")

    with blob.open("r") as f:
        print(f.read())

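Note the following:

  • The call that opens the file for writing passes no extra arguments, so the object is written without custom metadata. To attach custom metadata, set blob.metadata before you open the blob for writing; you can read it back from blob.metadata after calling blob.reload().

  • No predefined ACL is specified, so the bucket's default object ACL is applied to the object when it is written to the bucket.

  • The file must be closed when you finish writing; otherwise it is not written to Cloud Storage. The with statement in the sample closes it automatically. After the file is closed, you cannot append to it. If you need to modify the file, you must open it again in write mode, which overwrites the file rather than appending to it.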
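The points above can be sketched as a small helper that attaches custom metadata and relies on the with statement to close the file. This is an illustrative sketch: the function name write_with_metadata and the text/plain content type are assumptions, and the bucket argument is expected to behave like a google.cloud.storage Bucket object.

```python
def write_with_metadata(bucket, blob_name, text, metadata):
    """Write text to an object and attach custom metadata to it.

    `bucket` is expected to behave like google.cloud.storage.Bucket,
    for example storage.Client().bucket("your-bucket-name").
    """
    blob = bucket.blob(blob_name)

    # Custom metadata set before the upload is sent along with the object.
    blob.metadata = dict(metadata)

    # Leaving the with block closes the writer, which finalizes the upload.
    # content_type is forwarded to the upload (assumed here: plain text).
    with blob.open("w", content_type="text/plain") as f:
        f.write(text)

    return blob
```

Calling the helper a second time with the same blob_name replaces the object's contents; it does not append to them.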

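Reading from Cloud Storage

The following sample shows how to read a full file from the bucket. It is the same write_read function shown in the writing section; the second with block performs the read: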

from google.cloud import storage


def write_read(bucket_name, blob_name):
    """Write and read a blob from GCS using file-like IO"""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"

    # The ID of your new GCS object
    # blob_name = "storage-object-name"

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(blob_name)

    # Mode can be specified as wb/rb for bytes mode.
    # See: https://docs.python.org/3/library/io.html
    with blob.open("w") as f:
        f.write("Hello world")

    with blob.open("r") as f:
        print(f.read())

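In both examples, the blob_name argument is the path of the file within the bucket; the bucket name is passed separately as bucket_name rather than as part of a YOUR_BUCKET_NAME/PATH_IN_GCS path. Note that blob.open() defaults to read mode ("r"), so you do not need to specify a mode when you open a file for reading.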
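If your code still carries paths in the combined YOUR_BUCKET_NAME/PATH_IN_GCS form, they can be split before the client library is called. The sketch below is illustrative only: split_gcs_path and read_text are hypothetical helpers, and the client argument is expected to behave like storage.Client().

```python
def split_gcs_path(path):
    """Split 'BUCKET_NAME/PATH_IN_GCS' into (bucket_name, blob_name)."""
    # A leading slash is tolerated and dropped before splitting.
    bucket_name, sep, blob_name = path.lstrip("/").partition("/")
    if not sep or not blob_name:
        raise ValueError(f"Expected BUCKET_NAME/PATH_IN_GCS, got {path!r}")
    return bucket_name, blob_name


def read_text(client, path):
    """Read a whole object as text.

    `client` is expected to behave like google.cloud.storage.Client.
    """
    bucket_name, blob_name = split_gcs_path(path)
    blob = client.bucket(bucket_name).blob(blob_name)
    # No mode argument: blob.open() defaults to "r" (text read mode).
    with blob.open() as f:
        return f.read()
```

For binary data, open the blob with "rb" instead; read() then returns bytes rather than str.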

Listing bucket contents

The following sample shows how to page through the blobs in a bucket:

from google.cloud import storage


def list_blobs(bucket_name):
    """Lists all the blobs in the bucket."""
    # bucket_name = "your-bucket-name"

    storage_client = storage.Client()

    # Note: Client.list_blobs requires at least package version 1.17.0.
    blobs = storage_client.list_blobs(bucket_name)

    # Note: The call returns a response only when the iterator is consumed.
    for blob in blobs:
        print(blob.name)

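Note that each complete file name is returned as a single flat string; any slashes are part of the name, not real directory levels. To list files in a way that makes the directory hierarchy recognizable, set the delimiter parameter to the directory delimiter that you want to use.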
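A directory-style listing can be sketched by combining the prefix and delimiter parameters of list_blobs. The helper name list_directory is hypothetical, and the client argument is expected to behave like storage.Client().

```python
def list_directory(client, bucket_name, prefix=None, delimiter="/"):
    """Return (object_names, subdirectory_prefixes) found directly under prefix.

    `client` is expected to behave like google.cloud.storage.Client.
    """
    blobs = client.list_blobs(bucket_name, prefix=prefix, delimiter=delimiter)

    # The iterator must be fully consumed before `prefixes` is populated.
    names = [blob.name for blob in blobs]

    # `prefixes` holds the "subdirectories": name prefixes that end with
    # the delimiter and contain further objects.
    return names, sorted(blobs.prefixes)
```

For example, with prefix set to "a/" and the default delimiter, an object named a/1.txt is returned as a name, while an object named a/b/2.txt is represented only by the prefix a/b/.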

Deleting files in Cloud Storage

The following code shows how to delete a file from Cloud Storage using the blob.delete() method.

from google.cloud import storage


def delete_blob(bucket_name, blob_name):
    """Deletes a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # blob_name = "your-object-name"

    storage_client = storage.Client()

    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(blob_name)
    generation_match_precondition = None

    # Optional: set a generation-match precondition to avoid potential race conditions
    # and data corruptions. The request to delete is aborted if the object's
    # generation number does not match your precondition.
    blob.reload()  # Fetch blob metadata to use in generation_match_precondition.
    generation_match_precondition = blob.generation

    blob.delete(if_generation_match=generation_match_precondition)

    print(f"Blob {blob_name} deleted.")

This sample cleans up the files that were written to the bucket in the Writing to Cloud Storage section.
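When several files were written under a common name prefix, the cleanup can be sketched as a loop over a listing. The helper delete_prefix is hypothetical, and the client argument is expected to behave like storage.Client().

```python
def delete_prefix(client, bucket_name, prefix):
    """Delete every object whose name starts with prefix.

    `client` is expected to behave like google.cloud.storage.Client.
    Returns the names of the deleted objects.
    """
    deleted = []
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        # Blobs returned by a listing carry their generation number, so the
        # precondition aborts the delete if the object was replaced meanwhile.
        blob.delete(if_generation_match=blob.generation)
        deleted.append(blob.name)
    return deleted
```

The loop stops at the first failed delete because the exception is left to propagate; whether to continue past failures instead is a design choice for the caller.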

Next steps