Class Blob (1.39.0)

Blob(
    name,
    bucket,
    chunk_size=None,
    encryption_key=None,
    kms_key_name=None,
    generation=None,
)

A wrapper around Cloud Storage's concept of an Object.

Parameters

Name Description
name str

The name of the blob. This corresponds to the unique path of the object in the bucket. If bytes, will be converted to a unicode object. Blob / object names can contain any sequence of valid unicode characters, of length 1-1024 bytes when UTF-8 encoded.

bucket Bucket

The bucket to which this blob belongs.

chunk_size int

(Optional) The size of a chunk of data whenever iterating (in bytes). This must be a multiple of 256 KB per the API specification. If not specified, the chunk_size of the blob itself is used. If that is not specified, a default value of 40 MB is used.

encryption_key bytes

(Optional) 32 byte encryption key for customer-supplied encryption. See https://cloud.google.com/storage/docs/encryption#customer-supplied.

kms_key_name str

(Optional) Resource name of Cloud KMS key used to encrypt the blob's contents.

generation long

(Optional) If present, selects a specific revision of this object.
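The limits stated for name, chunk_size and encryption_key can be checked before a Blob is constructed. The following is only a sketch of such a check: check_blob_args and CHUNK_MULTIPLE are hypothetical names, not part of the library, and the requirement that chunk_size be positive is an assumption.

```python
# Hypothetical helper; not part of google-cloud-storage.
CHUNK_MULTIPLE = 256 * 1024  # 256 KB, the multiple named in the chunk_size description


def check_blob_args(name, chunk_size=None, encryption_key=None):
    """Return the name as text after checking the documented limits."""
    if isinstance(name, bytes):
        # The description says bytes names are converted to unicode.
        name = name.decode("utf-8")
    encoded_length = len(name.encode("utf-8"))
    if not 1 <= encoded_length <= 1024:
        raise ValueError("name must be 1-1024 bytes when UTF-8 encoded")
    # Assumption: a usable chunk size is also strictly positive.
    if chunk_size is not None and (chunk_size <= 0 or chunk_size % CHUNK_MULTIPLE != 0):
        raise ValueError("chunk_size must be a positive multiple of 256 KB")
    if encryption_key is not None and len(encryption_key) != 32:
        raise ValueError("encryption_key must be 32 bytes")
    return name
```

A caller could run this check first and then pass the returned name to Blob(name, bucket).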

Properties

acl

Create our ACL on demand.

bucket

Bucket which contains the object.

Returns
Type Description
Bucket

The object's bucket.

cache_control

Scalar property getter.

chunk_size

Get the blob's default chunk size.

Returns
Type Description
int or NoneType

The current blob's chunk size, if it is set.

client

The client bound to this blob.

component_count

Number of underlying components that make up this object.

See https://cloud.google.com/storage/docs/json_api/v1/objects

Returns
Type Description
int or NoneType

The component count (in case of a composed object) or None if the blob's resource has not been loaded from the server. This property will not be set on objects not created via compose.

content_disposition

Scalar property getter.

content_encoding

Scalar property getter.

content_language

Scalar property getter.

content_type

Scalar property getter.

crc32c

Scalar property getter.

custom_time

Retrieve the custom time for the object.

See https://cloud.google.com/storage/docs/json_api/v1/objects

Returns
Type Description
datetime.datetime or NoneType

Datetime object parsed from RFC3339 valid timestamp, or None if the blob's resource has not been loaded from the server (see reload).

encryption_key

Retrieve the customer-supplied encryption key for the object.

Returns
Type Description
bytes or NoneType

The encryption key or None if no customer-supplied encryption key was used, or the blob's resource has not been loaded from the server.

etag

Retrieve the ETag for the object.

See RFC 2616 (etags) (https://tools.ietf.org/html/rfc2616#section-3.11) and the API reference docs.

Returns
Type Description
str or NoneType

The blob etag or None if the blob's resource has not been loaded from the server.

event_based_hold

Scalar property getter.

generation

Retrieve the generation for the object.

See https://cloud.google.com/storage/docs/json_api/v1/objects

Returns
Type Description
int or NoneType

The generation of the blob or None if the blob's resource has not been loaded from the server.
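Several of the properties above return None until the blob's resource has been loaded from the server. A minimal sketch of handling that case follows; generation_or_reload is a hypothetical helper, and it assumes blob is a Blob (or any object with a generation attribute and a reload method) whose bucket has a usable client.

```python
def generation_or_reload(blob):
    """Return blob.generation, reloading once if it has not been loaded yet."""
    if blob.generation is None:
        # reload() fetches the object's metadata from the server.
        blob.reload()
    return blob.generation
```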

Methods

Blob

Blob(
    name,
    bucket,
    chunk_size=None,
    encryption_key=None,
    kms_key_name=None,
    generation=None,
)

property name: Get the blob's name.

compose

compose(sources, client=None, timeout=60, if_generation_match=None, if_metageneration_match=None, if_source_generation_match=None, retry=<google.cloud.storage.retry.ConditionalRetryPolicy object>)

Concatenate source blobs into this one.

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
sources list of Blob

Blobs whose contents will be composed into this blob.

client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

if_generation_match long

(Optional) Makes the operation conditional on whether the destination object's current generation matches the given value. Setting to 0 makes the operation succeed only if there are no live versions of the object.

Note: In a previous version, this argument worked identically to the if_source_generation_match argument. For backwards-compatibility reasons, if a list is passed in, this argument will behave like if_source_generation_match and also issue a DeprecationWarning.

if_metageneration_match long

(Optional) Makes the operation conditional on whether the destination object's current metageneration matches the given value. If a list of long is passed in, no match operation will be performed. (Deprecated: a list of long is supported for backwards-compatibility reasons only.)

if_source_generation_match list of long

(Optional) Makes the operation conditional on whether the current generation of each source blob matches the corresponding generation. The list must match sources item-to-item.

retry google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy

(Optional) How to retry the RPC. See: configuring_retries

Example

Compose blobs using source generation match preconditions.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> bucket = client.bucket("bucket-name")
>>> blobs = [bucket.blob("blob-name-1"), bucket.blob("blob-name-2")]
>>> if_source_generation_match = [None] * len(blobs)
>>> if_source_generation_match[0] = "123"  # precondition for "blob-name-1"
>>> composed_blob = bucket.blob("composed-name")
>>> composed_blob.compose(blobs, if_source_generation_match=if_source_generation_match)
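Because if_source_generation_match must line up item-to-item with sources, it can help to derive the list from the source names instead of filling it in by index. This is a sketch only: source_preconditions is a hypothetical helper, and it assumes that None at a position means no precondition for that source, as the example above suggests.

```python
def source_preconditions(source_names, pinned_generations):
    """Build a list aligned item-to-item with source_names.

    pinned_generations maps a source name to the generation it must have.
    Sources without an entry get None.
    """
    return [pinned_generations.get(name) for name in source_names]


names = ["blob-name-1", "blob-name-2"]
preconditions = source_preconditions(names, {"blob-name-1": 123})
```

The resulting list has the same length and order as names, so it can be passed as if_source_generation_match alongside blobs built from those names.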

create_resumable_upload_session

create_resumable_upload_session(
    content_type=None, size=None, origin=None, client=None, timeout=60, checksum=None
)

Create a resumable upload session.

Resumable upload sessions allow you to start an upload session from one client and complete the session in another. This method is called by the initiator to set the metadata and limits. The initiator then passes the session URL to the client that will upload the binary data. The client performs a PUT request on the session URL to complete the upload. This process allows untrusted clients to upload to an access-controlled bucket. For more details, see the documentation on signed URLs: https://cloud.google.com/storage/docs/access-control/signed-urls#signing-resumable

The content type of the upload will be determined in order of precedence:

  • The value passed in to this method (if not None)
  • The value stored on the current blob
  • The default value ('application/octet-stream')
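The order of precedence listed above can be written as a small function. This is an illustrative sketch, not the library's code; resolve_content_type and DEFAULT_CONTENT_TYPE are hypothetical names.

```python
DEFAULT_CONTENT_TYPE = "application/octet-stream"


def resolve_content_type(explicit, stored):
    """Pick the content type in the documented order of precedence."""
    if explicit is not None:
        return explicit  # value passed in to the method
    if stored is not None:
        return stored  # value stored on the current blob
    return DEFAULT_CONTENT_TYPE
```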

See the object versioning (https://cloud.google.com/storage/docs/object-versioning) and lifecycle (https://cloud.google.com/storage/docs/lifecycle) API documents for details.

If encryption_key is set, the blob will be encrypted with a customer-supplied encryption key.

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
size int

(Optional) The maximum number of bytes that can be uploaded using this session. If the size is not known when creating the session, this should be left blank.

content_type str

(Optional) Type of content being uploaded.

origin str

(Optional) If set, the upload can only be completed by a user-agent that uploads from the given origin. This can be useful when passing the session to a web client.

client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

checksum str

(Optional) The type of checksum to compute to verify the integrity of the object. After the upload is complete, the server-computed checksum of the resulting object will be checked and google.resumable_media.common.DataCorruption will be raised on a mismatch. On a validation failure, the client will attempt to delete the uploaded object automatically. Supported values are "md5", "crc32c" and None. The default is None.

Exceptions
Type Description
GoogleCloudError

If the session creation response returns an error status.

Returns
Type Description
str

The resumable upload session URL. The upload can be completed by making an HTTP PUT request with the file's contents.
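The uploading client only needs the session URL and the bytes to send. A minimal sketch of preparing that PUT request with the standard library follows; build_upload_request is a hypothetical helper, and the request is built but not sent.

```python
import urllib.request


def build_upload_request(session_url, data):
    """Prepare (but do not send) the PUT request that completes the upload."""
    return urllib.request.Request(session_url, data=data, method="PUT")
```

Passing the returned object to urllib.request.urlopen would perform the upload. Any additional headers the service expects are outside the scope of this sketch.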

delete

delete(client=None, if_generation_match=None, if_generation_not_match=None, if_metageneration_match=None, if_metageneration_not_match=None, timeout=60, retry=<google.cloud.storage.retry.ConditionalRetryPolicy object>)

Deletes a blob from Cloud Storage.

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

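if_generation_match long

(Optional) Makes the operation conditional on whether the blob's current generation matches the given value. See: using-if-generation-match

if_generation_not_match long

(Optional) Makes the operation conditional on whether the blob's current generation does not match the given value. See: using-if-generation-not-match

if_metageneration_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration matches the given value. See: using-if-metageneration-match

if_metageneration_not_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration does not match the given value. See: using-if-metageneration-not-match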

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

retry google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy

(Optional) How to retry the RPC. See: configuring_retries

Exceptions
Type Description
NotFound

(Propagated from delete_blob.)
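A generation precondition lets a delete apply only to the version of the object that was last seen. The sketch below assumes blob is a Blob (or an object with the same generation, reload and delete members); delete_if_unchanged is a hypothetical helper.

```python
def delete_if_unchanged(blob):
    """Delete the blob only if its generation still matches the one we loaded."""
    if blob.generation is None:
        blob.reload()  # load metadata so that a generation is available
    blob.delete(if_generation_match=blob.generation)
```

If the object was overwritten in the meantime, the precondition no longer holds and the request is expected to fail instead of deleting the newer version.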

download_as_bytes

download_as_bytes(client=None, start=None, end=None, raw_download=False, if_generation_match=None, if_generation_not_match=None, if_metageneration_match=None, if_metageneration_not_match=None, timeout=60, checksum='md5', retry=<google.api_core.retry.Retry object>)

Download the contents of this blob as a bytes object.

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

start int

(Optional) The first byte in a range to be downloaded.

end int

(Optional) The last byte in a range to be downloaded.

raw_download bool

(Optional) If true, download the object without any expansion.

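if_generation_match long

(Optional) Makes the operation conditional on whether the blob's current generation matches the given value. See: using-if-generation-match

if_generation_not_match long

(Optional) Makes the operation conditional on whether the blob's current generation does not match the given value. See: using-if-generation-not-match

if_metageneration_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration matches the given value. See: using-if-metageneration-match

if_metageneration_not_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration does not match the given value. See: using-if-metageneration-not-match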

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

checksum str

(Optional) The type of checksum to compute to verify the integrity of the object. The response headers must contain a checksum of the requested type. If the headers lack an appropriate checksum (for instance in the case of transcoded or ranged downloads where the remote service does not know the correct checksum, including downloads where chunk_size is set) an INFO-level log will be emitted. Supported values are "md5", "crc32c" and None. The default is "md5".

retry google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy

(Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options. A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set. See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them. Media operations (downloads and uploads) do not support non-default predicates in a Retry object. The default will always be used. Other configuration changes for Retry objects such as delays and deadlines are respected.

Exceptions
Type Description
NotFound
Returns
Type Description
bytes

The data stored in this blob.
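The start and end parameters name the first and last byte of a range, which this sketch reads as an inclusive end. Under that assumption, the byte ranges for reading an object piece by piece can be computed as follows; inclusive_ranges is a hypothetical helper.

```python
def inclusive_ranges(total_size, step):
    """Yield (start, end) pairs covering total_size bytes, with end inclusive."""
    for start in range(0, total_size, step):
        yield start, min(start + step, total_size) - 1
```

Each pair could then be passed on as blob.download_as_bytes(start=start, end=end).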

download_as_string

download_as_string(client=None, start=None, end=None, raw_download=False, if_generation_match=None, if_generation_not_match=None, if_metageneration_match=None, if_metageneration_not_match=None, timeout=60, retry=<google.api_core.retry.Retry object>)

(Deprecated) Download the contents of this blob as a bytes object. Use download_as_bytes instead.

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

start int

(Optional) The first byte in a range to be downloaded.

end int

(Optional) The last byte in a range to be downloaded.

raw_download bool

(Optional) If true, download the object without any expansion.

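if_generation_match long

(Optional) Makes the operation conditional on whether the blob's current generation matches the given value. See: using-if-generation-match

if_generation_not_match long

(Optional) Makes the operation conditional on whether the blob's current generation does not match the given value. See: using-if-generation-not-match

if_metageneration_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration matches the given value. See: using-if-metageneration-match

if_metageneration_not_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration does not match the given value. See: using-if-metageneration-not-match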

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

retry google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy

(Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options. A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set. See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them. Media operations (downloads and uploads) do not support non-default predicates in a Retry object. The default will always be used. Other configuration changes for Retry objects such as delays and deadlines are respected.

Exceptions
Type Description
NotFound
Returns
Type Description
bytes

The data stored in this blob.
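The retry description says a conditional policy wraps a Retry object and activates it only if certain conditions are met, such as if_metageneration_match being set. The class below is a hypothetical stand-in that illustrates the idea; it is not the library's ConditionalRetryPolicy, and its names and constructor are assumptions.

```python
class ConditionalRetrySketch:
    """Hypothetical stand-in, not the library's ConditionalRetryPolicy."""

    def __init__(self, retry, required_kwargs):
        self.retry = retry
        self.required_kwargs = required_kwargs

    def resolve(self, **call_kwargs):
        """Return the wrapped retry only if a required precondition is set."""
        for key in self.required_kwargs:
            if call_kwargs.get(key) is not None:
                return self.retry
        return None  # None disables retries, as the retry parameter describes


# "DEFAULT" is a placeholder for a real Retry object.
policy = ConditionalRetrySketch(retry="DEFAULT", required_kwargs=["if_metageneration_match"])
```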

download_as_text

download_as_text(client=None, start=None, end=None, raw_download=False, encoding=None, if_generation_match=None, if_generation_not_match=None, if_metageneration_match=None, if_metageneration_not_match=None, timeout=60, retry=<google.api_core.retry.Retry object>)

Download the contents of this blob as text (not bytes).

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

start int

(Optional) The first byte in a range to be downloaded.

end int

(Optional) The last byte in a range to be downloaded.

raw_download bool

(Optional) If true, download the object without any expansion.

encoding str

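(Optional) Encoding to be used to decode the downloaded bytes. Defaults to the charset param of content_type, or else to "utf-8".

if_generation_match long

(Optional) Makes the operation conditional on whether the blob's current generation matches the given value. See: using-if-generation-match

if_generation_not_match long

(Optional) Makes the operation conditional on whether the blob's current generation does not match the given value. See: using-if-generation-not-match

if_metageneration_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration matches the given value. See: using-if-metageneration-match

if_metageneration_not_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration does not match the given value. See: using-if-metageneration-not-match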

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

retry google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy

(Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options. A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set. See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them. Media operations (downloads and uploads) do not support non-default predicates in a Retry object. The default will always be used. Other configuration changes for Retry objects such as delays and deadlines are respected.

Returns
Type Description
text

The data stored in this blob, decoded to text.
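The fallback order described for encoding (explicit argument, then the charset parameter of content_type, then "utf-8") can be sketched with the standard library. resolve_encoding is a hypothetical helper and not necessarily how the library parses the header.

```python
from email.message import Message


def resolve_encoding(encoding, content_type):
    """Pick the text encoding in the documented fallback order."""
    if encoding is not None:
        return encoding
    if content_type:
        header = Message()
        header["Content-Type"] = content_type
        charset = header.get_param("charset")  # None when no charset is given
        if charset:
            return charset
    return "utf-8"
```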

download_to_file

download_to_file(file_obj, client=None, start=None, end=None, raw_download=False, if_generation_match=None, if_generation_not_match=None, if_metageneration_match=None, if_metageneration_not_match=None, timeout=60, checksum='md5', retry=<google.api_core.retry.Retry object>)

DEPRECATED. Download the contents of this blob into a file-like object.

The encryption_key should be a str or bytes with a length of at least 32.

If the blob's chunk_size is None, the data is downloaded in a single request; otherwise it is downloaded in requests of chunk_size bytes each.

For more fine-grained control over the download process, check out google-resumable-media. For example, this library allows downloading parts of a blob rather than the whole thing.

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
file_obj file

A file handle to which to write the blob's data.

client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

start int

(Optional) The first byte in a range to be downloaded.

end int

(Optional) The last byte in a range to be downloaded.

raw_download bool

(Optional) If true, download the object without any expansion.

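if_generation_match long

(Optional) Makes the operation conditional on whether the blob's current generation matches the given value. See: using-if-generation-match

if_generation_not_match long

(Optional) Makes the operation conditional on whether the blob's current generation does not match the given value. See: using-if-generation-not-match

if_metageneration_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration matches the given value. See: using-if-metageneration-match

if_metageneration_not_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration does not match the given value. See: using-if-metageneration-not-match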

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

checksum str

(Optional) The type of checksum to compute to verify the integrity of the object. The response headers must contain a checksum of the requested type. If the headers lack an appropriate checksum (for instance in the case of transcoded or ranged downloads where the remote service does not know the correct checksum, including downloads where chunk_size is set) an INFO-level log will be emitted. Supported values are "md5", "crc32c" and None. The default is "md5".

retry google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy

(Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options. A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set. See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them. Media operations (downloads and uploads) do not support non-default predicates in a Retry object. The default will always be used. Other configuration changes for Retry objects such as delays and deadlines are respected.

Exceptions
Type Description
NotFound
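Since file_obj only has to be a writable file handle, an in-memory buffer also works. A minimal sketch, assuming blob is a Blob (or any object with a compatible download_to_file method); download_to_buffer is a hypothetical helper.

```python
import io


def download_to_buffer(blob):
    """Download the blob into memory and return a buffer positioned at the start."""
    buffer = io.BytesIO()
    blob.download_to_file(buffer)
    buffer.seek(0)  # rewind so the caller can read from the beginning
    return buffer
```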

download_to_filename

download_to_filename(filename, client=None, start=None, end=None, raw_download=False, if_generation_match=None, if_generation_not_match=None, if_metageneration_match=None, if_metageneration_not_match=None, timeout=60, checksum='md5', retry=<google.api_core.retry.Retry object>)

Download the contents of this blob into a named file.

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
filename str

A filename to be passed to open.

client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

start int

(Optional) The first byte in a range to be downloaded.

end int

(Optional) The last byte in a range to be downloaded.

raw_download bool

(Optional) If true, download the object without any expansion.

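if_generation_match long

(Optional) Makes the operation conditional on whether the blob's current generation matches the given value. See: using-if-generation-match

if_generation_not_match long

(Optional) Makes the operation conditional on whether the blob's current generation does not match the given value. See: using-if-generation-not-match

if_metageneration_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration matches the given value. See: using-if-metageneration-match

if_metageneration_not_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration does not match the given value. See: using-if-metageneration-not-match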

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

checksum str

(Optional) The type of checksum to compute to verify the integrity of the object. The response headers must contain a checksum of the requested type. If the headers lack an appropriate checksum (for instance in the case of transcoded or ranged downloads where the remote service does not know the correct checksum, including downloads where chunk_size is set) an INFO-level log will be emitted. Supported values are "md5", "crc32c" and None. The default is "md5".

retry google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy

(Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options. A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set. See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them. Media operations (downloads and uploads) do not support non-default predicates in a Retry object. The default will always be used. Other configuration changes for Retry objects such as delays and deadlines are respected.

Exceptions
Type Description
NotFound

exists

exists(client=None, if_generation_match=None, if_generation_not_match=None, if_metageneration_match=None, if_metageneration_not_match=None, timeout=60, retry=<google.api_core.retry.Retry object>)

Determines whether or not this blob exists.

If user_project is set on the bucket, bills the API request to that project.

Parameters
Name Description
client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

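if_generation_match long

(Optional) Makes the operation conditional on whether the blob's current generation matches the given value. See: using-if-generation-match

if_generation_not_match long

(Optional) Makes the operation conditional on whether the blob's current generation does not match the given value. See: using-if-generation-not-match

if_metageneration_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration matches the given value. See: using-if-metageneration-match

if_metageneration_not_match long

(Optional) Makes the operation conditional on whether the blob's current metageneration does not match the given value. See: using-if-metageneration-not-match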

timeout float or tuple

(Optional) The amount of time, in seconds, to wait for the server response. See: configuring_timeouts

retry google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy

(Optional) How to retry the RPC. See: configuring_retries

Returns
Type Description
bool

True if the blob exists in Cloud Storage.

from_string

from_string(uri, client=None)

Get a constructor for blob object by URI.

Parameters
Name Description
uri str

The blob URI used to create the blob object.

client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

Returns
Type Description
Blob

The blob object created.

Example

Get a constructor for blob object by URI.

>>> from google.cloud import storage
>>> from google.cloud.storage.blob import Blob
>>> client = storage.Client()
>>> blob = Blob.from_string("gs://bucket/object")
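To make the gs://bucket/object form concrete, the following sketch splits such a URI into its bucket and object names. split_gs_uri is a hypothetical helper and is not necessarily how from_string parses its argument.

```python
def split_gs_uri(uri):
    """Split 'gs://bucket/object' into (bucket_name, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError("expected a URI of the form gs://bucket/object")
    # Everything up to the first slash is the bucket; the rest is the object name.
    bucket_name, _, object_name = uri[len(prefix):].partition("/")
    if not bucket_name or not object_name:
        raise ValueError("expected a URI of the form gs://bucket/object")
    return bucket_name, object_name
```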

generate_signed_url

generate_signed_url(
    expiration=None,
    api_access_endpoint="https://storage.googleapis.com",
    method="GET",
    content_md5=None,
    content_type=None,
    response_disposition=None,
    response_type=None,
    generation=None,
    headers=None,
    query_parameters=None,
    client=None,
    credentials=None,
    version=None,
    service_account_email=None,
    access_token=None,
    virtual_hosted_style=False,
    bucket_bound_hostname=None,
    scheme="http",
)

Generates a signed URL for this blob.

If you have a blob that you want to allow access to for a set amount of time, you can use this method to generate a URL that is only valid within a certain time period.

If bucket_bound_hostname is set as an argument of api_access_endpoint, https works only if using a CDN.

Example

Generates a signed URL for this blob using bucket_bound_hostname and scheme.

    from google.cloud import storage

    client = storage.Client()
    bucket = client.get_bucket('my-bucket-name')
    blob = bucket.get_blob('my-blob-name')
    url = blob.generate_signed_url(expiration='url-expiration-time', bucket_bound_hostname='mydomain.tld', version='v4')
    # If using CDN
    url = blob.generate_signed_url(expiration='url-expiration-time', bucket_bound_hostname='mydomain.tld', version='v4', scheme='https')

This is particularly useful if you don't want publicly accessible blobs, but don't want to require users to explicitly log in.
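The bucket_bound_hostname value may be a bare hostname or may already carry a scheme, and scheme applies only to the bare form. A sketch of how the two arguments could combine into a base URL follows; bucket_bound_base_url is a hypothetical helper and not the library's implementation.

```python
def bucket_bound_base_url(bucket_bound_hostname, scheme="http"):
    """Return the base URL implied by a bucket-bound hostname and a scheme."""
    if "://" in bucket_bound_hostname:
        return bucket_bound_hostname  # the hostname already includes a scheme
    return "{}://{}".format(scheme, bucket_bound_hostname)
```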

Parameters
Name Description
expiration Union[Integer, datetime.datetime, datetime.timedelta]

Point in time when the signed URL should expire. If a datetime instance is passed without an explicit tzinfo set, it will be assumed to be UTC.

api_access_endpoint str

(Optional) URI base.

method str

The HTTP verb that will be used when requesting the URL.

content_md5 str

(Optional) The MD5 hash of the object referenced by resource.

content_type str

(Optional) The content type of the object referenced by resource.

response_disposition str

(Optional) Content disposition of responses to requests for the signed URL. For example, to enable the signed URL to initiate a download of blob.png, use the value 'attachment; filename=blob.png'.

response_type str

(Optional) Content type of responses to requests for the signed URL. Ignored if content_type is set on object/blob metadata.

generation str

(Optional) A value that indicates which generation of the resource to fetch.

headers dict

(Optional) Additional HTTP headers to be included as part of the signed URLs. See: https://cloud.google.com/storage/docs/xml-api/reference-headers Requests using the signed URL must pass the specified header (name and value) with each request for the URL.

query_parameters dict

(Optional) Additional query parameters to be included as part of the signed URLs. See: https://cloud.google.com/storage/docs/xml-api/reference-headers#query

client Client

(Optional) The client to use. If not passed, falls back to the client stored on the blob's bucket.

credentials google.auth.credentials.Credentials

(Optional) The authorization credentials to attach to requests. These credentials identify this application to the service. If none are specified, the client will attempt to ascertain the credentials from the environment.

version str

(Optional) The version of signed credential to create. Must be one of 'v2' or 'v4'.

service_account_email str

(Optional) E-mail address of the service account.

access_token str

(Optional) Access token for a service account.

virtual_hosted_style bool

(Optional) If true, then construct the URL relative to the bucket's virtual hostname, e.g., '

bucket_bound_hostname str

(Optional) If passed, then construct the URL relative to the bucket-bound hostname. The value can be a bare hostname or include a scheme, e.g., 'example.com' or 'http://example.com'. See: https://cloud.google.com/storage/docs/request-endpoints#cname

scheme str

(Optional) If bucket_bound_hostname is passed as a bare hostname, use this value as the scheme. https will work only when using a CDN. Defaults to "http".

Exceptions
Type Description
ValueError

When version is invalid.

TypeError

When expiration is not a valid type.

AttributeError

If credentials is not an instance of google.auth.credentials.Signing.

Returns
Type Description
str

A signed URL you can use to access the resource until expiration.
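The expiration parameter accepts several types, and a datetime without tzinfo is assumed to be UTC. The sketch below normalizes the datetime and timedelta cases to a timezone-aware UTC datetime. expiration_as_utc is a hypothetical helper, treating a timedelta as relative to the current time is an assumption, and integer values are left out of this sketch.

```python
import datetime


def expiration_as_utc(expiration, now=None):
    """Normalize a datetime or timedelta expiration to an aware UTC datetime."""
    if isinstance(expiration, datetime.timedelta):
        # Assumption: a timedelta is measured from the current time.
        if now is None:
            now = datetime.datetime.now(datetime.timezone.utc)
        return now + expiration
    if isinstance(expiration, datetime.datetime):
        if expiration.tzinfo is None:
            # Naive datetimes are assumed to be UTC.
            return expiration.replace(tzinfo=datetime.timezone.utc)
        return expiration.astimezone(datetime.timezone.utc)
    raise TypeError("expiration must be a datetime or timedelta in this sketch")
```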