Composite objects and parallel uploads

To support parallel uploads and limited append/edit functionality, Cloud Storage allows users to compose up to 32 existing objects into a new object without transferring additional object data.

Compose Operation

The compose operation creates a new object whose contents are the concatenation of a given sequence of up to 32 source objects. The source objects must all have the same storage class and be stored in the same bucket. The source objects can themselves be composite objects.

The source objects are unaffected by the composition process, and the resulting composite object does not change if the source objects are replaced or deleted.

Component Count Property

Each object maintains a component count property, which specifies the number of originally uploaded objects from which it was created. Composing a sequence of objects creates an object whose component count is the sum of the component counts of the composite objects in the sequence, plus 1 for each non-composite object in the sequence. For example, if you perform a compose operation where the first 2 source objects are non-composite objects and the third is a composite object with a component count of 12, the resulting composite object has a component count of 14.

While there is no limit to the number of components that a composite object can contain, the componentCount metadata property of an object saturates at 2,147,483,647. For example, say you have an object that contains 3,000,000,000 components. In this case, the componentCount for the object has a value of 2,147,483,647.
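The counting rule and the saturation behavior described above can be sketched as follows. This is an illustration only: the function names and the convention of passing None for a non-composite object are assumptions made for the example, not part of any Cloud Storage API.

```python
MAX_REPORTED_COUNT = 2_147_483_647  # value at which componentCount saturates


def composed_component_count(source_counts):
    """Component count of a composite built from the given source objects.

    source_counts holds one entry per source object: the component count
    of a composite source, or None for a non-composite source.
    """
    return sum(1 if count is None else count for count in source_counts)


def reported_component_count(actual_count):
    """Value reported in componentCount for an object with actual_count components."""
    return min(actual_count, MAX_REPORTED_COUNT)


# Two non-composite objects followed by a composite with 12 components.
print(composed_component_count([None, None, 12]))  # 14
# An object with 3,000,000,000 components reports the saturated value.
print(reported_component_count(3_000_000_000))     # 2147483647
```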

Integrity Checking Composite Objects

Cloud Storage uses CRC32C for integrity checking each component object at upload time, and for allowing the caller to perform an integrity check of the resulting composite object when it is downloaded. CRC32C is an error detecting code that can be efficiently calculated from the CRC32C values of its components. Your application should use CRC32C as follows:

  • When uploading component objects, you should calculate the CRC32C for each object using a CRC32C library such as one of those listed below, and include that value in your request.
  • For the compose operation, you should include a CRC32C in the request. Cloud Storage will respond with the CRC32C of the composite object. Cloud Storage will not calculate MD5 values for composite objects.
  • At download time, you should calculate the CRC32C of the downloaded object, and compare that with the value included in the response.
  • If your application could change component objects between the time of uploading and composing those objects, you should specify generation-specific names for the source objects to avoid race conditions.

Libraries for computing CRC32C values include Boost for C++, GoogleCloudPlatform crc32c for Java, crcmod for Python, and digest-crc for Ruby. Note also that CRC32C is supported in hardware in current Intel CPUs.
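To illustrate why the CRC32C of a composite can be derived from the CRC32C values of its components, here is a minimal pure-Python sketch. It is slow and meant only for demonstration; in practice one of the libraries above would be used. The function names are hypothetical, and the combine step is an adaptation of the well-known zlib crc32_combine technique with the CRC32C (Castagnoli) polynomial substituted, not necessarily what Cloud Storage does internally.

```python
POLY = 0x82F63B78  # reflected CRC32C (Castagnoli) polynomial


def crc32c(data, crc=0):
    """Bitwise CRC32C of data; pass a previous result as crc to continue."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def _matrix_times(matrix, vector):
    """Multiply a 32x32 GF(2) matrix (list of rows) by a 32-bit vector."""
    total, index = 0, 0
    while vector:
        if vector & 1:
            total ^= matrix[index]
        vector >>= 1
        index += 1
    return total


def _matrix_square(matrix):
    return [_matrix_times(matrix, row) for row in matrix]


def crc32c_combine(crc1, crc2, len2):
    """CRC32C of A + B, given crc32c(A), crc32c(B) and len(B) in bytes."""
    if len2 <= 0:
        return crc1
    odd = [POLY] + [1 << n for n in range(31)]  # operator for one zero bit
    even = _matrix_square(odd)                  # two zero bits
    odd = _matrix_square(even)                  # four zero bits
    while True:
        even = _matrix_square(odd)              # first pass: one zero byte
        if len2 & 1:
            crc1 = _matrix_times(even, crc1)
        len2 >>= 1
        if len2 == 0:
            break
        odd = _matrix_square(even)
        if len2 & 1:
            crc1 = _matrix_times(odd, crc1)
        len2 >>= 1
        if len2 == 0:
            break
    return crc1 ^ crc2


# Combine per-component checksums without touching the component data again.
parts = [b"first chunk, ", b"second chunk, ", b"third chunk"]
combined = crc32c(parts[0])
for part in parts[1:]:
    combined = crc32c_combine(combined, crc32c(part), len(part))

print(combined == crc32c(b"".join(parts)))  # True
```

The same comparison, made against the checksum that Cloud Storage returns for the composite object, is what the upload and download checks above amount to.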

In the past, Cloud Storage used MD5 to construct the ETag value. This does not hold for composite objects: client code should make no assumptions about composite object ETags, except that they change whenever the underlying object changes, per the IETF specification for HTTP/1.1.

Parallel Uploads

Object composition can be used for uploading an object in parallel: simply divide your data into multiple chunks, upload each chunk to a distinct object in parallel, compose your final object, and delete any temporary objects.

In order to protect against changes to component objects between the upload and compose requests, users should provide an expected generation number for each component. For more information about object generations, see Generations and Preconditions.
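The parallel upload flow, including the generation check on each component, can be simulated locally as follows. This is a sketch under stated assumptions: FakeBucket is a hypothetical in-memory stand-in, not a Cloud Storage client, and its method names, the temporary object naming scheme, and the chunk size are all invented for illustration. It also handles at most 32 chunks, the limit for a single compose request.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_SOURCES = 32  # a single compose request takes at most 32 source objects


class FakeBucket:
    """In-memory stand-in for a bucket (hypothetical, not a real client)."""

    def __init__(self):
        self._objects = {}  # name -> (generation, data)
        self._lock = threading.Lock()
        self._generation = 0

    def upload(self, name, data):
        """Store data under name and return the new generation number."""
        with self._lock:
            self._generation += 1
            self._objects[name] = (self._generation, data)
            return self._generation

    def compose(self, sources, destination):
        """Concatenate (name, expected_generation) sources into destination."""
        if not 1 <= len(sources) <= MAX_SOURCES:
            raise ValueError("compose takes between 1 and 32 source objects")
        with self._lock:
            parts = []
            for name, expected in sources:
                generation, data = self._objects[name]
                if generation != expected:
                    raise RuntimeError(f"{name} changed after it was uploaded")
                parts.append(data)
            self._generation += 1
            self._objects[destination] = (self._generation, b"".join(parts))
            return self._generation

    def delete(self, name):
        with self._lock:
            del self._objects[name]

    def read(self, name):
        with self._lock:
            return self._objects[name][1]

    def names(self):
        with self._lock:
            return sorted(self._objects)


def parallel_upload(bucket, name, data, chunk_size):
    """Upload data in parallel chunks, compose them, then clean up."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    if len(chunks) > MAX_SOURCES:
        raise ValueError("this sketch handles at most 32 chunks")
    temp_names = [f"{name}.part-{i:02d}" for i in range(len(chunks))]
    with ThreadPoolExecutor(max_workers=8) as pool:
        # Each upload returns the generation to expect at compose time.
        generations = list(pool.map(bucket.upload, temp_names, chunks))
    bucket.compose(list(zip(temp_names, generations)), name)
    for temp_name in temp_names:
        bucket.delete(temp_name)


bucket = FakeBucket()
payload = bytes(range(256)) * 10  # 2,560 bytes of sample data
parallel_upload(bucket, "big-object", payload, chunk_size=100)
print(bucket.read("big-object") == payload)  # True
print(bucket.names())                        # ['big-object']
```

Data that splits into more than 32 chunks would need several rounds of composition, since composite objects can themselves be used as source objects.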

Limited Append and Edit

You can also use the compose operation to accomplish limited object appends and edits.

Append is accomplished by uploading the new data to a temporary object, composing the object you wish to append to with this temporary object, optionally giving the output of the compose operation the same name as the original object, and deleting the temporary object.

You can also use composition to support a basic flavor of object editing. For example, you could compose an object X from the sequence {Y1, Y2, Y3}, replace the contents of Y2, and recompose X from those same components. Note that this requires that Y1, Y2, and Y3 be left undeleted, so you will be billed for those components as well as for the composite.
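Both patterns can be sketched with a plain dictionary standing in for a bucket. The compose and append helpers below are hypothetical and only mimic the semantics described in this section: concatenation of up to 32 sources, and a composite that stays unchanged until it is recomposed.

```python
def compose(bucket, source_names, destination):
    """Concatenate existing objects into destination (at most 32 sources)."""
    if not 1 <= len(source_names) <= 32:
        raise ValueError("compose takes between 1 and 32 source objects")
    bucket[destination] = b"".join(bucket[name] for name in source_names)


def append(bucket, name, new_data, temp_name="tmp-append"):
    """Append new_data to the object called name via a temporary object."""
    bucket[temp_name] = new_data              # upload the new data
    compose(bucket, [name, temp_name], name)  # output reuses the original name
    del bucket[temp_name]                     # delete the temporary object


# Append: the object keeps its name and gains the new data at the end.
bucket = {"log": b"line 1\n"}
append(bucket, "log", b"line 2\n")
print(bucket["log"])  # b'line 1\nline 2\n'

# Edit: X is built from Y1, Y2, Y3; replacing Y2 has no effect on X
# until X is recomposed from the same components.
bucket.update({"Y1": b"AAA", "Y2": b"BBB", "Y3": b"CCC"})
compose(bucket, ["Y1", "Y2", "Y3"], "X")
bucket["Y2"] = b"bbb"
print(bucket["X"])    # b'AAABBBCCC'
compose(bucket, ["Y1", "Y2", "Y3"], "X")
print(bucket["X"])    # b'AAAbbbCCC'
```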

Performing Object Composition with gsutil

gsutil supports object composition with the compose command. For details, please see its built-in documentation by running:

gsutil help compose

For example, to compose three objects (component-obj-1, component-obj-2, component-obj-3) into one object (composite-object), you can use the following command:

gsutil compose gs://example-bucket/component-obj-1 gs://example-bucket/component-obj-2 gs://example-bucket/component-obj-3 gs://example-bucket/composite-object

Or, if the objects you are composing are the only ones with the prefix component-obj-, then you can also use a wildcard in the compose command as shown in the following example:

gsutil compose gs://example-bucket/component-obj-* gs://example-bucket/composite-object

After the compose operation, you can check the component count with the following command:

gsutil stat gs://example-bucket/composite-object

In this example, the component count is 3.

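The following two commands copy composite-object to new-object and then move new-object back to the original name. Both commands use the -p option so that gsutil preserves object ACLs. The cp command also uses the -D option, which runs the copy in daisy chain mode (the data is downloaded and re-uploaded rather than copied within the cloud).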

gsutil cp -D -p gs://example-bucket/composite-object gs://example-bucket/new-object
gsutil mv -p gs://example-bucket/new-object gs://example-bucket/composite-object

Performing Object Composition with the XML API

With the XML API, you compose objects by issuing a PUT object request with the compose query parameter, and including an XML body listing the component object names in order as shown in the example below.

PUT /example-bucket/composite-object?compose HTTP/1.1
Content-Length: 153
Authorization: Bearer ya29.AHES6ZRVmB7fkLtd1XTmq6mo0S1wqZZi3-Lh_s-6Uw7p8vtgSwg

No bucket is specified for the component objects because, as noted earlier, the source and destination objects must all be under the same bucket.

The example request above also specifies a generation number for component-obj-2, so this request will compose generation 1361471441094000 of the object even if that generation is no longer current.

The third component was supplied with a conditional generation using the IfGenerationMatch request element; this will cause the request to fail if the given generation number doesn't represent the component's current generation.
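As a rough sketch of how such a request body could be assembled, the following uses Python's standard xml.etree.ElementTree module. The element names ComposeRequest, Component, Name, and Generation reflect the XML API compose schema as understood here and should be treated as assumptions; only IfGenerationMatch is named in the text above. The build_compose_request helper and its dictionary keys are hypothetical, and the precondition value for the third component is a placeholder.

```python
import xml.etree.ElementTree as ET


def build_compose_request(components):
    """Build an XML compose body from a list of component descriptions.

    Each description is a dict with a "name" key and, optionally,
    "generation" and/or "if_generation_match" keys (hypothetical names).
    """
    root = ET.Element("ComposeRequest")
    for spec in components:
        component = ET.SubElement(root, "Component")
        ET.SubElement(component, "Name").text = spec["name"]
        if "generation" in spec:
            # Compose this exact generation, even if it is no longer current.
            ET.SubElement(component, "Generation").text = str(spec["generation"])
        if "if_generation_match" in spec:
            # Fail the request unless this is the current generation.
            ET.SubElement(component, "IfGenerationMatch").text = str(
                spec["if_generation_match"]
            )
    return ET.tostring(root, encoding="unicode")


body = build_compose_request([
    {"name": "component-obj-1"},
    {"name": "component-obj-2", "generation": 1361471441094000},
    # Placeholder precondition value, chosen only for illustration.
    {"name": "component-obj-3", "if_generation_match": 1234567890},
])
print(body)
```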

The response to the above object composition request would look like:

Server: HTTP Upload Server Built on Mar 6 2013 16:24:27 (1362615867)
ETag: "-CKicn4fknbUCEAE="
x-goog-generation: 1362768951202000
x-goog-metageneration: 1
x-goog-hash: crc32c=fbWtZQ==
x-goog-component-count: 3
Vary: Origin
Date: Fri, 08 Mar 2013 18:55:51 GMT
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, must-revalidate
Content-Length: 0
Content-Type: text/html; charset=UTF-8

The x-goog-hash header reports the object's CRC32C value, which can be validated by building a CRC32C value from the CRC32C values of the components from which the object was composed.
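To compare the header value with a locally computed checksum, the value first has to be decoded. The sketch below assumes that the header carries the 32-bit CRC32C as base64-encoded bytes in big-endian order; the helper names are hypothetical.

```python
import base64
import struct


def decode_goog_crc32c(header_value):
    """Return the CRC32C integer from an x-goog-hash value like 'crc32c=fbWtZQ=='."""
    for part in header_value.split(","):
        key, _, value = part.strip().partition("=")
        if key == "crc32c":
            (crc,) = struct.unpack(">I", base64.b64decode(value))
            return crc
    raise ValueError("no crc32c entry in header value")


def encode_goog_crc32c(crc):
    """Encode a CRC32C integer the way it is assumed to appear in the header."""
    return base64.b64encode(struct.pack(">I", crc)).decode("ascii")


crc = decode_goog_crc32c("crc32c=fbWtZQ==")
print(hex(crc))                 # 0x7db5ad65
print(encode_goog_crc32c(crc))  # fbWtZQ==
```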

The component count of the new composite object is the value of the x-goog-component-count response header.

Performing Object Composition with the JSON API

With the JSON API you compose objects by issuing a compose request with a JSON body listing the component object names in order, as shown in the example below.

POST /storage/v1/b/example-bucket/o/composite-object/compose
Content-Length: 216
Content-Type: application/json
Authorization: Bearer ya29.AHES6ZRVmB7fkLtd1XTmq6mo0S1wqZZi3-Lh_s-6Uw7p8vtgSwg

{
  "sourceObjects": [
    { "name": "component-obj-1" },
    { "name": "component-obj-2" },
    { "name": "component-obj-3" }
  ],
  "destination": {
    "contentType": "application/octet-stream"
  }
}

The response to the above object composition request would include an object resource which includes the component count:

{
 "kind": "storage#object",
 "id": "bucket/composite-object/1388778813188000",
 "selfLink": "",
 "name": "composite-object",
 "bucket": "bucket",
 "generation": "1388778813188000",
 "metageneration": "1",
 "contentType": "application/octet-stream",
 "updated": "2014-01-03T19:53:33.188Z",
 "size": "524052",
 "mediaLink": "",
 "crc32c": "V9kcXg==",
 "componentCount": 3,
 "etag": "CKDP057k4rsCEAE="
}

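A JSON API compose body and the handling of its response can be sketched with Python's standard json module. The build_compose_body helper is hypothetical, and the response text below is an abbreviated sample modeled on the fields shown in this section, not real server output.

```python
import json


def build_compose_body(source_names, content_type):
    """Build a JSON compose request body listing the sources in order."""
    return json.dumps(
        {
            "sourceObjects": [{"name": name} for name in source_names],
            "destination": {"contentType": content_type},
        },
        indent=2,
    )


body = build_compose_body(
    ["component-obj-1", "component-obj-2", "component-obj-3"],
    "application/octet-stream",
)
print(body)

# Abbreviated sample of an object resource returned by a compose request.
response_text = """
{
  "kind": "storage#object",
  "name": "composite-object",
  "crc32c": "V9kcXg==",
  "componentCount": 3
}
"""
resource = json.loads(response_text)
print(resource["componentCount"])  # 3
```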