Composite Objects and Parallel Uploads

To support parallel uploads and limited append/edit functionality, Cloud Storage allows users to compose up to 32 existing objects into a new object without transferring additional object data.

Compose Operation

The compose operation creates a new object whose contents are the concatenation of a given sequence of up to 32 component objects under the same bucket. The components are unaffected by the process, and the resulting composite does not change if its components are replaced or deleted.

Composite objects may themselves be built from existing composites, provided that the total component count does not exceed 1024. Composition is also rate limited per project to approximately 200 components per second. This rate counts both the components appended to a composite object and the components copied when a composite object containing them is copied.

Component Count Property

Each object maintains a component count property, which specifies the number of originally uploaded objects from which it was created. Composing a sequence of objects creates an object whose component count is the sum of the component counts of the composite objects in the sequence, plus 1 for each non-composite object in the sequence. For example, if you compose three objects where the first two are non-composite objects and the third is a composite object with a component count of 12, the resulting object has a component count of 14 (1 + 1 + 12). Compose operations that would yield a component count exceeding 1024 are not allowed. To add more objects to a composite object whose component count is already 1024, you first need to copy the composite object to a new object by downloading and then uploading it. For an example, see Performing Object Composition with gsutil.
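
The counting rule can be sketched in a few lines of Python. The function name and constants below are illustrative and not part of any Cloud Storage library; a non-composite source is represented by a component count of 1.

```python
MAX_SOURCES_PER_COMPOSE = 32   # sources allowed in one compose request
MAX_COMPONENT_COUNT = 1024     # ceiling on the resulting component count


def composed_component_count(source_counts):
    """Return the component count that composing these sources would yield.

    Each entry is one source's component count; use 1 for a non-composite object.
    """
    if not 1 <= len(source_counts) <= MAX_SOURCES_PER_COMPOSE:
        raise ValueError("a compose request takes between 1 and 32 sources")
    total = sum(source_counts)
    if total > MAX_COMPONENT_COUNT:
        raise ValueError(f"component count {total} exceeds {MAX_COMPONENT_COUNT}")
    return total


# Two non-composite objects followed by a composite with a count of 12.
print(composed_component_count([1, 1, 12]))  # 14
```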

Integrity Checking Composite Objects

Cloud Storage uses CRC32C to check the integrity of each component object at upload time, and to let the caller check the integrity of the resulting composite object when it is downloaded. CRC32C is an error-detecting code with the useful property that the CRC32C of a composite object can be calculated efficiently from the CRC32C values of its components. Your application should use CRC32C as follows:

  • When uploading component objects, you should calculate the CRC32C for each object using a CRC32C library such as one of those listed below, and include that value in your request.
  • For the compose operation, you should include a CRC32C in the request. Cloud Storage will respond with the CRC32C of the composite object. Cloud Storage will not calculate MD5 values for composite objects.
  • At download time, you should calculate the CRC32C of the downloaded object, and compare that with the value included in the response.
  • If your application could change component objects between the time of uploading and composing those objects, you should specify generation-specific names for the source objects to avoid race conditions.

Libraries for computing CRC32C values include Boost for C++, GoogleCloudPlatform crc32c for Java, crcmod for Python, and digest-crc for Ruby. Note also that CRC32C is supported in hardware in current Intel CPUs.
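
To make the "calculated from the CRC32C values of its components" property concrete, here is a minimal pure-Python sketch. It is for illustration only: a library such as crcmod will be much faster, and the combine step below runs in time proportional to the second part's length, whereas production libraries typically use a logarithmic-time method. All function names are hypothetical, and the base64 helper assumes the 4-byte big-endian encoding that appears in values such as crc32c=fbWtZQ==.

```python
import base64

_POLY = 0x82F63B78  # CRC32C (Castagnoli) polynomial, bit-reflected form


def _make_table():
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data):
    """CRC32C of a bytes object (initial value and final XOR of 0xFFFFFFFF)."""
    state = 0xFFFFFFFF
    for byte in data:
        state = _TABLE[(state ^ byte) & 0xFF] ^ (state >> 8)
    return state ^ 0xFFFFFFFF


def crc32c_combine(crc_a, crc_b, len_b):
    """CRC32C of A + B, given only crc32c(A), crc32c(B), and len(B)."""
    state = crc_a
    for _ in range(len_b):  # advance crc_a past len_b zero bytes
        state = _TABLE[state & 0xFF] ^ (state >> 8)
    return state ^ crc_b


def crc32c_base64(crc):
    """Encode a CRC32C as base64 of its 4 big-endian bytes (assumed wire format)."""
    return base64.b64encode(crc.to_bytes(4, "big")).decode("ascii")


part_a, part_b = b"hello ", b"world"
combined = crc32c_combine(crc32c(part_a), crc32c(part_b), len(part_b))
assert combined == crc32c(part_a + part_b)
```

Folding crc32c_combine over the components in order gives the value to compare against the CRC32C that Cloud Storage reports for the composite.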

In the past, Cloud Storage used MD5 to construct the ETag value. This is not true for composite objects; client code should make no assumptions about composite object ETags except that they will change whenever the underlying object changes per the IETF specification for HTTP/1.1.

Parallel Uploads

Object composition can be used for uploading an object in parallel: simply divide your data into multiple chunks, upload each chunk to a distinct object in parallel, compose your final object, and delete any temporary objects.

In order to protect against changes to component objects between the upload and compose requests, users should provide an expected generation number for each component. For more information about object generations, see Generations and Preconditions.
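
Because a single compose request accepts at most 32 sources, uploading more than 32 chunks means composing in rounds. The following sketch plans those rounds without calling any API; the function names and the temporary object naming scheme are hypothetical, and the upload and compose calls themselves are left to whichever client you use.

```python
MAX_SOURCES_PER_COMPOSE = 32
MAX_COMPONENT_COUNT = 1024


def split_into_chunks(data, chunk_size):
    """Split bytes into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def plan_compose_steps(chunk_names, final_name):
    """Return ordered (sources, destination) compose steps for the chunks."""
    if not chunk_names:
        raise ValueError("at least one chunk is required")
    if len(chunk_names) > MAX_COMPONENT_COUNT:
        raise ValueError("too many chunks for a single composite object")
    steps = []
    level = list(chunk_names)
    round_number = 0
    while len(level) > MAX_SOURCES_PER_COMPOSE:
        next_level = []
        for start in range(0, len(level), MAX_SOURCES_PER_COMPOSE):
            group = level[start:start + MAX_SOURCES_PER_COMPOSE]
            if len(group) == 1:
                next_level.append(group[0])  # nothing to compose yet
                continue
            # Hypothetical naming scheme for intermediate composites.
            temp_name = f"{final_name}.part-{round_number}-{start // MAX_SOURCES_PER_COMPOSE}"
            steps.append((group, temp_name))
            next_level.append(temp_name)
        level = next_level
        round_number += 1
    steps.append((level, final_name))
    return steps


steps = plan_compose_steps([f"chunk-{i}" for i in range(100)], "final-object")
print(len(steps))  # 5
```

The intermediate composites are temporary objects too, so they should be deleted along with the chunks once the final object exists.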

Limited Append and Edit

You can also use the compose operation to accomplish limited object appends and edits.

Append is accomplished by uploading data to a temporary new object, composing the object you wish to append along with this new data, and deleting the temporary object. This functionality is limited by the Component Count Property described above.
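
The append sequence might look like the sketch below. It assumes a bucket object with the interface of the google-cloud-storage Python client (bucket.blob, blob.upload_from_string, blob.compose, blob.delete); the function name and the temporary object name are hypothetical. No generation preconditions are set here, so a real application that can race with other writers would want to add them.

```python
def append_via_compose(bucket, target_name, new_data, temp_name):
    """Append new_data to an existing object by composing it with a temp object."""
    target = bucket.blob(target_name)
    temp = bucket.blob(temp_name)

    # 1. Upload the data to append as a temporary object.
    temp.upload_from_string(new_data)
    try:
        # 2. Compose the existing object followed by the new data, writing the
        #    result back to the existing object's name.
        target.compose([target, temp])
    finally:
        # 3. Remove the temporary object whether or not the compose succeeded.
        temp.delete()
```

Each append adds at least 1 to the target's component count, which is why this approach eventually runs into the 1024 limit.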

You can also use composition to support a basic flavor of object editing. For example, you could compose an object X from the sequence {Y1, Y2, Y3}, replace the contents of Y2, and recompose X from those same components. Note that this requires that Y1, Y2, and Y3 be left undeleted, so you will be billed for those components as well as for the composite.

Performing Object Composition with gsutil

gsutil supports object composition with the compose command. For details, please see its built-in documentation by running:

gsutil help compose

For example, to compose three objects (component-obj-1, component-obj-2, component-obj-3) into one object (composite-object), you can use the following command:

gsutil compose gs://example-bucket/component-obj-1 gs://example-bucket/component-obj-2 gs://example-bucket/component-obj-3 gs://example-bucket/composite-object

Or, if the objects you are composing are the only ones with the prefix component-obj-, then you can also use a wildcard in the compose command as shown in the following example:

gsutil compose gs://example-bucket/component-obj-* gs://example-bucket/composite-object

After the compose operation, you can check the component count with the following command:

gsutil stat gs://example-bucket/composite-object

In this example, the component count is 3. However, if you had reached the component count of 1024 and you wanted to continue composing objects, you would need to copy the composite object to a new object by downloading and then uploading the object. You can do this using the "daisy chain" (-D) option of the cp command. If you omit the -D option, gsutil will copy the composite object "in the cloud," its component count will not change, and you will not be able to continue appending objects to it.

The following two commands copy composite-object to new-object and then move new-object back to use the original name. Both commands use the -p option of the cp command so that gsutil preserves object ACLs.

gsutil cp -D -p gs://example-bucket/composite-object gs://example-bucket/new-object
gsutil mv -p gs://example-bucket/new-object gs://example-bucket/composite-object

Performing Object Composition with the XML API

With the XML API, you compose objects by issuing a PUT object request with the compose query parameter, and including an XML body listing the component object names in order as shown in the example below.

PUT /example-bucket/composite-object?compose HTTP/1.1
Content-Length: 153
Authorization: Bearer ya29.AHES6ZRVmB7fkLtd1XTmq6mo0S1wqZZi3-Lh_s-6Uw7p8vtgSwg

No bucket is specified for the component objects because, as noted earlier, the source and destination objects must all be under the same bucket.

The example request above also specifies a generation number for component-obj-2, so this request will compose generation 1361471441094000 of the object even if that generation is no longer current.

The third component was supplied with a conditional generation using the IfGenerationMatch request element; this will cause the request to fail if the given generation number doesn't represent the component's current generation.
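
As a rough illustration of how such a request body could be assembled, the sketch below builds the XML with Python's standard library. The element names ComposeRequest, Component, Name, and Generation are assumptions about the XML API schema (only IfGenerationMatch is named in the text above), the function name is hypothetical, and the IfGenerationMatch value is a placeholder rather than the value from the original example.

```python
import xml.etree.ElementTree as ET


def build_compose_request(components):
    """Build a compose request body from a list of component descriptions.

    Each component is a dict with a "name" key and, optionally, "generation"
    and "if_generation_match" keys.
    """
    root = ET.Element("ComposeRequest")
    for component in components:
        node = ET.SubElement(root, "Component")
        ET.SubElement(node, "Name").text = component["name"]
        if "generation" in component:
            ET.SubElement(node, "Generation").text = str(component["generation"])
        if "if_generation_match" in component:
            ET.SubElement(node, "IfGenerationMatch").text = str(
                component["if_generation_match"]
            )
    return ET.tostring(root, encoding="unicode")


body = build_compose_request([
    {"name": "component-obj-1"},
    {"name": "component-obj-2", "generation": 1361471441094000},
    {"name": "component-obj-3", "if_generation_match": 1234567890},  # placeholder
])
print(body)
```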

The response to the above object composition request would look like:

Server: HTTP Upload Server Built on Mar 6 2013 16:24:27 (1362615867)
ETag: "-CKicn4fknbUCEAE="
x-goog-generation: 1362768951202000
x-goog-metageneration: 1
x-goog-hash: crc32c=fbWtZQ==
x-goog-component-count: 3
Vary: Origin
Date: Fri, 08 Mar 2013 18:55:51 GMT
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, must-revalidate
Content-Length: 0
Content-Type: text/html; charset=UTF-8

The x-goog-hash header reports the object's CRC32C value, which you can validate by combining the CRC32C values of the components from which the object was composed.
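
To compare the reported value with one you computed, you first need it as an integer. The sketch below decodes the header, assuming the value is the base64 encoding of the checksum's 4 bytes in big-endian order and that multiple hashes, if present, are comma-separated; both function names are hypothetical.

```python
import base64


def parse_goog_hash(header_value):
    """Return {algorithm: raw_bytes} parsed from an x-goog-hash header value."""
    hashes = {}
    for item in header_value.split(","):
        name, _, encoded = item.strip().partition("=")
        hashes[name] = base64.b64decode(encoded)
    return hashes


def crc32c_from_header(header_value):
    """Return the CRC32C in the header as an unsigned 32-bit integer."""
    return int.from_bytes(parse_goog_hash(header_value)["crc32c"], "big")


print(hex(crc32c_from_header("crc32c=fbWtZQ==")))  # 0x7db5ad65
```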

The component count of the new composite object is the value of the x-goog-component-count response header.

Performing Object Composition with the JSON API

With the JSON API you compose objects by issuing a compose request with a JSON body listing the component object names in order, as shown in the example below.

POST /storage/v1/b/example-bucket/o/composite-object/compose
Content-Length: 216
Content-Type: application/json
Authorization: Bearer ya29.AHES6ZRVmB7fkLtd1XTmq6mo0S1wqZZi3-Lh_s-6Uw7p8vtgSwg

{
  "sourceObjects": [
    { "name": "component-obj-1" },
    { "name": "component-obj-2" },
    { "name": "component-obj-3" }
  ],
  "destination": {
    "contentType": "application/octet-stream"
  }
}

The response to the above object composition request contains an object resource, which includes the component count:

{
 "kind": "storage#object",
 "id": "bucket/composite-object/1388778813188000",
 "selfLink": "",
 "name": "composite-object",
 "bucket": "bucket",
 "generation": "1388778813188000",
 "metageneration": "1",
 "contentType": "application/octet-stream",
 "updated": "2014-01-03T19:53:33.188Z",
 "size": "524052",
 "mediaLink": "",
 "crc32c": "V9kcXg==",
 "componentCount": 3,
 "etag": "CKDP057k4rsCEAE="
}

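The request body and the component count lookup can be sketched with Python's json module. The function names are hypothetical; the field names are the ones that appear in the example request and response above.

```python
import json


def build_compose_body(source_names, content_type):
    """Return the JSON body for a compose request over the named sources."""
    return json.dumps({
        "sourceObjects": [{"name": name} for name in source_names],
        "destination": {"contentType": content_type},
    })


def component_count(response_text):
    """Read componentCount from the object resource returned by compose."""
    return int(json.loads(response_text)["componentCount"])


body = build_compose_body(
    ["component-obj-1", "component-obj-2", "component-obj-3"],
    "application/octet-stream",
)
print(body)
```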