Object generation numbers enable users to uniquely identify data resources and apply preconditions to guarantee atomicity of their multi-step transactions.
Generations
Even without Object Versioning enabled, all Cloud Storage objects have generation numbers and metageneration numbers. The generation number changes each time the object is replaced, and the metageneration number changes each time the object's metadata is updated.
Since object metageneration numbers reset to one for each new object generation, they are meaningful only when paired with a generation number.
Buckets also maintain a metageneration number enabling users to uniquely identify a bucket metadata state. Since buckets have no payload data, and thus no generation numbers, their metageneration numbers are meaningful on their own.
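The relationship between the two numbers can be illustrated with a small in-memory model. This is only a sketch: `ToyObject`, `replace_content`, and `update_metadata` are hypothetical names, the counter starting at 1001 stands in for the service-assigned generation numbers, and the model assumes the metageneration grows by one per update, whereas the text above only states that it changes.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical stand-in for service-assigned generation numbers.
_generation_source = itertools.count(1001)


@dataclass
class ToyObject:
    """Illustrative model of one object, not a Cloud Storage client."""
    content: bytes
    generation: int
    metageneration: int = 1
    metadata: dict = field(default_factory=dict)


def replace_content(content: bytes) -> ToyObject:
    # Replacing the object produces a new generation, and the
    # metageneration starts again at 1.
    return ToyObject(content, next(_generation_source))


def update_metadata(obj: ToyObject, **changes) -> ToyObject:
    # A metadata update keeps the generation and changes the metageneration
    # (assumed here to increase by one).
    return ToyObject(obj.content, obj.generation, obj.metageneration + 1,
                     {**obj.metadata, **changes})


first = replace_content(b"v1")
tagged = update_metadata(first, contentType="text/plain")
second = replace_content(b"v2")

for obj in (first, tagged, second):
    print(obj.generation, obj.metageneration)
```

In this model, a metageneration of 1 appears under two different generations, which is why the pair, not the metageneration alone, identifies an object state.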
Example: parallel upload
In parallel uploads, you divide an object into multiple pieces, upload the pieces to a temporary location simultaneously, and compose the original object from these temporary pieces. If an independent process inadvertently uses the same name as one or more of the temporary pieces you've uploaded, then when you attempt to compose the object, incorrect components get used and the object becomes corrupted.
By using generation numbers, you prevent this corruption from happening. If you include the generation number of each uploaded piece when you make your compose request, either the compose occurs with the correct pieces or the request fails with a 404 Not Found response.
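The effect of pinning each piece to its generation can be sketched with an in-memory stand-in for the bucket. The `compose` function, the `NotFound` exception, the piece names, and the small generation numbers are all hypothetical; the sketch only models the outcome described above and does not call Cloud Storage.

```python
class NotFound(Exception):
    """Stands in for a 404 Not Found response in this toy model."""


def compose(store, pieces):
    """Concatenate pieces, each pinned to the generation recorded at upload.

    store maps name -> (generation, data); pieces lists (name, generation).
    """
    parts = []
    for name, generation in pieces:
        live = store.get(name)
        if live is None or live[0] != generation:
            # The pinned generation is no longer the live one.
            raise NotFound(f"{name}#{generation}")
        parts.append(live[1])
    return b"".join(parts)


store = {"tmp/part-0": (11, b"hello "), "tmp/part-1": (12, b"world")}
pinned = [("tmp/part-0", 11), ("tmp/part-1", 12)]
composed = compose(store, pinned)

# An unrelated process reuses a temporary name, creating a new generation.
store["tmp/part-1"] = (13, b"oops")
try:
    compose(store, pinned)
    outcome = "composed"
except NotFound:
    outcome = "rejected"

print(composed, outcome)
```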
Preconditions
Preconditions tell Cloud Storage to only perform a request if the generation or metageneration number of the affected object meets your precondition criteria. These checks of the generation and metageneration numbers ensure that the object is in the expected state, allowing you to perform safe read-modify-write updates and conditional operations on objects.
When a match precondition uses a specific generation or metageneration number, the Cloud Storage object to which the request applies must have the same generation or metageneration number. If it does, the request succeeds. If it does not, the request fails and a 412 Precondition Failed response is returned.
When a match precondition uses the value 0 instead of a generation number, the request only succeeds if there are no live objects in the Cloud Storage bucket with the name specified in the request. If there is such an object, the request fails and a 412 Precondition Failed response is returned.
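The two match rules above can be written down as a small decision function. This is a sketch of the described semantics, not service code: `evaluate_generation_match` is a hypothetical name, and `None` is used here to mean that no live object has the requested name. How the service reports a missing object for a nonzero expectation is not covered by the text; the sketch simply treats it as a failed precondition.

```python
from typing import Optional

OK = 200
PRECONDITION_FAILED = 412


def evaluate_generation_match(live_generation: Optional[int], expected: int) -> int:
    """Return the status a generation-match precondition would lead to.

    live_generation is None when no live object has the requested name.
    """
    if expected == 0:
        # The value 0 means: proceed only if no live object exists.
        return OK if live_generation is None else PRECONDITION_FAILED
    # Any other value must equal the live object's generation.
    return OK if live_generation == expected else PRECONDITION_FAILED


print(evaluate_generation_match(1122334455667788, 1122334455667788))
print(evaluate_generation_match(1122334455667788, 42))
print(evaluate_generation_match(None, 0))
print(evaluate_generation_match(1122334455667788, 0))
```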
Preconditions are often used in mutating requests — uploads, deletes, copies, or metadata updates — to prevent race conditions. Race conditions can arise when the same request is sent repeatedly or when independent processes interfere with each other. For example, multiple request retries after a network interruption, or users performing a read-modify-write operation on the same object can create race conditions.
In addition to preconditions that use generation and metageneration numbers, there are also preconditions available that use ETags. XML API ETags for non-composite objects change only when content changes while ETags for composite objects and JSON API resources change whenever the content or metadata changes. For more information about ETags, see Hashes and ETags: Best Practices.
Cost of preconditions
Preconditions come at a performance and billing cost: for each mutating operation, you also issue a billable GET metadata request to determine the generation/metageneration number of the object. As a performance consideration, preconditions can potentially double the network portion of the overall operation latency by adding an extra round trip, which may be an important factor in latency-sensitive operations. As a pricing consideration, the GET metadata request that allows you to use preconditions is billed at a rate of $0.004 per 10,000 operations.
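As a rough illustration of the billing side, the added cost scales linearly with the number of mutations that each need one extra metadata GET. The function name below is hypothetical, and the rate is simply the figure quoted above, which may differ from current pricing.

```python
from decimal import Decimal

# Rate quoted in the text above: $0.004 per 10,000 GET metadata operations.
RATE_PER_10K_GETS = Decimal("0.004")


def precondition_lookup_cost(mutations: int) -> Decimal:
    """Cost in dollars of one extra metadata GET per mutating operation."""
    return RATE_PER_10K_GETS * mutations / 10_000


print(precondition_lookup_cost(1_000_000))
```

`Decimal` is used instead of `float` so that the small per-operation rate does not pick up binary rounding error.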
Depending on your application, there are ways to avoid performance and billing costs associated with using preconditions, such as:
- Storing the generation and metageneration numbers of your objects locally so that you already know the correct numbers to use in your precondition.
- Using a naming scheme that avoids more than one mutation of the same object name so that you don't need to use preconditions.
- Having application knowledge of which objects are newly created, so you already know when to use the if-generation-match:0 precondition.
- Remembering the results of GET calls performed prior to mutations.
Preconditions in the XML API
In the XML API, generation and metageneration numbers are exposed via the x-goog-generation and x-goog-metageneration response headers. These headers are returned in the response of a HEAD request for an object.
See the HTTP Headers Reference for a complete listing of precondition request headers that you can use in order to make the request conditional on the state of the requested object. For example, you can:
- Use the x-goog-if-generation-match precondition to execute a request only if the generation number in the precondition matches the generation number of the requested object. If you use 0 instead of a generation number, the request only succeeds if there is no live object in your bucket matching the object named in the request.
- Use the x-goog-if-metageneration-match precondition to execute a request only if the metageneration number in the header matches the metageneration number of the requested object.
- Use the If-Modified-Since precondition with GET or HEAD requests. These requests execute only if the time of creation for the most recent generation of the object (that is, the last modification of the object) occurred more recently than the time specified in the precondition.
- Use ETags and the If-Match or If-None-Match preconditions with GET or HEAD requests. These requests execute only if the requested object does or doesn't match the ETag specified in the precondition.
- Use multiple preconditions in the same request. For example, if you are using x-goog-if-generation-match, you can also use x-goog-if-metageneration-match.
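A client might assemble these request headers with a small helper such as the one below. This is a sketch: `xml_precondition_headers` is a hypothetical function, only the two header names come from the list above, and sending the request is left out.

```python
from typing import Dict, Optional


def xml_precondition_headers(generation: Optional[int] = None,
                             metageneration: Optional[int] = None) -> Dict[str, str]:
    """Build XML API precondition headers; omitted values add no header."""
    headers: Dict[str, str] = {}
    if generation is not None:
        # 0 is a valid value here: it asks for "no live object with this name".
        headers["x-goog-if-generation-match"] = str(generation)
    if metageneration is not None:
        headers["x-goog-if-metageneration-match"] = str(metageneration)
    return headers


print(xml_precondition_headers(generation=1122334455667788, metageneration=5))
print(xml_precondition_headers(generation=0))
```

The explicit `is not None` checks matter because a plain truthiness test would silently drop the value 0.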
Preconditions in the JSON API
In the JSON API, you can obtain the generation and metageneration numbers via the generation and metageneration properties of a response containing an object or bucket resource. An object or bucket resource is returned in the response body of a GET request for the object or for the bucket.
The following examples show how to retrieve an object's information using JSON API requests with query parameters acting as preconditions. You can use the ifGenerationMatch, ifGenerationNotMatch, ifMetagenerationMatch, and ifMetagenerationNotMatch query parameters for operations such as compose, insert, or rewrite. For more information, see the JSON API Objects reference page.
ifGenerationMatch
This JSON API request uses the ifGenerationMatch precondition. The API only completes this request for the object with the generation number you provided:
1. Get an authorization access token from the OAuth 2.0 Playground. Configure the playground to use your own OAuth credentials.
2. Use cURL to call the JSON API with a GET Object request:

   curl \
     'https://storage.googleapis.com/storage/v1/b/BUCKET_NAME/o/OBJECT_NAME?ifGenerationMatch=GENERATION' \
     --header 'Authorization: Bearer OAUTH2_TOKEN' \
     --header 'Accept: application/json' \
     --compressed

Where:
- BUCKET_NAME is the name of the bucket where the object is located. For example, my-bucket.
- OBJECT_NAME is the name of the object for which you are retrieving information. For example, dog.png.
- OAUTH2_TOKEN is the access token you generated in Step 1.
- GENERATION is the generation number of the object for which you are retrieving information. For example, 1122334455667788.

If no object is found matching the given precondition, in this case the generation number, the response body contains the following error message:

   "message": "Precondition Failed"
ifMetagenerationNotMatch
Here is a request that uses the ifMetagenerationNotMatch precondition, a query parameter that makes the success of the request dependent on an object's metageneration number being something other than the number specified in the precondition. This parameter lets you exclude a specific version of the object from the query:
1. Get an authorization access token from the OAuth 2.0 Playground. Configure the playground to use your own OAuth credentials.
2. Use cURL to call the JSON API with a GET Object request:

   curl \
     'https://storage.googleapis.com/storage/v1/b/BUCKET_NAME/o/OBJECT_NAME?ifMetagenerationNotMatch=METAGENERATION' \
     --header 'Authorization: Bearer OAUTH2_TOKEN' \
     --header 'Accept: application/json' \
     --compressed

Where:
- BUCKET_NAME is the name of the bucket where the object is located. For example, my-bucket.
- OBJECT_NAME is the name of the object for which you are retrieving information. For example, dog.png.
- OAUTH2_TOKEN is the access token you generated in Step 1.
- METAGENERATION is the metageneration number that you are excluding from your search. For example, 5.

If the object's metageneration number equals the number you specified, no object is returned.
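The request URLs used in both examples can also be built programmatically. The sketch below uses only the Python standard library; `object_url` is a hypothetical helper, and it percent-encodes the bucket and object names so that names containing characters such as `/` remain a single path segment.

```python
from urllib.parse import quote, urlencode

BASE = "https://storage.googleapis.com/storage/v1"


def object_url(bucket: str, name: str, **preconditions) -> str:
    """Build a JSON API object URL with optional precondition query parameters."""
    path = f"{BASE}/b/{quote(bucket, safe='')}/o/{quote(name, safe='')}"
    # Drop parameters that were not supplied; keep 0, which is meaningful.
    query = urlencode({k: v for k, v in preconditions.items() if v is not None})
    return f"{path}?{query}" if query else path


print(object_url("my-bucket", "dog.png", ifGenerationMatch=1122334455667788))
print(object_url("my-bucket", "photos/dog.png", ifMetagenerationNotMatch=5))
```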
Limitations
Note that generation and metageneration preconditions are not accepted for ACL operations; use the access-control entry resource ETag instead. This can be found inside each access-control entry resource, which is also accessible from the containing object or bucket resource.
HTTP 1.1 ETags
The JSON API also supports HTTP 1.1 ETags and the corresponding HTTP If-Match and If-None-Match headers for all resources, including buckets, objects, and ACLs. An ETag is returned as part of the response header whenever a resource is returned, as well as included in the resource itself.
Use ETags and the If-Match or If-None-Match preconditions as headers in requests. These requests execute only if the requested object does or doesn't match the ETag specified in the precondition.
Examples of race conditions
The following section explores race conditions you may need to consider.
Simultaneous read-modify-write
A common pattern for updating bucket or object metadata involves reading the current state, applying modifications locally, and sending the modified metadata back to Cloud Storage for writing. This can be precarious if two or more independent processes attempt the sequence simultaneously.
Consider the following case: you want to add an ACL entry for a collaborator so they can access your bucket. At the same time, a co-worker wants to remove a separate collaborator who no longer needs access to the bucket.
To do this, both you and your co-worker read the same initial state for the bucket's metadata, and you each make your desired modification to the ACL entries of the bucket. When you write your modifications back to Cloud Storage, the metadata is updated correctly. Unfortunately, your changes are lost as soon as your co-worker uploads his modifications, because he had no way to take into account your update. As a result, the ACL entry for your collaborator is lost, she won't be able to access your bucket, and no one is aware of what happened (without going back and looking at the ACL entries).
Preventing the race condition
You and your co-worker can prevent this race condition by adding an if-metageneration-match precondition to each of your write operations. In the precondition, you both use the metageneration number of the bucket, which is part of the metadata you received in the initial read operation.
When your modifications add the ACL entry, the bucket's metageneration number changes. Now that preconditions are being used, when your co-worker attempts to write his version of the ACL entries, the metageneration number of the bucket does not match the number in the precondition, and he is informed of the failed update with a response code 412 Precondition Failed. Having received this response code, your co-worker can react accordingly, such as by performing a new read-modify-write cycle using the updated metadata.
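The read-modify-write cycle with a metageneration precondition can be sketched against an in-memory bucket. Everything here is a toy model under stated assumptions: `ToyBucket`, `PreconditionFailed`, `modify_acl`, and the entry names `alice` and `bob` are hypothetical, and the metageneration is assumed to increase by one on each successful write.

```python
class PreconditionFailed(Exception):
    """Stands in for a 412 Precondition Failed response."""


class ToyBucket:
    """In-memory stand-in for a bucket's ACL entries and metageneration."""

    def __init__(self, acl):
        self.acl = list(acl)
        self.metageneration = 1

    def read(self):
        return list(self.acl), self.metageneration

    def write_acl(self, acl, if_metageneration_match):
        if if_metageneration_match != self.metageneration:
            raise PreconditionFailed()
        self.acl = list(acl)
        self.metageneration += 1


def modify_acl(bucket, change, max_attempts=3):
    """Repeat the read-modify-write cycle until the precondition holds."""
    for _ in range(max_attempts):
        acl, metageneration = bucket.read()
        try:
            bucket.write_acl(change(acl), metageneration)
            return
        except PreconditionFailed:
            continue  # Someone else wrote first; re-read and try again.
    raise RuntimeError("gave up after repeated precondition failures")


def remove_bob(acl):
    return [entry for entry in acl if entry != "bob"]


bucket = ToyBucket(["owner", "bob"])
your_acl, your_meta = bucket.read()
their_acl, their_meta = bucket.read()  # Co-worker reads the same state.

bucket.write_acl(your_acl + ["alice"], your_meta)  # Your write lands first.
try:
    bucket.write_acl(remove_bob(their_acl), their_meta)
    stale_write = "applied"
except PreconditionFailed:
    stale_write = "rejected"
    modify_acl(bucket, remove_bob)  # Fresh read-modify-write cycle.

print(stale_write, bucket.acl)
```

Without the precondition, the co-worker's stale write would have replaced the ACL and silently dropped the `alice` entry.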
Multiple request retries
Cloud Storage is a distributed system. Because requests can fail due to network or service conditions, Google recommends that you retry failures with exponential backoff. However, due to the nature of distributed systems, sometimes these retries can cause surprising behavior.
Consider the following case: you want to delete a file, file.txt, stored in Cloud Storage. Afterward, you want to add a new file with the same name to Cloud Storage.
To accomplish this, you issue a delete request to delete the object. However, a network condition — such as an intermediate router temporarily losing connectivity — prevents the request from reaching Cloud Storage, and you don't receive a response.
Because you didn't receive a response to the first request, you issue a second delete request for the object, which is successful, and you receive a response confirming the deletion. A minute later, you decide to upload a new file.txt, and your upload is successful.
A race condition arises if the router that lost connectivity subsequently regains it and sends your original, seemingly lost, delete request onward to Cloud Storage. When the request arrives at Cloud Storage, it succeeds because there is a new file.txt present. Cloud Storage sends a response that you do not receive because your client stopped listening for it. Not only does the new file get deleted, contrary to your intentions, but also you are not aware the second deletion occurred.
The following diagram shows what happened:
Preventing the race condition
To prevent the above situation from occurring, you should begin by getting the metadata for file.txt in order to determine its current generation. You then send the delete request with an if-generation-match precondition that uses the generation number. Using the precondition ensures that only the object with that specific generation number gets deleted, regardless of when the delete request reaches Cloud Storage or how many times the delete request with the precondition is sent. With the if-generation-match precondition, any unintended attempts to mutate a different generation of file.txt fail with the response code 412 Precondition Failed.
Since similar network interruptions could cause race conditions for the upload request that followed your delete request, you can avoid many of these with an if-generation-match:0 precondition applied to the upload request. Using this precondition ensures that retries of the upload don't accidentally write the object twice, because the precondition allows the request to succeed only if there aren't any current generations of the object.
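The delayed-delete scenario can be replayed against an in-memory store to see both preconditions at work. This is a toy model, not a client: `ToyStore` is hypothetical, generations are small counters, and the 200, 204, and 404 status codes are illustrative choices for the model (only 412 is taken from the text above).

```python
class ToyStore:
    """In-memory stand-in mapping object names to live generations."""

    def __init__(self):
        self._next_generation = 1
        self.live = {}

    def upload(self, name, if_generation_match=None):
        current = self.live.get(name)
        if if_generation_match is not None:
            # 0 means "only if no live object exists".
            expected = None if if_generation_match == 0 else if_generation_match
            if current != expected:
                return 412
        self.live[name] = self._next_generation
        self._next_generation += 1
        return 200

    def delete(self, name, if_generation_match=None):
        current = self.live.get(name)
        if current is None:
            return 404
        if if_generation_match is not None and current != if_generation_match:
            return 412
        del self.live[name]
        return 204


store = ToyStore()
store.upload("file.txt")
old_generation = store.live["file.txt"]  # Learned from a metadata read.

# The first delete is held up in the network; the retry arrives first.
retry_delete = store.delete("file.txt", if_generation_match=old_generation)
new_upload = store.upload("file.txt", if_generation_match=0)

# The original delete finally arrives, still pinned to the old generation.
late_delete = store.delete("file.txt", if_generation_match=old_generation)
# A duplicated upload retry cannot overwrite the object either.
duplicate_upload = store.upload("file.txt", if_generation_match=0)

print(retry_delete, new_upload, late_delete, duplicate_upload, store.live)
```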
With these preconditions in place, you protect your data from accidentally being lost when performing the delete and upload requests. This can be seen in the following diagram:
Limitations of if-generation-match:0
if-generation-match:0 cannot prevent object creation from occurring twice if the first object is deleted, because the absence of the object is not uniquely identifiable. Consider the following case, in which no data is lost, but you end up with a file that you didn't expect:
1. You begin with a GET request for the metadata of file.txt to find its generation number. In the response, you find that file.txt does not exist.
2. Knowing this, you make a request to upload file.txt with the if-generation-match:0 precondition, but the request times out when an intermediate router temporarily loses connectivity.
3. Having failed the first time, you retry your upload request, again with the if-generation-match:0 precondition. This time the request succeeds.
4. Soon after, you send a request to delete file.txt, which succeeds.
5. If the router that lost connectivity now regains connectivity and sends your first upload request on to Cloud Storage, the precondition that went with the request is still a match, so file.txt is recreated. With or without the precondition, file.txt is unexpectedly uploaded a second time.
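The sequence above can be replayed in a few lines to show why the precondition cannot help here. The model is deliberately minimal and hypothetical: a set of live names stands in for the bucket, and `upload_if_absent` mimics an upload carrying if-generation-match:0.

```python
def upload_if_absent(live_names: set, name: str) -> int:
    """Toy upload with if-generation-match:0: succeed only if name is absent."""
    if name in live_names:
        return 412
    live_names.add(name)
    return 200


live = set()

# The first upload is stuck in the network, so the retry arrives first.
retry_upload = upload_if_absent(live, "file.txt")

# You delete the file; the bucket looks exactly as it did before any upload.
live.discard("file.txt")

# The delayed first upload now arrives and its precondition still matches.
late_upload = upload_if_absent(live, "file.txt")

print(retry_upload, late_upload, sorted(live))
```

Because "no live object" looks the same before the first upload and after the delete, the late request has nothing to distinguish the two states.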