Truncated exponential backoff is a standard error-handling strategy for network applications in which a client periodically retries a failed request with increasing delays between requests. Clients should use truncated exponential backoff for all requests to Google Cloud Storage that return HTTP 5xx and 429 response codes, including uploads and downloads of data or metadata.
Understanding how truncated exponential backoff works is important if you are:

- Building client applications that use the Google Cloud Storage XML API or JSON API directly.
- Accessing Google Cloud Storage through a client library. Note that some client libraries, such as the Cloud Storage Client Library for Node.js, have built-in exponential backoff.
- Using the gsutil command line tool, which has configurable retry handling.
If you are using the Google Cloud Platform Console, the console sends requests to Google Cloud Storage on your behalf and will handle any necessary backoff.
Example algorithm
An exponential backoff algorithm retries requests exponentially, increasing the waiting time between retries up to a maximum backoff time. An example is:
- Make a request to Google Cloud Storage.
- If the request fails, wait 1 + random_number_milliseconds seconds and retry the request.
- If the request fails, wait 2 + random_number_milliseconds seconds and retry the request.
- If the request fails, wait 4 + random_number_milliseconds seconds and retry the request.
- And so on, up to a maximum_backoff time.
- Continue waiting and retrying up to some maximum number of retries, but do not increase the wait period between retries.
where:

- The wait time is min((2^n) + random_number_milliseconds, maximum_backoff), with n incremented by 1 for each iteration (request).
- random_number_milliseconds is a random number of milliseconds less than or equal to 1000. This helps to avoid cases where many clients become synchronized by some situation and all retry at once, sending requests in synchronized waves. The value of random_number_milliseconds is recalculated after each retry request.
- maximum_backoff is typically 32 or 64 seconds. The appropriate value depends on the use case.
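The wait-time formula above can be expressed as a small helper function. This is a minimal sketch, not part of any Google library; the function name and the use of random.randint to produce the jitter are illustrative assumptions.

```python
import random

def backoff_wait_seconds(n, maximum_backoff=64):
    """Wait time before retry n: min((2^n) + random_number_milliseconds, maximum_backoff).

    The jitter is a fresh random value of at most 1000 milliseconds,
    recalculated on every call, as the definitions above require.
    """
    random_number_milliseconds = random.randint(0, 1000) / 1000.0
    return min((2 ** n) + random_number_milliseconds, maximum_backoff)
```

For n = 0, 1, 2, ... this yields waits of roughly 1, 2, 4, 8, ... seconds (plus jitter), truncated at maximum_backoff once 2^n exceeds it.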
It's okay to continue retrying once you reach the maximum_backoff time. Retries after this point do not need to continue increasing the backoff time. For example, if a client uses a maximum_backoff time of 64 seconds, then after reaching this value, the client can retry every 64 seconds. At some point, clients should be prevented from retrying indefinitely.
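Putting these pieces together, a complete retry loop might look like the following sketch. The make_request callable, the status-code check, and the retry limits are assumptions for illustration, not an official implementation.

```python
import random
import time

MAXIMUM_BACKOFF = 64  # seconds; cap on the exponential wait
MAX_RETRIES = 10      # keeps the client from retrying indefinitely

def request_with_backoff(make_request):
    """Retry make_request() on HTTP 5xx and 429 responses.

    make_request is a hypothetical callable returning an HTTP status code.
    Between attempts, waits min((2^n) + jitter, MAXIMUM_BACKOFF) seconds,
    recalculating the jitter on every retry.
    """
    for n in range(MAX_RETRIES + 1):
        status = make_request()
        if status < 500 and status != 429:
            return status  # success, or an error that retrying won't fix
        wait = min((2 ** n) + random.randint(0, 1000) / 1000.0, MAXIMUM_BACKOFF)
        time.sleep(wait)
    raise RuntimeError("request failed after %d retries" % MAX_RETRIES)
```

Once 2^n exceeds MAXIMUM_BACKOFF, the loop keeps retrying at a constant 64-second interval (plus jitter, capped) until MAX_RETRIES is exhausted.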
How long clients should wait between retries and how many times they should retry depends on your use case and network conditions. For example, mobile clients of an application may need to retry more times and for longer intervals when compared to desktop clients of the same application.
If the retry requests fail after exceeding the maximum_backoff
plus any
additional time allowed for retries, report or log an error using one of the
methods listed under Support & help.
Example implementations
Examples of truncated exponential backoff used with Google Cloud Storage include:
- A boto example for resumable uploads.
- Retrying requests in Storage Transfer Service with Java or Python.
- Dealing with JSON API upload errors using exponential backoff.
- Google Cloud Storage using exponential backoff to send object change notifications to notification subscribers.
Examples of backoff implemented in client libraries that you can use with Google Cloud Storage include:
- Retrying library for Python.
- Google Cloud Client Libraries for Node.js, which can automatically use backoff strategies to retry requests with the autoRetry parameter.