This page describes retry strategies such as truncated exponential backoff for failed requests to Cloud Storage. Most Cloud Storage tools provide automatic retries so you don't need to implement your own retry strategy.
How Cloud Storage tools implement retry strategies
Console
The Cloud console sends requests to Cloud Storage on your behalf and handles any necessary backoff.
gsutil
gsutil retries the errors listed in the Response section without requiring you to take additional action. You may have to take action for other errors, such as the following:
- Invalid credentials or insufficient permissions.
- Network unreachable because of a proxy configuration problem.
- Individual operations that fail within a command where you use the `-m` top-level flag.
For retryable errors, gsutil retries requests using a truncated binary exponential backoff strategy. By default, gsutil retries 23 times over 1+2+4+8+16+32+60... seconds for about 10 minutes:
- If a request fails, wait a random period between [0..1] seconds and retry;
- If the request fails again, wait a random period between [0..2] seconds and retry;
- If the request fails again, wait a random period between [0..4] seconds and retry;
- And so on, up to 23 retries, with each retry period bounded by a default maximum of 60 seconds.
To configure the number of retries and the maximum delay of any individual retry, edit the `num_retries` and `max_retry_delay` configuration variables in the `[Boto]` section of the `.boto` config file.
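For example, a `.boto` file that allows 10 retries with at most 32 seconds between attempts might contain the following (the values shown are illustrative, not the defaults):

```
[Boto]
num_retries = 10
max_retry_delay = 32
```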
Client libraries
C++
Default retry behavior
By default, operations support retries for the following HTTP error codes, as well as any socket errors that indicate the connection was lost or never successfully established.
- 408 Request Timeout
- 429 Too Many Requests
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
All exponential backoff and retry settings in the C++ library are configurable. If the algorithms implemented in the library do not support your needs, you can provide custom code to implement your own strategies.
Setting | Default value |
---|---|
Auto retry | True |
Maximum time retrying a request | 15 minutes |
Initial wait (backoff) time | 1 second |
Wait time multiplier per iteration | 2 |
Maximum amount of wait time | 5 minutes |
By default, the C++ library retries all operations with retryable errors, even those that are never idempotent and can delete or create multiple resources when they succeed repeatedly. To retry only idempotent operations, use the `google::cloud::storage::StrictIdempotencyPolicy` class.
Customize retries
To customize the retry behavior, provide values for the following options when you initialize the `google::cloud::storage::Client` object (see the sketch after this list):

- `google::cloud::storage::RetryPolicyOption`: The library provides the `google::cloud::storage::LimitedErrorCountRetryPolicy` and `google::cloud::storage::LimitedTimeRetryPolicy` classes. You can provide your own class, which must implement the `google::cloud::RetryPolicy` interface.
- `google::cloud::storage::BackoffPolicyOption`: The library provides the `google::cloud::storage::ExponentialBackoffPolicy` class. You can provide your own class, which must implement the `google::cloud::storage::BackoffPolicy` interface.
- `google::cloud::storage::IdempotencyPolicyOption`: The library provides the `google::cloud::storage::StrictIdempotencyPolicy` and `google::cloud::storage::AlwaysRetryIdempotencyPolicy` classes. You can provide your own class, which must implement the `google::cloud::storage::IdempotencyPolicy` interface.
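The following is a minimal sketch, not an official sample, of how these options can be combined when constructing a client; it assumes a recent version of the `google-cloud-cpp` storage library, and the policy values are illustrative:

```cpp
#include "google/cloud/storage/client.h"
#include "google/cloud/options.h"
#include <chrono>

namespace gcs = ::google::cloud::storage;

int main() {
  // Retry at most 5 transient errors per operation, back off exponentially
  // from 1 second up to 1 minute, and only retry idempotent operations.
  auto options =
      google::cloud::Options{}
          .set<gcs::RetryPolicyOption>(
              gcs::LimitedErrorCountRetryPolicy(/*maximum_failures=*/5).clone())
          .set<gcs::BackoffPolicyOption>(
              gcs::ExponentialBackoffPolicy(std::chrono::seconds(1),
                                            std::chrono::minutes(1),
                                            /*scaling=*/2.0)
                  .clone())
          .set<gcs::IdempotencyPolicyOption>(
              gcs::StrictIdempotencyPolicy().clone());
  auto client = gcs::Client(options);
  // Use `client` as usual; the policies above apply to every request it makes.
}
```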
C#
The C# client library uses exponential backoff by default.
Go
Default retry behavior
By default, operations support retries for the following errors:
- Connection errors:
  - `io.ErrUnexpectedEOF`: This may occur due to transient network issues.
  - `url.Error` containing `connection refused`: This may occur due to transient network issues.
  - `url.Error` containing `connection reset by peer`: This means that GCP has reset the connection.
  - `net.ErrClosed`: This means that GCP has closed the connection.
- HTTP codes:
  - 408 Request Timeout
  - 429 Too Many Requests
  - 500 Internal Server Error
  - 502 Bad Gateway
  - 503 Service Unavailable
  - 504 Gateway Timeout
- Errors that implement the `Temporary()` interface and give a value of `err.Temporary() == true`
- Any of the above errors that have been wrapped using Go 1.13 error wrapping
All exponential backoff settings in the Go library are configurable. By default, operations in Go use the following settings for exponential backoff (defaults are taken from gax):
Setting | Default value |
---|---|
Auto retry | True if idempotent |
Max number of attempts | No limit |
Initial retry delay | 1 second |
Retry delay multiplier | 2.0 |
Maximum retry delay | 30 seconds |
Total timeout (resumable upload chunk) | 32 seconds |
Total timeout (all other operations) | No limit |
In general, retrying continues indefinitely unless the controlling context is canceled, the client is closed, or a non-transient error is received. To stop retries from continuing, use context timeouts or cancellation. The only exception to this behavior is when performing resumable uploads using `Writer`, where the data is large enough that it requires multiple requests. In this scenario, each chunk times out and stops retrying after 32 seconds by default. You can adjust the default timeout by changing `Writer.ChunkRetryDeadline`.
There is a subset of Go operations that are conditionally idempotent (conditionally safe to retry). These operations only retry if they meet specific conditions:

- `GenerationMatch` or `Generation`: Safe to retry if a `GenerationMatch` precondition was applied to the call, or if `ObjectHandle.Generation` was set.
- `MetagenerationMatch`: Safe to retry if a `MetagenerationMatch` precondition was applied to the call.
- `Etag`: Safe to retry if the method inserts an `etag` into the JSON request body. Only used in `HMACKeyHandle.Update` when `HmacKeyMetadata.Etag` has been set.

`RetryPolicy` is set to `RetryPolicy.RetryIdempotent` by default. See the Customize retries section below for examples of how to modify the default retry behavior.
Customize retries
When you initialize a storage client, a default retry configuration will be set. Unless they're overridden, the options in the config are set to the values in the table above. Users can configure non-default retry behavior for a single library call (using BucketHandle.Retryer and ObjectHandle.Retryer) or for all calls made by a client (using Client.SetRetry). To modify retry behavior, pass in the desired RetryOptions to one of these methods.
See the following code sample to learn how to customize your retry behavior.
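As a rough sketch, assuming the `cloud.google.com/go/storage` and `github.com/googleapis/gax-go/v2` packages (bucket and object names are placeholders, and the values are illustrative), a per-handle retry configuration might look like this:

```go
package main

import (
	"context"
	"time"

	"cloud.google.com/go/storage"
	"github.com/googleapis/gax-go/v2"
)

func main() {
	ctx := context.Background()
	client, err := storage.NewClient(ctx)
	if err != nil {
		panic(err) // Handle the error appropriately in real code.
	}
	defer client.Close()

	// Customize retries for a single object handle: start backoff at 2 seconds,
	// cap it at 60 seconds, triple the delay each attempt, and retry even
	// non-idempotent calls (only do this if your application can tolerate it).
	o := client.Bucket("my-bucket").Object("my-object").Retryer(
		storage.WithBackoff(gax.Backoff{
			Initial:    2 * time.Second,
			Max:        60 * time.Second,
			Multiplier: 3,
		}),
		storage.WithPolicy(storage.RetryAlways),
	)
	_ = o // Use the handle for uploads, downloads, and so on.
}
```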
Java
Default retry behavior
By default, operations support retries for the following errors:
- Connection errors:
  - `Connection reset by peer`: This means that GCP has reset the connection.
  - `Unexpected connection closure`: This means that GCP has closed the connection.
- HTTP codes:
  - 408 Request Timeout
  - 429 Too Many Requests
  - 500 Internal Server Error
  - 502 Bad Gateway
  - 503 Service Unavailable
  - 504 Gateway Timeout
All exponential backoff settings in the Java library are configurable. By default, operations through Java use the following settings for exponential backoff:
Setting | Default value |
---|---|
Auto retry | True if idempotent |
Max number of attempts | 6 |
Initial retry delay | 1 second |
Retry delay multiplier | 2.0 |
Maximum retry delay | 32 seconds |
Total Timeout | 50 seconds |
Initial RPC Timeout | 50 seconds |
RPC Timeout Multiplier | 1.0 |
Max RPC Timeout | 50 seconds |
Connect Timeout | 20 seconds |
Read Timeout | 20 seconds |
For more information about the parameters above, see the Java reference documentation for `RetrySettings.Builder` and `HttpTransportOptions.Builder`.
There is a subset of Java operations that are conditionally idempotent (conditionally safe to retry). These operations only retry if they include specific arguments:

- `ifGenerationMatch` or `generation`: Safe to retry if `ifGenerationMatch` or `generation` was passed in as an option to the method.
- `ifMetagenerationMatch`: Safe to retry if `ifMetagenerationMatch` was passed in as an option.

`StorageOptions.setStorageRetryStrategy` is set to `StorageRetryStrategy#getDefaultStorageRetryStrategy` by default. See the Customize retries section below for examples of how to modify the default retry behavior.
Customize retries
When you initialize `Storage`, an instance of `RetrySettings` is initialized as well. Unless they are overridden, the options in the `RetrySettings` are set to the values in the table above. To modify the default automatic retry behavior, pass a custom `StorageRetryStrategy` into the `StorageOptions` used to construct the `Storage` instance. To modify any of the other scalar parameters, pass a custom `RetrySettings` into the `StorageOptions` used to construct the `Storage` instance.
See the following example to learn how to customize your retry behavior:
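As a rough sketch, assuming the `google-cloud-storage` Java library and gax `RetrySettings` (the class name and values are illustrative, not recommendations):

```java
import com.google.api.gax.retrying.RetrySettings;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import org.threeten.bp.Duration;

public class CustomRetries {
  public static void main(String[] args) {
    // Allow up to 10 attempts and 5 minutes total, with a 3x delay multiplier.
    RetrySettings retrySettings =
        StorageOptions.getDefaultRetrySettings().toBuilder()
            .setMaxAttempts(10)
            .setRetryDelayMultiplier(3.0)
            .setTotalTimeout(Duration.ofMinutes(5))
            .build();

    Storage storage =
        StorageOptions.newBuilder().setRetrySettings(retrySettings).build().getService();
    // Requests made through `storage` now use the custom settings.
  }
}
```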
Node.js
Default retry behavior
By default, operations support retries for the following error codes:
- Connection errors:
  - `EAI_again`: This is a DNS lookup error.
  - `Connection reset by peer`: This means that GCP has reset the connection.
  - `Unexpected connection closure`: This means that GCP has closed the connection.
- HTTP codes:
  - 408 Request Timeout
  - 429 Too Many Requests
  - 500 Internal Server Error
  - 502 Bad Gateway
  - 503 Service Unavailable
  - 504 Gateway Timeout
All exponential backoff settings in the Node.js library are configurable. By default, operations through Node.js use the following settings for exponential backoff:
Setting | Default value |
---|---|
Auto retry | True if idempotent |
Maximum number of retries | 3 |
Initial wait time | 1 second |
Wait time multiplier per iteration | 2 |
Maximum amount of wait time | 64 seconds |
Default deadline | 600 seconds |
There is a subset of Node.js operations that are conditionally idempotent (conditionally safe to retry). These operations only retry if they include specific arguments:

- `ifGenerationMatch` or `generation`: Safe to retry if `ifGenerationMatch` or `generation` was passed in as an option to the method. Often, methods only accept one of these two parameters.
- `ifMetagenerationMatch`: Safe to retry if `ifMetagenerationMatch` was passed in as an option.

`retryOptions.idempotencyStrategy` is set to `IdempotencyStrategy.RetryConditional` by default. See the Customize retries section below for examples of how to modify the default retry behavior.
Customize retries
When you initialize Cloud Storage, a `retryOptions` configuration is initialized as well. Unless they're overridden, the options in the configuration are set to the values in the table above. To modify the default retry behavior, pass a custom `retryOptions` configuration into the storage constructor upon initialization. The Node.js client library can automatically use backoff strategies to retry requests with the `autoRetry` parameter.
See the following code sample to learn how to customize your retry behavior.
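As a rough sketch, assuming the `@google-cloud/storage` package (the specific values are illustrative, not recommendations):

```js
const {Storage, IdempotencyStrategy} = require('@google-cloud/storage');

// Construct a client that retries up to 5 times, backs off from 1 second to a
// maximum of 60 seconds with a 3x multiplier, and retries conditionally
// idempotent operations only when the required preconditions are supplied.
const storage = new Storage({
  retryOptions: {
    autoRetry: true,
    maxRetries: 5,
    retryDelayMultiplier: 3,
    maxRetryDelay: 60,
    totalTimeout: 300,
    idempotencyStrategy: IdempotencyStrategy.RetryConditional,
  },
});

// All requests made through `storage` now use these retry options.
```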
PHP
The PHP client library uses exponential backoff by default.
Python
Default retry behavior
By default, operations support retries for the following error codes:
- Connection errors:
  - `requests.exceptions.ConnectionError`
  - `requests.exceptions.ChunkedEncodingError` (only for operations that fetch or send payload data to objects, like uploads and downloads)
  - `ConnectionError`
- HTTP codes:
  - 408 Request Timeout
  - 429 Too Many Requests
  - 500 Internal Server Error
  - 502 Bad Gateway
  - 503 Service Unavailable
  - 504 Gateway Timeout
Operations through Python use the following default settings for exponential backoff:
Setting | Default value (in seconds) |
---|---|
Auto retry | True if idempotent |
Initial wait time | 1 |
Wait time multiplier per iteration | 2 |
Maximum amount of wait time | 60 |
Default deadline | 120 |
There is a subset of Python operations that are conditionally idempotent (conditionally safe to retry) when they include specific arguments. These operations only retry if a condition case passes (a short example follows this list):

- `DEFAULT_RETRY_IF_GENERATION_SPECIFIED`: Safe to retry if `generation` or `if_generation_match` was passed in as an argument to the method. Often, methods only accept one of these two parameters.
- `DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED`: Safe to retry if `if_metageneration_match` was passed in as an argument to the method.
- `DEFAULT_RETRY_IF_ETAG_IN_JSON`: Safe to retry if the method inserts an `etag` into the JSON request body. For `HMACKeyMetadata.update()` this means the etag must be set on the `HMACKeyMetadata` object itself. For the `set_iam_policy()` method on other classes, this means the etag must be set in the "policy" argument passed into the method.
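For example, supplying a generation precondition makes an upload conditionally safe to retry. The following sketch assumes the `google-cloud-storage` package; the bucket and object names are placeholders:

```python
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")
blob = bucket.blob("my-object")

# if_generation_match=0 means "only create the object if it does not already
# exist". With this precondition the upload is safe to retry automatically,
# because a duplicate attempt cannot overwrite newer data.
blob.upload_from_string(b"hello world", if_generation_match=0)
```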
Customize retries
To modify the default retry behavior, create a copy of the `google.cloud.storage.retry.DEFAULT_RETRY` object by calling it with a `with_XXX` method. The Python client library automatically uses backoff strategies to retry requests if you include the `DEFAULT_RETRY` parameter.

Note that `with_predicate` is not supported for operations that fetch or send payload data to objects, like uploads and downloads. It's recommended that you modify attributes one by one. For more information, see the google-api-core Retry reference.
To configure your own conditional retry, create a `ConditionalRetryPolicy` object and wrap your custom `Retry` object with `DEFAULT_RETRY_IF_GENERATION_SPECIFIED`, `DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED`, or `DEFAULT_RETRY_IF_ETAG_IN_JSON`.
See the following code sample to learn how to customize your retry behavior.
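A minimal sketch, assuming the public `google-cloud-storage` and `google-api-core` APIs (placeholder names, illustrative values):

```python
from google.cloud import storage
from google.cloud.storage.retry import DEFAULT_RETRY

client = storage.Client()
bucket = client.bucket("my-bucket")
blob = bucket.blob("my-object")

# Copy DEFAULT_RETRY, changing one attribute at a time: wait between 1 and 60
# seconds per attempt and give up after 300 seconds in total.
modified_retry = DEFAULT_RETRY.with_deadline(300.0)
modified_retry = modified_retry.with_delay(initial=1.0, multiplier=2.0, maximum=60.0)

# Pass the modified policy to an individual call.
blob.download_to_filename("/tmp/my-object", retry=modified_retry)
```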
Ruby
The Ruby client library uses exponential backoff by default.
REST APIs
When calling the JSON or XML API directly, you should use the exponential backoff algorithm to implement your own retry strategy.
Implement your own retry strategy
This section describes how you can implement your own retry strategy. It also provides guidance for using the exponential backoff algorithm.
There are two factors that determine whether or not a request is safe to retry:
- The response that you receive from the request.
- The idempotency of the request.
Response
The response that you receive from your request indicates whether or not it's useful to retry the request. Responses related to transient problems are generally retryable. On the other hand, responses related to permanent errors indicate that you need to make changes, such as authorization or configuration changes, before it's useful to try the request again. The following responses indicate transient problems that are useful to retry:
- HTTP `408`, `429`, and `5xx` response codes.
- Socket timeouts and TCP disconnects.
For more information, see the status and error codes for JSON and XML.
Idempotency
A request is idempotent when it can be performed repeatedly and always leaves the targeted resource in the same end state. For example, listing requests are always idempotent, because such requests do not modify resources. On the other hand, creating a new Pub/Sub notification is never idempotent, because it creates a new notification ID each time the request succeeds.
The following are examples of conditions that make an operation idempotent:
- The operation has the same observable effect on the targeted resource even when continually requested.
- The operation only succeeds once.
- The operation has no observable effect on the state of the targeted resource.
When you receive a retryable response, you should consider the idempotency of the request, because retrying requests that are not idempotent can lead to race conditions and other conflicts.
Conditional idempotency
A subset of requests are conditionally idempotent, which means they are only idempotent if they include specific optional arguments. Operations that are conditionally safe to retry should only be retried by default if the condition case passes. Cloud Storage accepts preconditions and ETags as condition cases for requests.
Idempotency of operations
The following table lists the Cloud Storage operations that fall into each category of idempotency.
Idempotency | Operations |
---|---|
Always idempotent | |
Conditionally idempotent | |
Never idempotent | |
Exponential backoff algorithm
For requests that meet both the response and idempotency criteria, you should generally use truncated exponential backoff.
Truncated exponential backoff is a standard error handling strategy for network applications in which a client periodically retries a failed request with increasing delays between requests.
An exponential backoff algorithm retries requests exponentially, increasing the waiting time between retries up to a maximum backoff time. See the following workflow example to learn how exponential backoff works:
- You make a request to Cloud Storage.
- If the request fails, wait 1 + `random_number_milliseconds` seconds and retry the request.
- If the request fails, wait 2 + `random_number_milliseconds` seconds and retry the request.
- If the request fails, wait 4 + `random_number_milliseconds` seconds and retry the request.
- And so on, up to a `maximum_backoff` time.
- Continue waiting and retrying up to a maximum amount of time (`deadline`), but do not increase the `maximum_backoff` wait period between retries.
where:

- The wait time is `min((2^n + random_number_milliseconds), maximum_backoff)`, with `n` incremented by 1 for each iteration (request).
- `random_number_milliseconds` is a random number of milliseconds less than or equal to 1000. This helps to avoid cases where many clients become synchronized and all retry at once, sending requests in synchronized waves. The value of `random_number_milliseconds` is recalculated after each retry request.
- `maximum_backoff` is typically 32 or 64 seconds. The appropriate value depends on the use case.
You can continue retrying once you reach the `maximum_backoff` time, but it's recommended that you abort your request after a certain amount of time to prevent your application from becoming unresponsive. For example, say a client uses a `maximum_backoff` time of 64 seconds. After reaching this value, the client can retry every 64 seconds. The client then stops retrying after a `deadline` of 600 seconds.
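As an illustration only, here is a minimal sketch of truncated exponential backoff with jitter; the `request` and `is_retryable` callables and the parameter values are hypothetical placeholders, not part of any Cloud Storage library:

```python
import random
import time


def call_with_backoff(request, is_retryable, maximum_backoff=64, deadline=600):
    """Retry `request` with truncated exponential backoff and jitter.

    `request` performs one attempt and returns a response; `is_retryable`
    decides whether the response warrants another attempt. Both are
    hypothetical placeholders for this sketch.
    """
    start = time.monotonic()
    n = 0
    while True:
        try:
            response = request()
            if not is_retryable(response):
                return response
        except ConnectionError:
            pass  # Treat dropped connections as retryable.

        # Wait min(2^n + random_number_milliseconds, maximum_backoff) seconds.
        wait = min(2 ** n + random.random(), maximum_backoff)
        if time.monotonic() + wait - start > deadline:
            raise TimeoutError("retries exhausted after deadline")
        time.sleep(wait)
        n += 1
```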
How long clients should wait between retries and how many times they should retry depends on your use case and network conditions. For example, mobile clients of an application may need to retry more times and for longer intervals when compared to desktop clients of the same application.
If the retry requests fail after exceeding the `maximum_backoff` plus any additional time allowed for retries, report or log an error to Support.
What's next
- Learn how to retry requests in Storage Transfer Service with Java or Python.
- Learn more about request preconditions.