Override Retry, Backoff, and Idempotency Policies
When it is safe to do so, the library automatically retries requests that fail due to a transient error. The library then uses exponential backoff to delay each new attempt. Which operations are considered safe to retry, which errors are treated as transient failures, the details of the exponential backoff algorithm, and how long the library keeps retrying are all configurable via policies.
This document provides examples showing how to override the default policies.
The policies can be set when the *Connection object is created. The library provides default policies for any policy that is not set. The application can also override some (or all) policies when the *Client object is created. This can be useful if multiple *Client objects share the same *Connection object, but you want different retry behavior in some of the clients. Finally, the application can override some retry policies when calling a specific member function.
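The layering described above (per-call settings take precedence over client settings, which take precedence over connection settings) can be pictured with a small standalone sketch. This is not how the library stores its options; `PolicyOverrides`, `Merge`, and `Effective` are hypothetical names, and the two integer fields merely stand in for real policy objects.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified option set: every field is unset by default.
struct PolicyOverrides {
  std::optional<int> max_failures;
  std::optional<int> max_backoff_seconds;
};

// Values set in `preferred` win; anything unset falls back to `fallback`.
inline PolicyOverrides Merge(PolicyOverrides preferred,
                             PolicyOverrides const& fallback) {
  if (!preferred.max_failures) {
    preferred.max_failures = fallback.max_failures;
  }
  if (!preferred.max_backoff_seconds) {
    preferred.max_backoff_seconds = fallback.max_backoff_seconds;
  }
  return preferred;
}

// Assumed precedence: per-call, then client, then connection.
inline PolicyOverrides Effective(PolicyOverrides const& per_call,
                                 PolicyOverrides const& client,
                                 PolicyOverrides const& connection) {
  return Merge(per_call, Merge(client, connection));
}
```

In this sketch a client that sets only the backoff limit still inherits the failure limit from its connection, which mirrors the "override some (or all) policies" behavior described above.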
The library uses three different options to control the retry loop. The options have per-client names.
Configuring the transient errors and retry duration
The *RetryPolicyOption controls:
- Which errors are to be treated as transient errors.
- How long the library will keep retrying transient errors.
You can provide your own class for this option. The library also provides two built-in policies:
- *LimitedErrorCountRetryPolicy: stops retrying after a specified number of transient errors.
- *LimitedTimeRetryPolicy: stops retrying after a specified time.
Note that a library may have more than one version of these classes. Their names match the *Client and *Connection objects they are intended to be used with. Some *Client objects treat different error codes as transient errors. In most cases, only kUnavailable is treated as a transient error.
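As a rough illustration of the two built-in behaviors, the following standalone sketch models an error-count limit and a time limit. It does not use the library's classes: `StatusCode`, `IsTransient`, `ErrorCountLimit`, and `TimeLimit` are hypothetical names, and treating only `kUnavailable` as transient is the common case mentioned above, not a rule that holds for every client.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the library's status codes.
enum class StatusCode { kOk, kUnavailable, kPermissionDenied };

// Assumption: only kUnavailable is transient (the common case).
inline bool IsTransient(StatusCode code) {
  return code == StatusCode::kUnavailable;
}

// Hypothetical policy: tolerate at most `maximum_failures` transient errors.
class ErrorCountLimit {
 public:
  explicit ErrorCountLimit(int maximum_failures)
      : maximum_failures_(maximum_failures) {
    assert(maximum_failures >= 0);
  }

  // Returns true if the caller may retry after observing `code`.
  bool OnFailure(StatusCode code) {
    if (!IsTransient(code)) return false;  // permanent errors stop the loop
    return ++failure_count_ <= maximum_failures_;
  }

 private:
  int maximum_failures_;
  int failure_count_ = 0;
};

// Hypothetical policy: keep retrying transient errors until a deadline.
class TimeLimit {
 public:
  using Clock = std::chrono::steady_clock;

  explicit TimeLimit(std::chrono::milliseconds maximum_duration)
      : deadline_(Clock::now() + maximum_duration) {}

  // Returns true if `code` is transient and the deadline has not passed.
  bool OnFailure(StatusCode code) const {
    return IsTransient(code) && Clock::now() < deadline_;
  }

 private:
  Clock::time_point deadline_;
};
```

Both sketches answer the same two questions the option controls: is this error transient, and is there any retry budget left.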
Controlling the backoff algorithm
The *BackoffPolicyOption controls how long the client library will wait before retrying a request that failed with a transient error. You can provide your own class for this option.
The only built-in backoff policy is ExponentialBackoffPolicy. This class implements a truncated exponential backoff algorithm with jitter. In summary, the actual backoff time for an RPC is chosen at random, but never exceeds the current backoff. The current backoff is doubled after each failure, but is capped (or "truncated") at a prescribed maximum.
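The algorithm summarized above can be sketched in a few lines of standalone C++. This is an illustration only, not the library's ExponentialBackoffPolicy: `BackoffSketch` is a hypothetical class, the uniform jitter over [0, current] is an assumed distribution, and the `scaling` parameter generalizes the doubling described above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical class; not the library's ExponentialBackoffPolicy.
class BackoffSketch {
 public:
  using Duration = std::chrono::microseconds;

  BackoffSketch(Duration initial_delay, Duration maximum_delay, double scaling,
                unsigned seed = std::random_device{}())
      : current_(initial_delay),
        maximum_(maximum_delay),
        scaling_(scaling),
        generator_(seed) {
    assert(scaling > 1.0);
    assert(initial_delay.count() > 0 && initial_delay <= maximum_delay);
  }

  // Returns how long to wait after a failure, then grows the current backoff.
  Duration OnFailure() {
    // Jitter: pick uniformly in [0, current_] (an assumed distribution).
    std::uniform_int_distribution<Duration::rep> pick(0, current_.count());
    Duration delay(pick(generator_));
    // Grow the current backoff, then truncate it at the maximum.
    auto grown = static_cast<Duration::rep>(current_.count() * scaling_);
    current_ = std::min(Duration(grown), maximum_);
    return delay;
  }

  // The upper bound for the next randomly chosen delay.
  Duration current() const { return current_; }

 private:
  Duration current_;
  Duration maximum_;
  double scaling_;
  std::mt19937_64 generator_;
};
```

With an initial delay of 200 ms, a maximum of 1 s, and a scaling of 2.0, the upper bound in this sketch goes 200 ms, 400 ms, 800 ms, and then stays at 1 s.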
Controlling which operations are retryable
The *IdempotencyPolicyOption controls which requests are retryable, as some requests are never safe to retry.
Only one built-in idempotency policy is provided by the library. The name matches the name of the client it is intended for. For example, FooBarClient will use FooBarIdempotencyPolicy. This policy is very conservative.
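To show how the three options might cooperate, here is a minimal standalone sketch of a retry loop. It is an assumption about the general shape of such a loop, not the library's implementation; `Status`, `LoopPolicies`, and `RetryLoop` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for the library's status codes.
enum class Status { kOk, kUnavailable, kPermissionDenied };

// Hypothetical bundle of the three decisions described above.
struct LoopPolicies {
  bool idempotent;                                     // idempotency verdict
  std::function<bool(Status)> may_retry;               // retry policy
  std::function<std::chrono::milliseconds()> backoff;  // backoff policy
};

// Calls `attempt` until it succeeds or one of the policies says stop.
inline Status RetryLoop(std::function<Status()> const& attempt,
                        LoopPolicies const& policies) {
  assert(policies.may_retry && policies.backoff);
  for (;;) {
    Status status = attempt();
    if (status == Status::kOk) return status;
    // A request that is not idempotent is never sent a second time.
    if (!policies.idempotent) return status;
    // The retry policy rejects permanent errors and exhausted budgets.
    if (!policies.may_retry(status)) return status;
    // Wait for the duration chosen by the backoff policy, then try again.
    std::this_thread::sleep_for(policies.backoff());
  }
}
```

The order of the checks reflects the description above: the idempotency decision gates everything else, so a conservative idempotency policy means fewer requests ever reach the retry and backoff policies.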
Example
For example, this will override the retry policies for aiplatform_v1::DatasetServiceClient:
auto options =
    google::cloud::Options{}
        .set<google::cloud::aiplatform_v1::
                 DatasetServiceConnectionIdempotencyPolicyOption>(
            CustomIdempotencyPolicy().clone())
        .set<google::cloud::aiplatform_v1::DatasetServiceRetryPolicyOption>(
            google::cloud::aiplatform_v1::
                DatasetServiceLimitedErrorCountRetryPolicy(3)
                    .clone())
        .set<google::cloud::aiplatform_v1::DatasetServiceBackoffPolicyOption>(
            google::cloud::ExponentialBackoffPolicy(
                /*initial_delay=*/std::chrono::milliseconds(200),
                /*maximum_delay=*/std::chrono::seconds(45),
                /*scaling=*/2.0)
                .clone());
auto connection = google::cloud::aiplatform_v1::MakeDatasetServiceConnection(
    "location-unused-in-this-example", options);

// c1 and c2 share the same retry policies
auto c1 = google::cloud::aiplatform_v1::DatasetServiceClient(connection);
auto c2 = google::cloud::aiplatform_v1::DatasetServiceClient(connection);

// You can override any of the policies in a new client. This new client
// will share the policies from c1 (or c2) *except* for the retry policy.
auto c3 = google::cloud::aiplatform_v1::DatasetServiceClient(
    connection,
    google::cloud::Options{}
        .set<google::cloud::aiplatform_v1::DatasetServiceRetryPolicyOption>(
            google::cloud::aiplatform_v1::
                DatasetServiceLimitedTimeRetryPolicy(std::chrono::minutes(5))
                    .clone()));

// You can also override the policies in a single call:
// c3.SomeRpc(..., google::cloud::Options{}
//     .set<google::cloud::aiplatform_v1::DatasetServiceRetryPolicyOption>(
//       google::cloud::aiplatform_v1::DatasetServiceLimitedErrorCountRetryPolicy(10).clone()));
This assumes you have created a custom idempotency policy, such as:
class CustomIdempotencyPolicy : public google::cloud::aiplatform_v1::
                                    DatasetServiceConnectionIdempotencyPolicy {
 public:
  ~CustomIdempotencyPolicy() override = default;
  std::unique_ptr<
      google::cloud::aiplatform_v1::DatasetServiceConnectionIdempotencyPolicy>
  clone() const override {
    return std::make_unique<CustomIdempotencyPolicy>(*this);
  }
  // Override inherited functions to define as needed.
};
This will override the polling policies for aiplatform_v1::DatasetServiceClient:
// The polling policy controls how the client waits for long-running
// operations. `GenericPollingPolicy<>` combines existing policies.
// In this case, keep polling until the operation completes (with success
// or error) or 45 minutes, whichever happens first. Initially pause for
// 10 seconds between polling requests, increasing the pause by a factor
// of 4 until it becomes 2 minutes.
auto options =
    google::cloud::Options{}
        .set<google::cloud::aiplatform_v1::DatasetServicePollingPolicyOption>(
            google::cloud::GenericPollingPolicy<
                google::cloud::aiplatform_v1::
                    DatasetServiceRetryPolicyOption::Type,
                google::cloud::aiplatform_v1::
                    DatasetServiceBackoffPolicyOption::Type>(
                google::cloud::aiplatform_v1::
                    DatasetServiceLimitedTimeRetryPolicy(
                        /*maximum_duration=*/std::chrono::minutes(45))
                        .clone(),
                google::cloud::ExponentialBackoffPolicy(
                    /*initial_delay=*/std::chrono::seconds(10),
                    /*maximum_delay=*/std::chrono::minutes(2),
                    /*scaling=*/4.0)
                    .clone())
                .clone());

auto connection = google::cloud::aiplatform_v1::MakeDatasetServiceConnection(
    "location-unused-in-this-example", options);

// c1 and c2 share the same polling policies.
auto c1 = google::cloud::aiplatform_v1::DatasetServiceClient(connection);
auto c2 = google::cloud::aiplatform_v1::DatasetServiceClient(connection);
Follow these links to find examples for other *Client classes:
aiplatform_v1::DatasetServiceClient
aiplatform_v1::DeploymentResourcePoolServiceClient
aiplatform_v1::EndpointServiceClient
aiplatform_v1::EvaluationServiceClient
aiplatform_v1::FeatureOnlineStoreAdminServiceClient
aiplatform_v1::FeatureOnlineStoreServiceClient
aiplatform_v1::FeatureRegistryServiceClient
aiplatform_v1::FeaturestoreServiceClient
aiplatform_v1::FeaturestoreOnlineServingServiceClient
aiplatform_v1::GenAiTuningServiceClient
aiplatform_v1::IndexServiceClient
aiplatform_v1::IndexEndpointServiceClient
aiplatform_v1::JobServiceClient
aiplatform_v1::LlmUtilityServiceClient
aiplatform_v1::MatchServiceClient
aiplatform_v1::MetadataServiceClient
aiplatform_v1::MigrationServiceClient
aiplatform_v1::ModelServiceClient
aiplatform_v1::ModelGardenServiceClient
aiplatform_v1::NotebookServiceClient
aiplatform_v1::PersistentResourceServiceClient
aiplatform_v1::PipelineServiceClient
aiplatform_v1::PredictionServiceClient
aiplatform_v1::ScheduleServiceClient
aiplatform_v1::SpecialistPoolServiceClient
aiplatform_v1::TensorboardServiceClient
aiplatform_v1::VizierServiceClient