Override Retry, Backoff, and Re-Run Policies

When it is safe to do so, the library automatically retries requests that fail due to a transient error, using exponential backoff to delay before each new attempt. Policies control which operations are considered safe to retry, which errors are treated as transient failures, the parameters of the exponential backoff algorithm, and the limits on retries.

The library provides defaults for any policy that is not set. This document provides examples showing how to override those default policies.

The policies can be set when a Connection object is created. Some of the policies can also be overridden when the corresponding Client object is created. This can be useful if multiple Client objects share the same Connection object, but you want different retry behavior in some of those clients. Finally, some retry policies can be overridden when calling a specific Client member function.
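
The layering described above (library defaults, then Connection, then Client, then a single call) can be pictured with a small sketch. This is only an illustration of "most specific setting wins", assuming hypothetical names (`PolicySettings`, `Merge`, `Resolve`) that are not part of the library; it does not show how the library resolves its options internally.

```cpp
#include <cassert>
#include <optional>

// Hypothetical bag of settings; an empty optional means "not set here".
struct PolicySettings {
  std::optional<int> maximum_failures;
  std::optional<int> initial_backoff_ms;
};

// Fields set in `specific` win; unset fields fall through to `fallback`.
inline PolicySettings Merge(PolicySettings specific,
                            PolicySettings const& fallback) {
  if (!specific.maximum_failures) {
    specific.maximum_failures = fallback.maximum_failures;
  }
  if (!specific.initial_backoff_ms) {
    specific.initial_backoff_ms = fallback.initial_backoff_ms;
  }
  return specific;
}

// Resolve from most specific (a single call) to least specific (defaults).
inline PolicySettings Resolve(PolicySettings const& call,
                              PolicySettings const& client,
                              PolicySettings const& connection,
                              PolicySettings const& defaults) {
  PolicySettings effective = Merge(call, Merge(client, Merge(connection, defaults)));
  // The defaults are assumed to set every field, so nothing stays unset.
  assert(effective.maximum_failures && effective.initial_backoff_ms);
  return effective;
}
```

In this sketch a value set for one call shadows the Client and Connection values for that call only, while clients that set nothing inherit whatever their shared Connection was given.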

The library uses two different policy options to control the retry loops.

Configuring the transient errors and retry duration

The SpannerRetryPolicyOption controls:

  • Which errors are to be treated as transient errors.
  • How long the library will keep retrying transient errors.

You can provide your own class for this option. The library also provides two built-in policies:

  • LimitedErrorCountRetryPolicy: stops retrying after a specified number of transient errors.
  • LimitedTimeRetryPolicy: stops retrying after a specified time.

In most cases, only kUnavailable and kResourceExhausted are treated as transient errors.
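
To make the two stopping rules concrete, here is a minimal standalone sketch. The `Code` enum and the `ErrorCountLimitSketch` and `TimeLimitSketch` classes are hypothetical stand-ins written for this illustration; they are not the library's LimitedErrorCountRetryPolicy or LimitedTimeRetryPolicy, and the exact boundary behavior (tolerating exactly `maximum_failures` errors) is an assumption.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical status codes, standing in for the library's status codes.
enum class Code { kOk, kUnavailable, kResourceExhausted, kInvalidArgument };

// Only these two codes count as transient, mirroring the text above.
inline bool IsTransient(Code c) {
  return c == Code::kUnavailable || c == Code::kResourceExhausted;
}

// Tolerates up to `maximum_failures` transient errors, then gives up.
class ErrorCountLimitSketch {
 public:
  explicit ErrorCountLimitSketch(int maximum_failures)
      : maximum_failures_(maximum_failures) {
    assert(maximum_failures >= 0);
  }

  // Returns true if the caller should retry after seeing `c`.
  bool OnFailure(Code c) {
    if (!IsTransient(c)) return false;  // permanent errors are not retried
    return ++failures_ <= maximum_failures_;
  }

 private:
  int maximum_failures_;
  int failures_ = 0;
};

// Keeps retrying transient errors until a deadline passes.
class TimeLimitSketch {
 public:
  using Clock = std::chrono::steady_clock;

  explicit TimeLimitSketch(std::chrono::milliseconds maximum_duration)
      : deadline_(Clock::now() +
                  std::chrono::duration_cast<Clock::duration>(
                      maximum_duration)) {}

  // `now` is a parameter so the rule can be exercised without sleeping.
  bool OnFailure(Code c, Clock::time_point now = Clock::now()) const {
    return IsTransient(c) && now < deadline_;
  }

 private:
  Clock::time_point deadline_;
};
```

A count-based limit bounds the number of requests sent, while a time-based limit bounds how long the caller may wait; which one fits depends on whether the application cares more about load or about latency.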

See Also

  • google::cloud::spanner::SpannerRetryPolicyOption
  • google::cloud::spanner::RetryPolicy
  • google::cloud::spanner::LimitedErrorCountRetryPolicy
  • google::cloud::spanner::LimitedTimeRetryPolicy

Controlling the backoff algorithm

The SpannerBackoffPolicyOption controls how long the client library will wait before retrying a request that failed with a transient error. You can provide your own class for this option.

The only built-in backoff policy is ExponentialBackoffPolicy. This class implements a truncated exponential backoff algorithm with jitter. The current backoff grows by a scaling factor after each failure (doubling when the factor is 2), but never exceeds (that is, it is "truncated" at) a prescribed maximum. The actual backoff time for an RPC is chosen at random, but never exceeds the current backoff.
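
The algorithm can be sketched in a few lines of standard C++. The class below, `TruncatedBackoffSketch`, is a hypothetical illustration and not the library's ExponentialBackoffPolicy. In particular, drawing the delay uniformly from zero up to the current backoff is an assumption; the text above only says the delay never exceeds the current backoff.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <random>

class TruncatedBackoffSketch {
 public:
  using ms = std::chrono::milliseconds;

  TruncatedBackoffSketch(ms initial_delay, ms maximum_delay, double scaling,
                         unsigned seed = std::random_device{}())
      : current_(initial_delay),
        maximum_(maximum_delay),
        scaling_(scaling),
        rng_(seed) {
    assert(scaling > 1.0);
    assert(initial_delay.count() > 0 && initial_delay <= maximum_delay);
  }

  // Upper bound for the next randomly chosen delay.
  ms current() const { return current_; }

  // Returns how long to wait before the next attempt, then grows the
  // bound by the scaling factor, truncated at the maximum.
  ms OnFailure() {
    // Assumption: jitter is uniform in [0, current backoff].
    std::uniform_int_distribution<ms::rep> jitter(0, current_.count());
    ms delay(jitter(rng_));
    auto grown = static_cast<ms::rep>(current_.count() * scaling_);
    current_ = std::min(ms(grown), maximum_);
    return delay;
  }

 private:
  ms current_;
  ms maximum_;
  double scaling_;
  std::mt19937 rng_;
};
```

With an initial delay of 500 ms, a maximum of 2000 ms, and a scaling factor of 1.5, the upper bound in this sketch goes 500, 750, 1125, 1687, and then stays at 2000 ms. The jitter spreads out clients that failed at the same moment so they do not all retry together.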

See Also

  • google::cloud::spanner::SpannerBackoffPolicyOption
  • google::cloud::spanner::BackoffPolicy
  • google::cloud::spanner::ExponentialBackoffPolicy

Example

For example, this will override the retry and backoff policies through options passed to spanner::MakeConnection():

  namespace spanner = ::google::cloud::spanner;
  using ::google::cloud::StatusOr;
  [](std::string const& project_id, std::string const& instance_id,
     std::string const& database_id) {
    // Use a truncated exponential backoff with jitter to wait between
    // retries:
    //   https://en.wikipedia.org/wiki/Exponential_backoff
    //   https://cloud.google.com/storage/docs/exponential-backoff
    auto client = spanner::Client(spanner::MakeConnection(
        spanner::Database(project_id, instance_id, database_id),
        google::cloud::Options{}
            .set<spanner::SpannerRetryPolicyOption>(
                std::make_shared<spanner::LimitedTimeRetryPolicy>(
                    /*maximum_duration=*/std::chrono::seconds(60)))
            .set<spanner::SpannerBackoffPolicyOption>(
                std::make_shared<spanner::ExponentialBackoffPolicy>(
                    /*initial_delay=*/std::chrono::milliseconds(500),
                    /*maximum_delay=*/std::chrono::seconds(16),
                    /*scaling=*/1.5))));

    std::int64_t rows_inserted;
    auto commit_result = client.Commit(
        [&client, &rows_inserted](
            spanner::Transaction txn) -> StatusOr<spanner::Mutations> {
          auto insert = client.ExecuteDml(
              std::move(txn),
              spanner::SqlStatement(
                  "INSERT INTO Singers (SingerId, FirstName, LastName)"
                  "  VALUES (20, 'George', 'Washington')"));
          if (!insert) return std::move(insert).status();
          rows_inserted = insert->RowsModified();
          return spanner::Mutations{};
        });
    if (!commit_result) throw std::move(commit_result).status();
    std::cout << "Rows inserted: " << rows_inserted << "\n";
  }

Controlling which commits are rerunnable

The library also uses a special TransactionRerunPolicy to control how the commit of a read-write transaction will be reattempted after a failure with a rerunnable status (typically kAborted). The lock priority of the commit increases after each rerun, meaning that the next attempt has a slightly better chance of success.
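
The shape of such a rerun loop can be sketched as follows. Everything here is hypothetical and written for illustration: the `Code` enum, `RerunResult`, and `RunWithReruns` are not library names, the loop counts attempts rather than consulting a policy object, and it models neither the lock priority change nor the backoff sleep between attempts.

```cpp
#include <cassert>
#include <functional>

// Hypothetical status codes, standing in for the library's status codes.
enum class Code { kOk, kAborted, kPermissionDenied };

struct RerunResult {
  Code final_code;  // status of the last attempt
  int attempts;     // how many times the body ran
};

// Runs `body`, and reruns it while it reports kAborted, allowing at most
// `maximum_reruns` reruns after the first attempt.
inline RerunResult RunWithReruns(std::function<Code(int)> const& body,
                                 int maximum_reruns) {
  assert(maximum_reruns >= 0);
  int attempts = 0;
  for (;;) {
    Code code = body(++attempts);
    bool rerunnable = (code == Code::kAborted);
    if (!rerunnable || attempts > maximum_reruns) return {code, attempts};
    // A real loop would wait here, as directed by the backoff policy.
  }
}
```

Because the whole body is executed again on each rerun, the body should be safe to repeat: any side effects outside the transaction happen once per attempt, not once per successful commit.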

You can provide your own class for this policy. The library also provides two built-in policies:

  • LimitedErrorCountTransactionRerunPolicy: stops rerunning the commit after a specified number of rerunnable errors.
  • LimitedTimeTransactionRerunPolicy: stops rerunning the commit after a specified time.

For example, this overrides the rerun and backoff policies for a single Client::Commit() call:

  void CommitWithPolicies(google::cloud::spanner::Client client) {
    using ::google::cloud::StatusOr;
    namespace spanner = ::google::cloud::spanner;
    auto commit = client.Commit(
        [&client](spanner::Transaction txn) -> StatusOr<spanner::Mutations> {
          auto update = client.ExecuteDml(
              std::move(txn),
              spanner::SqlStatement(
                  "UPDATE Albums SET MarketingBudget = MarketingBudget * 2"
                  "  WHERE SingerId = 1 AND AlbumId = 1"));
          if (!update) return std::move(update).status();
          return spanner::Mutations{};
        },
        // Retry for up to 42 minutes.
        spanner::LimitedTimeTransactionRerunPolicy(std::chrono::minutes(42))
            .clone(),
        // After a failure backoff for 2 seconds (with jitter), then triple the
        // backoff time on each retry, up to 5 minutes.
        spanner::ExponentialBackoffPolicy(std::chrono::seconds(2),
                                          std::chrono::minutes(5), 3.0)
            .clone());
    if (!commit) throw std::move(commit).status();
    std::cout << "commit-with-policies was successful\n";
  }

See Also

  • google::cloud::spanner::TransactionRerunPolicy
  • google::cloud::spanner::LimitedErrorCountTransactionRerunPolicy
  • google::cloud::spanner::LimitedTimeTransactionRerunPolicy

More Information

See Also

  • google::cloud::Options
  • google::cloud::RetryPolicy
  • google::cloud::BackoffPolicy

Follow these links to find examples for other spanner *Client classes: