Transactions

Introduction

A transaction in Cloud Spanner is a set of reads and writes that execute atomically at a single logical point in time across columns, rows, and tables in a database.

Cloud Spanner supports two transaction modes:

  • Locking read-write. This type of transaction is the only transaction type that supports writing data into Cloud Spanner. These transactions rely on pessimistic locking and, if necessary, two-phase commit. Locking read-write transactions may abort, requiring the application to retry.

  • Read-only. This transaction type provides guaranteed consistency across several reads, but does not allow writes. Read-only transactions can be configured to read at timestamps in the past. Read-only transactions do not need to be committed and do not take locks.

Because read-only transactions are much faster than locking read-write transactions, we strongly recommend that you do all of your transactional reads in read-only transactions if possible, and use locking read-write transactions only in the scenarios described in the next section.

This page describes the general properties and semantics of transactions in Cloud Spanner and introduces the read-write and read-only transaction interfaces in Cloud Spanner.

Read-write transactions

Here are scenarios in which you should use a locking read-write transaction:

  • If you do a write that depends on the result of one or more reads, you should do that write and the read(s) in the same read-write transaction.
    • Example: double the balance of bank account A. The read of A's balance should be in the same transaction as the write to replace the balance with the doubled value.

  • If you do one or more writes that need to be committed atomically, you should do those writes in the same read-write transaction.
    • Example: transfer $200 from account A to account B. Both of the writes (one to decrease A by $200 and one to increase B by $200) and the reads of initial account balances should be in the same transaction.

  • If you might do one or more writes, depending on the results of one or more reads, you should do those writes and reads in the same read-write transaction, even if the write(s) don't end up executing.
    • Example: transfer $200 from bank account A to bank account B if A's current balance is greater than $500. Your transaction should contain a read of A's balance and a conditional statement that contains the writes.

Here is a scenario in which you should not use a locking read-write transaction:

  • If you are only doing reads, and you can express your read using a single read method, you should use that single read method or a read-only transaction. Single reads do not lock, unlike read-write transactions.

Properties

A read-write transaction in Cloud Spanner executes a set of reads and writes atomically at a single logical point in time. Furthermore, the timestamp at which a read-write transaction executes matches wall clock time, and the serialization order matches the timestamp order.

Why use a read-write transaction? Read-write transactions provide the ACID properties of relational databases. (In fact, Cloud Spanner read-write transactions offer even stronger guarantees than traditional ACID; see the Semantics section below.)

Isolation

Transactions that read and write

Here are the isolation properties you get for a read-write transaction that contains a series of reads and writes:

  • All reads within that transaction return data from the same timestamp.
  • If a transaction successfully commits, then no other writer modified the data that was read in the transaction after it was read.
  • These properties hold even for reads that returned no rows, and for the gaps between rows returned by range reads: row non-existence counts as data.
  • All writes within that transaction are committed at the same timestamp.
  • All writes within that transaction are only visible after the transaction commits.

The effect is that all reads and writes appear to have occurred at a single point in time, both from the perspective of the transaction itself and from the perspective of other readers and writers to the Cloud Spanner database. In other words, the reads and the writes end up occurring at the same timestamp (see an illustration of this in the Serializability section below).

Transactions that only read

The guarantees for a read-write transaction that only reads are similar: all reads within that transaction return data from the same timestamp, even for row non-existence. One difference is that if you read data and later commit the read-write transaction without any writes, there is no guarantee that the data did not change in the database after the read and before the commit. If you want to know whether data has changed since you read it last, the best approach is to read it again (either in a read-write transaction or using a strong read). Also, for efficiency, if you know in advance that you'll only be reading and not writing, you should use a read-only transaction instead of a read-write transaction.

Atomicity, Consistency, Durability

In addition to the Isolation property, Cloud Spanner provides Atomicity (if any of the writes in the transaction commit, they all commit), Consistency (the database remains in a consistent state after the transaction), and Durability (committed data stays committed).

Benefits of these properties

Because of these properties, as an application developer, you can focus on the correctness of each transaction on its own, without worrying about how to protect its execution from other transactions that might execute at the same time.

Interface

Cloud Spanner provides an interface for executing a body of work in the context of a read-write transaction, with retries for transaction aborts. Here's a bit of context to explain this point: a Cloud Spanner transaction may have to be tried multiple times before it commits. For example, if two transactions attempt to work on data at the same time in a way that might cause deadlock, Cloud Spanner aborts one of them so that the other transaction can make progress. (More rarely, transient events within Cloud Spanner may result in some transactions aborting.) Since transactions are atomic, an aborted transaction has no visible effect on the database. Therefore, transactions should be executed by retrying them until they succeed.
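The client libraries run this retry loop for you, but its shape can be sketched in plain Python (a minimal illustration, not Cloud Spanner client code; `Aborted`, `run_with_retries`, and `flaky_body` are hypothetical names):

```python
class Aborted(Exception):
    """Stand-in for the ABORTED error a Cloud Spanner commit can return."""

def run_with_retries(work, max_attempts=5):
    """Run the transaction body `work` until it commits or a
    non-retryable error is raised."""
    for _ in range(max_attempts):
        try:
            return work()   # the body: reads, writes, then commit
        except Aborted:
            continue        # an aborted transaction has no visible effect; retry
    raise RuntimeError("transaction aborted too many times")

# A body that aborts twice before committing, as might happen
# under lock contention.
attempts = {"n": 0}
def flaky_body():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise Aborted()
    return "committed"

print(run_with_retries(flaky_body))  # committed
```

Because an aborted attempt leaves no visible effect, the body must be safe to run more than once, which is why the interfaces below take the transaction body as a function object.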

When you use a transaction in the Cloud Spanner API, you define the body of a transaction (i.e., the reads and writes to perform on one or more tables in a database) in the form of a function object. Under the hood, Cloud Spanner runs the function repeatedly until the transaction commits or a non-retryable error is encountered.

Example

Assume you added a MarketingBudget column to the Albums table shown in the Schema and Data Model page:

CREATE TABLE Albums (
  SingerId        INT64 NOT NULL,
  AlbumId         INT64 NOT NULL,
  AlbumTitle      STRING(MAX),
  MarketingBudget INT64
) PRIMARY KEY (SingerId, AlbumId);
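If the column does not exist yet, it can be added with a schema update; a DDL statement along these lines (assuming the Albums table from the Schema and Data Model page already exists) would do it:

```sql
ALTER TABLE Albums ADD COLUMN MarketingBudget INT64;
```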

Your marketing department decides to do a marketing push for the album keyed by Albums (1, 1) and has asked you to move $200,000 from the budget of Albums (2, 2) to the budget of Albums (1, 1), but only if the budget of Albums (2, 2) is at least $300,000. You should use a locking read-write transaction for this operation because the transaction might do writes, depending on the result of a read.

The following shows how to execute a read-write transaction:

Go

func writeWithTransaction(ctx context.Context, w io.Writer, client *spanner.Client) error {
	_, err := client.ReadWriteTransaction(ctx, func(txn *spanner.ReadWriteTransaction) error {
		getBudget := func(ctx context.Context, txn *spanner.ReadWriteTransaction, key spanner.Key) (int64, error) {
			row, err := txn.ReadRow(ctx, "Albums", key, []string{"MarketingBudget"})
			if err != nil {
				return 0, err
			}
			var budget int64
			if err := row.Column(0, &budget); err != nil {
				return 0, err
			}
			return budget, nil
		}
		album2Budget, err := getBudget(ctx, txn, spanner.Key{2, 2})
		if err != nil {
			return err
		}
		if album2Budget >= 300000 {
			album1Budget, err := getBudget(ctx, txn, spanner.Key{1, 1})
			if err != nil {
				return err
			}
			const transfer = 200000
			album1Budget += transfer
			album2Budget -= transfer
			cols := []string{"SingerId", "AlbumId", "MarketingBudget"}
			if err := txn.BufferWrite([]*spanner.Mutation{
				spanner.Update("Albums", cols, []interface{}{1, 1, album1Budget}),
				spanner.Update("Albums", cols, []interface{}{2, 2, album2Budget}),
			}); err != nil {
				return err
			}
		}
		return nil
	})
	return err
}

Java

static void writeWithTransaction(DatabaseClient dbClient) {
  dbClient
      .readWriteTransaction()
      .run(
          new TransactionCallable<Void>() {
            @Override
            public Void run(TransactionContext transaction) throws Exception {
              // Transfer marketing budget from one album to another. We do it in a transaction to
              // ensure that the transfer is atomic.
              Struct row =
                  transaction.readRow("Albums", Key.of(2, 2), Arrays.asList("MarketingBudget"));
              long album2Budget = row.getLong(0);
              // Transaction will only be committed if this condition still holds at the time of
              // commit. Otherwise it will be aborted and the callable will be rerun by the
              // client library.
              if (album2Budget >= 300000) {
                long album1Budget =
                    transaction
                        .readRow("Albums", Key.of(1, 1), Arrays.asList("MarketingBudget"))
                        .getLong(0);
                long transfer = 200000;
                album1Budget += transfer;
                album2Budget -= transfer;
                transaction.buffer(
                    Mutation.newUpdateBuilder("Albums")
                        .set("SingerId")
                        .to(1)
                        .set("AlbumId")
                        .to(1)
                        .set("MarketingBudget")
                        .to(album1Budget)
                        .build());
                transaction.buffer(
                    Mutation.newUpdateBuilder("Albums")
                        .set("SingerId")
                        .to(2)
                        .set("AlbumId")
                        .to(2)
                        .set("MarketingBudget")
                        .to(album2Budget)
                        .build());
              }
              return null;
            }
          });
}

Node.js

// This sample transfers 200,000 from the MarketingBudget field
// of the second Album to the first Album. Make sure to run the
// addColumn and updateData samples first (in that order).

// Imports the Google Cloud client library
const Spanner = require('@google-cloud/spanner');

// Instantiates a client
const spanner = Spanner();

// Uncomment these lines to specify the instance and database to use
// const instanceId = 'my-instance';
// const databaseId = 'my-database';

// Gets a reference to a Cloud Spanner instance and database
const instance = spanner.instance(instanceId);
const database = instance.database(databaseId);

// Gets a transaction object that captures the database state
// at a specific point in time
let transaction, firstBudget, secondBudget;
const transferAmount = 200000;
const minimumAmountToTransfer = 300000;

database.runTransaction()
  .then((results) => {
    transaction = results[0];

    const queryOne = {
      columns: ['MarketingBudget'],
      keys: [2, 2] // SingerId: 2, AlbumId: 2
    };

    const queryTwo = {
      columns: ['MarketingBudget'],
      keys: [1, 1] // SingerId: 1, AlbumId: 1
    };

    return Promise.all([
      // Reads the second album's budget
      transaction.read('Albums', queryOne).then((results) => {
        // Gets second album's budget
        // Note: MarketingBudget is an INT64, which comes from Cloud Spanner
        // as a string - so we convert it to a number with parseInt()
        const rows = results[0].map((row) => row.toJSON());
        secondBudget = parseInt(rows[0].MarketingBudget.value);
        console.log(`The second album's marketing budget: ${secondBudget}`);

        // Makes sure the second album's budget is sufficient
        if (secondBudget < minimumAmountToTransfer) {
          throw new Error(`The second album's budget (${secondBudget}) is less than the minimum required amount to transfer.`);
        }
      }),

      // Reads the first album's budget
      transaction.read('Albums', queryTwo).then((results) => {
        // Gets first album's budget
        // As above, MarketingBudget is an INT64 and comes as a string
        const rows = results[0].map((row) => row.toJSON());
        firstBudget = parseInt(rows[0].MarketingBudget.value);
        console.log(`The first album's marketing budget: ${firstBudget}`);
      })
    ]);
  })
  .then(() => {
    // Transfer the budgets between the albums
    firstBudget += transferAmount;
    secondBudget -= transferAmount;

    // Update the database
    // Note: Cloud Spanner interprets Node.js numbers as FLOAT64s, so they
    // must be converted (back) to strings before being inserted as INT64s.
    transaction.update('Albums', [
      { SingerId: '1', AlbumId: '1', MarketingBudget: firstBudget.toString() },
      { SingerId: '2', AlbumId: '2', MarketingBudget: secondBudget.toString() }
    ]);
  })
  // Commits the transaction and sends the changes to the database
  .then(() => transaction.commit())
  .then(() => {
    // Logs success
    console.log(`Successfully executed read-write transaction to transfer ${transferAmount} from Album 2 to Album 1.`);
  })
  .catch((err) => {
    console.error('ERROR:', err);
  });

Python

def read_write_transaction(instance_id, database_id):
    """Performs a read-write transaction to update two sample records in the
    database.

    This will transfer 200,000 from the `MarketingBudget` field for the second
    Album to the first Album. If the `MarketingBudget` is too low, it will
    raise an exception.

    Before running this sample, you will need to run the `update_data` sample
    to populate the fields.
    """
    spanner_client = spanner.Client()
    instance = spanner_client.instance(instance_id)
    database = instance.database(database_id)

    def update_albums(transaction):
        # Read the second album budget.
        second_album_keyset = spanner.KeySet(keys=[(2, 2)])
        second_album_result = transaction.read(
            table='Albums', columns=('MarketingBudget',),
            keyset=second_album_keyset, limit=1)
        second_album_row = list(second_album_result)[0]
        second_album_budget = second_album_row[0]

        transfer_amount = 200000

        if second_album_budget < 300000:
            # Raising an exception will automatically roll back the
            # transaction.
            raise ValueError(
                'The second album doesn\'t have enough funds to transfer')

        # Read the first album's budget.
        first_album_keyset = spanner.KeySet(keys=[(1, 1)])
        first_album_result = transaction.read(
            table='Albums', columns=('MarketingBudget',),
            keyset=first_album_keyset, limit=1)
        first_album_row = list(first_album_result)[0]
        first_album_budget = first_album_row[0]

        # Update the budgets.
        second_album_budget -= transfer_amount
        first_album_budget += transfer_amount
        print(
            'Setting first album\'s budget to {} and the second album\'s '
            'budget to {}.'.format(
                first_album_budget, second_album_budget))

        # Update the rows.
        transaction.update(
            table='Albums',
            columns=(
                'SingerId', 'AlbumId', 'MarketingBudget'),
            values=[
                (1, 1, first_album_budget),
                (2, 2, second_album_budget)])

    database.run_in_transaction(update_albums)

    print('Transaction complete.')


Semantics

Serializability

Cloud Spanner provides 'serializability', which means that all transactions appear as if they executed in a serial order, even if some of the reads, writes, and other operations of distinct transactions actually occurred in parallel. Cloud Spanner assigns commit timestamps that reflect the order of committed transactions to implement this property. In fact, Cloud Spanner offers stronger guarantees than serializability: transactions commit in an order that is reflected in their commit timestamps, and these commit timestamps are "real time", so you can compare them to your watch. Reads in a transaction see everything that has been committed before the transaction commits, and writes are seen by everything that starts after the transaction is committed.

For example, consider the execution of two transactions as illustrated in the diagram below:

[Figure: timeline showing the execution of two transactions that read the same data]

Transaction Txn1 in blue reads some data A, buffers a write to A, then successfully commits. Transaction Txn2 in green starts after Txn1, reads some data B, then reads the data A. Since Txn2 reads the value of A after Txn1 committed its write to A, Txn2 sees the effect of Txn1's write to A, even though Txn2 started before Txn1 completed.

Even though there is some overlap in time in which Txn1 and Txn2 are both executing, their commit timestamps c1 and c2 respect a linear transaction order, which means that all effects of the reads and writes of Txn1 appear to have occurred at a single point of time (c1), and all effects of the reads and writes of Txn2 appear to have occurred at a single point of time (c2). Furthermore, c1 < c2 (which is guaranteed because both Txn1 and Txn2 committed writes; this is true even if the writes happened on different machines), which respects the order of Txn1 happening before Txn2. (However, if Txn2 only did reads in the transaction, then c1 <= c2).

Reads observe a prefix of the commit history; if a read sees the effect of Txn2, it also sees the effect of Txn1. All transactions that commit successfully have this property.
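The commit-timestamp ordering described above can be modeled with a toy multi-versioned store (a sketch only; an integer logical clock stands in for Cloud Spanner's real-time timestamps, and all names here are hypothetical):

```python
class VersionedStore:
    """Toy multi-versioned store: each key maps to a list of
    (commit_ts, value) pairs in commit order."""

    def __init__(self):
        self.history = {}
        self.clock = 0

    def commit(self, writes):
        """Apply a set of writes atomically at a single new commit timestamp."""
        self.clock += 1
        ts = self.clock
        for key, value in writes.items():
            self.history.setdefault(key, []).append((ts, value))
        return ts

    def read(self, key, at_ts):
        """Return the latest value of `key` committed at or before `at_ts`,
        i.e. a read observes a prefix of the commit history."""
        versions = [(ts, v) for ts, v in self.history.get(key, []) if ts <= at_ts]
        return versions[-1][1] if versions else None

store = VersionedStore()
c1 = store.commit({"A": "a1"})            # Txn1 writes A and commits at c1
c2 = store.commit({"B": "b1"})            # Txn2 commits at c2, after Txn1
assert c1 < c2                            # commit order is reflected in timestamps
assert store.read("A", at_ts=c2) == "a1"  # a read at c2 sees everything up to c1
```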

Read and write guarantees

If a call to run a transaction fails, the read and write guarantees you have depend on what error the underlying commit call failed with.

For example, an error of "Row Not Found", "Row Already Exists", etc. means that writing the buffered mutations encountered some error, e.g. a row that the client is trying to update doesn't exist. In that case, the reads are guaranteed consistent, the writes are not applied, and the non-existence of the row is guaranteed to be consistent with the reads as well.

Cancelling transaction operations

Asynchronous read operations may be cancelled any time by the user (e.g., when a higher level operation is cancelled or you decide to stop a read based on the initial results received from the read) without affecting any other existing operations within the transaction.

However, even if you have attempted to cancel the read, Cloud Spanner does not guarantee that the read is actually cancelled. After you request the cancellation of a read, that read can still successfully complete or fail for some other reason (e.g., being aborted). Furthermore, that cancelled read might actually return some results to you, and those possibly incomplete results will be validated as part of the transaction Commit.

Note that unlike reads, cancelling a transaction Commit operation aborts the transaction (unless the transaction has already committed or failed for another reason).

Performance

Locking

Cloud Spanner allows multiple clients to simultaneously interact with the same database. In order to ensure the consistency of multiple concurrent transactions, Cloud Spanner uses a combination of shared locks and exclusive locks to control access to the data. When you perform a read as part of a transaction, Cloud Spanner acquires shared read locks, which allow other reads to still access the data until your transaction is ready to commit. When your transaction is committing and writes are being applied, the transaction attempts to upgrade to an exclusive lock: it blocks new shared read locks on the data, waits for existing shared read locks to clear, then places an exclusive lock for exclusive access to the data.

Notes about locks:

  • Locks are taken at the granularity of row-and-column. If transaction T1 has locked column "A" of row "foo", and transaction T2 wants to write column "B" of row "foo" then there is no conflict.
  • Writes to a data item that don't also read the data being written (aka "blind writes") don't conflict with other blind writers of the same item (the commit timestamp of each write determines the order in which it is applied to the database). A consequence of this is that Cloud Spanner only needs to upgrade to an exclusive lock if you have read the data you are writing. Otherwise Cloud Spanner uses a shared lock called a writer shared lock.
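The conflict rules in these notes can be summarized in a small compatibility check (a simplified model following the text above; the mode names and functions are hypothetical, not an actual Cloud Spanner API):

```python
# Simplified lock-compatibility model at row-and-column granularity.
# Modes: "shared" (reads), "writer_shared" (blind writes), and
# "exclusive" (writes of data the transaction also read).

def compatible(held, requested):
    """True if a lock in mode `requested` can be granted on the same
    row-and-column cell while a lock in mode `held` is in place."""
    if held == "exclusive" or requested == "exclusive":
        return False    # exclusive conflicts with everything
    if held == requested:
        return True     # readers coexist; blind writers coexist
    return False        # readers conflict with blind writers

def conflicts(lock1, lock2):
    """Locks are per row-and-column, so different cells never conflict."""
    (row1, col1, mode1), (row2, col2, mode2) = lock1, lock2
    return (row1, col1) == (row2, col2) and not compatible(mode1, mode2)

# Different columns of the same row: no conflict, even for exclusive locks.
assert not conflicts(("foo", "A", "exclusive"), ("foo", "B", "exclusive"))
# Two readers of the same cell, or two blind writers of it: no conflict.
assert not conflicts(("foo", "A", "shared"), ("foo", "A", "shared"))
assert not conflicts(("foo", "A", "writer_shared"), ("foo", "A", "writer_shared"))
# A reader and an exclusive writer of the same cell: conflict.
assert conflicts(("foo", "A", "shared"), ("foo", "A", "exclusive"))
```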

Deadlock detection

Cloud Spanner detects when multiple transactions might be deadlocked, and forces all but one of the transactions to abort. For example, consider the following scenario: transaction Txn1 holds a lock on record A and is waiting for a lock on record B, and Txn2 holds a lock on record B and is waiting for a lock on record A. The only way to make progress in this situation is to abort one of the transactions so it releases its lock, allowing the other transaction to proceed.

Cloud Spanner uses the standard "wound-wait" algorithm to handle deadlock detection: under the hood Cloud Spanner keeps track of the age of each transaction that requests conflicting locks, and allows older transactions to abort younger transactions (where "older" means the transaction started sooner). By giving priority to older transactions, Cloud Spanner ensures that eventually every transaction has a chance to acquire locks by virtue of it becoming old enough to have higher priority than other transactions. For example, a transaction holding a reader shared lock can be aborted by an older transaction wanting a writer shared lock.
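The wound-wait policy reduces to a simple rule over transaction start times (an illustration of the policy only, not internal Cloud Spanner code):

```python
def resolve_conflict(requester_start, holder_start):
    """Wound-wait: when a requester wants a lock held by another
    transaction, an older requester wounds (aborts) the younger
    holder, while a younger requester waits for the older holder.
    Smaller start time means older."""
    if requester_start < holder_start:
        return "abort holder"
    return "requester waits"

# An old transaction (started at 1) aborts a younger lock holder (started at 5).
assert resolve_conflict(requester_start=1, holder_start=5) == "abort holder"
# A young transaction (started at 5) waits for an older lock holder (started at 1).
assert resolve_conflict(requester_start=5, holder_start=1) == "requester waits"
```

Because a transaction's start time never changes, every transaction eventually becomes the oldest waiter and cannot be wounded, which is why every transaction eventually gets its locks.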

Distributed transactions

Cloud Spanner is powerful enough to do distributed transactions, which are transactions that touch many parts of the database, even if they are on different servers. Fundamentally this power comes at a performance cost compared to single-site transactions.

What are some high-level guidelines about when a transaction might be distributed? Under the hood, Cloud Spanner may divide responsibility for rows in the database across many servers. A row and the corresponding rows in interleaved tables are usually served by the same server, as are two rows in the same table with nearby keys. Cloud Spanner can perform transactions across rows on different servers; however, as a rule of thumb, transactions that affect many co-located rows are faster and cheaper than transactions that affect many rows scattered throughout the database, or throughout a large table.
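For example, interleaving is how you keep related rows co-located; a hypothetical Songs table interleaved in Albums stores each album's songs physically with the album row, so a transaction over an album and its songs usually stays on one server:

```sql
CREATE TABLE Songs (
  SingerId  INT64 NOT NULL,
  AlbumId   INT64 NOT NULL,
  TrackId   INT64 NOT NULL,
  SongName  STRING(MAX)
) PRIMARY KEY (SingerId, AlbumId, TrackId),
  INTERLEAVE IN PARENT Albums ON DELETE CASCADE;
```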

Here is another performance optimization to consider: make sure that each transaction contains only the reads and writes that need to be applied atomically. Reads and writes that don't need that atomicity are better done with a single read call, a standalone write, or a separate transaction.

Read-only transactions

In addition to locking read-write transactions, Cloud Spanner offers read-only transactions.

Use a read-only transaction when you need to execute more than one read at the same timestamp. If you can express your read using one of Cloud Spanner's single read methods, you should use that single read method instead. The performance of using such a single read call should be comparable to the performance of a single read done in a read-only transaction.

Because read-only transactions don't write, they don't hold locks and they don't block other transactions. Read-only transactions observe a consistent prefix of the transaction commit history, so your application always gets consistent data.

Properties

A Cloud Spanner read-only transaction executes a set of reads at a single logical point in time, both from the perspective of the read-only transaction itself and from the perspective of other readers and writers to the Cloud Spanner database. This means that read-only transactions always observe a consistent state of the database at a chosen point in the transaction history.

Interface

Cloud Spanner provides an interface for executing a body of work in the context of a read-only transaction, with retries for transaction aborts.

Example

The following shows how to use a read-only transaction to get consistent data for two reads at the same timestamp:

Go

func readOnlyTransaction(ctx context.Context, w io.Writer, client *spanner.Client) error {
	ro := client.ReadOnlyTransaction()
	defer ro.Close()
	stmt := spanner.Statement{SQL: `SELECT SingerId, AlbumId, AlbumTitle FROM Albums`}
	iter := ro.Query(ctx, stmt)
	defer iter.Stop()
	for {
		row, err := iter.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return err
		}
		var singerID int64
		var albumID int64
		var albumTitle string
		if err := row.Columns(&singerID, &albumID, &albumTitle); err != nil {
			return err
		}
		fmt.Fprintf(w, "%d %d %s\n", singerID, albumID, albumTitle)
	}

	iter = ro.Read(ctx, "Albums", spanner.AllKeys(), []string{"SingerId", "AlbumId", "AlbumTitle"})
	defer iter.Stop()
	for {
		row, err := iter.Next()
		if err == iterator.Done {
			return nil
		}
		if err != nil {
			return err
		}
		var singerID int64
		var albumID int64
		var albumTitle string
		if err := row.Columns(&singerID, &albumID, &albumTitle); err != nil {
			return err
		}
		fmt.Fprintf(w, "%d %d %s\n", singerID, albumID, albumTitle)
	}
}

Java

static void readOnlyTransaction(DatabaseClient dbClient) {
  // ReadOnlyTransaction must be closed by calling close() on it to release resources held by it.
  // We use a try-with-resource block to automatically do so.
  try (ReadOnlyTransaction transaction = dbClient.readOnlyTransaction()) {
    ResultSet queryResultSet =
        transaction.executeQuery(
            Statement.of("SELECT SingerId, AlbumId, AlbumTitle FROM Albums"));
    while (queryResultSet.next()) {
      System.out.printf(
          "%d %d %s\n",
          queryResultSet.getLong(0), queryResultSet.getLong(1), queryResultSet.getString(2));
    }
    ResultSet readResultSet =
        transaction.read(
            "Albums", KeySet.all(), Arrays.asList("SingerId", "AlbumId", "AlbumTitle"));
    while (readResultSet.next()) {
      System.out.printf(
          "%d %d %s\n",
          readResultSet.getLong(0), readResultSet.getLong(1), readResultSet.getString(2));
    }
  }
}

Node.js

// Imports the Google Cloud client library
const Spanner = require('@google-cloud/spanner');

// Instantiates a client
const spanner = Spanner();

// Uncomment these lines to specify the instance and database to use
// const instanceId = 'my-instance';
// const databaseId = 'my-database';

// Gets a reference to a Cloud Spanner instance and database
const instance = spanner.instance(instanceId);
const database = instance.database(databaseId);

// Gets a transaction object that captures the database state
// at a specific point in time
database.runTransaction()
  .then((results) => {
    const transaction = results[0];

    const queryOne = 'SELECT SingerId, AlbumId, AlbumTitle FROM Albums';

    // Read #1, using SQL
    const readOne = transaction.run(queryOne)
      .then((results) => {
        const rows = results[0];

        rows.forEach((row) => {
          const json = row.toJSON();
          console.log(`SingerId: ${json.SingerId.value}, AlbumId: ${json.AlbumId.value}, AlbumTitle: ${json.AlbumTitle}`);
        });
      });

    const queryTwo = {
      columns: ['SingerId', 'AlbumId', 'AlbumTitle'],
      keySet: {
        all: true
      }
    };

    // Read #2, using the `read` method. Even if changes occur
    // in-between the reads, the transaction ensures that both
    // return the same data.
    const readTwo = transaction.read('Albums', queryTwo)
      .then((results) => {
        const rows = results[0];

        rows.forEach((row) => {
          const json = row.toJSON();
          console.log(`SingerId: ${json.SingerId.value}, AlbumId: ${json.AlbumId.value}, AlbumTitle: ${json.AlbumTitle}`);
        });
      });

    // Waits for both reads to complete before logging success
    return Promise.all([readOne, readTwo]);
  })
  .then(() => {
    console.log('Successfully executed read-only transaction.');
  });

Python

def read_only_transaction(instance_id, database_id):
    """Reads data inside of a read-only transaction.

    Within the read-only transaction, or "snapshot", the application sees a
    consistent view of the database at a particular timestamp.
    """
    spanner_client = spanner.Client()
    instance = spanner_client.instance(instance_id)
    database = instance.database(database_id)

    with database.snapshot() as snapshot:
        # Read using SQL.
        results = snapshot.execute_sql(
            'SELECT SingerId, AlbumId, AlbumTitle FROM Albums')

        print('Results from first read:')
        for row in results:
            print(u'SingerId: {}, AlbumId: {}, AlbumTitle: {}'.format(*row))

        # Perform another read using the `read` method. Even if the data
        # is updated in-between the reads, the snapshot ensures that both
        # return the same data.
        keyset = spanner.KeySet(all_=True)
        results = snapshot.read(
            table='Albums',
            columns=('SingerId', 'AlbumId', 'AlbumTitle',),
            keyset=keyset,)

        print('Results from second read:')
        for row in results:
            print(u'SingerId: {}, AlbumId: {}, AlbumTitle: {}'.format(*row))

