Primary key default values management

This page discusses strategies to use to generate primary key values in your table using default value expressions. Information on this page applies to both GoogleSQL-dialect databases and PostgreSQL-dialect databases. These strategies have the following benefits:

  • Prevent hotspotting
  • Simplify migrations from other databases
  • Encapsulate the key logic in the database so that you don't need to worry about managing it in your application

You can use the following strategies in a column that hasDEFAULT expressions:

  • A UUID function that generates UUID Version 4 values.
  • A schema object, SEQUENCE, that has a bit_reversed_positive option. SEQUENCE is available for both GoogleSQL and PostgreSQL.

Methods to automatically generate primary keys

This section describes how to automatically generate UUID and bit-reversed sequences to use as primary keys values.

Universally Unique Identifier (UUID)

Spanner can automatically generate a UUID Version 4 string to use as a primary key. UUIDs work well for new applications and tables with many rows. They are roughly uniformly distributed across the key space which prevents hotspotting at scale. UUID generation can create a large number of values (2122) and each value is effectively unique. For example, you would need 2.71×1018 values for a 50% probability of collision, or 1 billion per second for 86 years. This ensures unique values when you use it in large tables. UUIDs are unique whether you generate them in the database or the client. We recommend that you use UUIDs when possible. You can safely mix client- and Spanner-generated UUIDs in the same table if the client-generated UUIDs are serialized as lower case, in accordance with RFC 4122.

For a column that needs default values, you can use the GENERATE_UUID function to generate them. The following example shows how to create a table where the FanId key column has GENERATE_UUID in the value column as its default value. The example uses 36 characters for the GoogleSQL STRING and PostgreSQL varchar attributes because UUIDs have 36 characters. When you use the INSERT with THEN RETURN statement to insert into the Fans table, GENERATE_UUID generates and returns a UUID value for FanId.

GoogleSQL

CREATE TABLE Fans (
  FanId STRING(36) DEFAULT (GENERATE_UUID()),
  Name STRING(MAX),
) PRIMARY KEY (FanId);

PostgreSQL

CREATE TABLE Fans (
  FanId varchar(36) DEFAULT spanner.generate_uuid(),
  Name text,
  PRIMARY KEY (FanId)
);

GoogleSQL

INSERT INTO Fans (Name) VALUES ('Melissa Garcia')
THEN RETURN FanId;

PostgreSQL

INSERT INTO fans (name) VALUES ('Melissa Garcia')
RETURNING (fanid);

This statement returns a result similar to the following:

FanId
6af91072-f009-4c15-8c42-ebe38ae83751

For more information about the GENERATE_UUID() function, see the GoogleSQL or PostgreSQL reference page.

IDENTITY columns

With IDENTITY columns, you can automatically generate integer values for key and non-key columns. IDENTITY columns don't require users to manually maintain an underlying sequence, or manage the relationship between the column and the underlying sequence. When an auto-generated identity column is dropped, the underlying sequence is automatically deleted as well.

You can use IDENTITY columns by either providing a starting integer value when generating the sequence, or letting Spanner generate the integer sequence for you. To provide a starting integer value, you must use the START COUNTER WITH option and use a positive INT64 starting value. Spanner uses this value to set the next value for its auto-generated internal sequence counter and bit-reverses the value before inserting it into the this column.

In Spanner, IDENTITY columns are supported in both GoogleSQL and PostgreSQL.

GoogleSQL

The following example shows how to use IDENTITY columns to create an auto-generated integer primary key column for SingerId when creating a new table using the CREATE TABLE command:

CREATE TABLE Singers (
  SingerId INT64 GENERATED BY DEFAULT AS IDENTITY (BIT_REVERSED_POSITIVE),
  Name STRING(MAX),
  Rank INT64
) PRIMARY KEY (SingerId);

You can also specify the counter start for the column using the START_WITH_COUNTER option. In the following example, an auto-generated integer column is created for SingerId that has bit-reversed positive values and an internal counter that starts at 1,000.

CREATE TABLE Singers (
  SingerId INT64 GENERATED BY DEFAULT AS IDENTITY (BIT_REVERSED_POSITIVE START COUNTER WITH 1000),
  Name STRING(MAX),
  Rank INT64
) PRIMARY KEY (SingerId);

PostgreSQL

The following example shows how to use IDENTITY columns to create an auto-generated integer column for SingerId when creating a new table using the CREATE TABLE command:

CREATE TABLE Singers (
  SingerId bigint GENERATED BY DEFAULT AS IDENTITY (BIT_REVERSED_POSITIVE),
  Name text,
  PRIMARY KEY (SingerId)
);

You can also specify the counter start for the column using the START COUNTER WITH option. In the following example, an auto-generated integer column is created for SingerId which generates bit-reversed positive values and the internal counter, before bit-reversing, starts from 1,000.

CREATE TABLE Singers (
  SingerId bigint GENERATED BY DEFAULT AS IDENTITY (BIT_REVERSED_POSITIVE START COUNTER WITH 1000),
  Name text,
  PRIMARY KEY (SingerId)
);

Bit-reversed sequence

A bit-reversed sequence is a schema object that produces a sequence of integers and bit-reverses them. This object uses bit reversal on a private, internal Spanner counter to ensure uniqueness. The resulting bit-reversed values helps avoid hotspotting at scale when used in a primary key.

In Spanner, you use SEQUENCE DDL statements along with the bit_reversed_positive attribute to create, alter, or drop a sequence that produces bit-reversed positive values (GoogleSQL or PostgreSQL).

Each sequence maintains a set of internal counters and uses them to generate a value. The sequence counter provides the input to the bit-reversing algorithm.

When you define a column with a DEFAULT expression that uses the GoogleSQL GET-NEXT-SEQUENCE-VALUE or the PostgreSQL nextval function as its default value, Spanner automatically calls the function and puts the bit-reversed output values into the column. Bit-reversed sequences are especially useful for primary keys, because bit-reversed values are evenly distributed across the key space so that they don't cause hotspotting.

The following example shows how to create a bit-reversed sequence and a table where its key column uses the sequence as the default value:

GoogleSQL

CREATE SEQUENCE SingerIdSequence OPTIONS (
  sequence_kind="bit_reversed_positive"
);

CREATE TABLE Singers (
  SingerId INT64 DEFAULT (GET_NEXT_SEQUENCE_VALUE(SEQUENCE SingerIdSequence)),
  Name STRING(MAX),
  Rank INT64,
) PRIMARY KEY (SingerId);

PostgreSQL

CREATE SEQUENCE SingerIdSequence bit_reversed_positive;

CREATE TABLE Singers (
  SingerId bigint DEFAULT nextval('SingerIdSequence'),
  Name text,
  PRIMARY KEY (SingerId)
);

You can then use the following SQL statement to insert and return the primary key value:

GoogleSQL

INSERT INTO Singers (Name) VALUES ('Melissa Garcia')
THEN RETURN SingerId;

PostgreSQL

INSERT INTO Singers (name) VALUES ('Melissa Garcia')
RETURNING (SingerId);

This statement returns a result similar to the following:

SingerId
3458764513820540928

Scenarios for using UUIDs and sequences as default values for primary keys

The scenarios for UUIDs and sequences include the following:

  • New applications
  • Migrations

The following sections describe each scenario.

New applications

If your existing application requires INT64 keys in GoogleSQL, or bigint keys in PostgreSQL, Spanner offers the bit-reversed positive sequence schema object (PostgreSQL or GoogleSQL). Otherwise, for new applications, we recommend that you use Universally Unique Identifier (UUID). For more information, see Use a Universally Unique Identifier (UUID).

Migrations

For migrations of tables to Spanner, you have a few options:

  • If you are using UUIDs in your source database, on Spanner, you can use a key column in the STRING type and the GENERATE_UUID() function (GoogleSQL or PostgreSQL) as its default value.
  • If you are using an integer primary key, and your application only needs the key to be unique, you can use a key column in INT64 and use a bit-reversed positive sequence for the default value for the primary key. See Migrating bit-reversed key columns.
  • Spanner doesn't support a way to generate monotonic values.

    If you're using a monotonic key, such as the PostgreSQL SERIAL type, or MySQL AUTO_INCREMENT attribute, and you need new monotonic keys on Spanner, you can use a composite key. See Swap the order of keys and Hash the unique key and spread the writes across logical shards.

  • If your application is manually bit-reversing your INT64 key in GoogleSQL or bigint key in PostgreSQL, you can use a bit-reversed positive sequence (GoogleSQL or PostgreSQL) and have it generate new key values for you. See Migrating bit-reversed key columns.

What's next