This page describes foreign keys in Spanner, and how you can use them to enforce referential integrity in your database.
Foreign keys define relationships between tables. Spanner ensures that the data integrity of these relationships is maintained.
Imagine you're a lead developer for an ecommerce business. You are designing a database to process customer orders. The database must store information about each order, customer, and product. Figure 1 illustrates the basic database structure for the application.
Figure 1. Diagram of an order processing database
You define a Customers
table to store customer information, an Orders
table
to track all orders, and a Products
table to store information about each
product.
Figure 1 also shows links between the tables that map to the following real-world relationships:
A customer places an order.
An order is placed for a product.
You decide that your database enforces the following rules to ensure that orders in your system are valid.
You can't create an order for a customer that doesn't exist.
A customer can't place an order for a product you don't carry.
When you enforce these rules, or constraints, you're maintaining the referential integrity of your data. When a database maintains referential integrity, all attempts to add invalid data, which would result in invalid links or references between data, fail. Referential integrity prevents user errors. Spanner enforces referential integrity through foreign keys.
Enforce referential integrity with foreign keys
The following examines the order processing example again, with more detail added to the design, as shown in Figure 2.
Figure 2. Diagram of a database schema with foreign keys
The design now shows column names and types in each table. The Orders
table
also defines two foreign key relationships. FK_CustomerOrder
ensures that all
rows in Orders
have a valid CustomerID
. The FK_ProductOrder
foreign key
ensures that all ProductID
values in the Orders
table are valid. The
following table maps these constraints back to the real-world rules that you
want to enforce.
Foreign Key Name | Constraint | Real-world description |
---|---|---|
FK_CustomerOrder | Ensures that all rows in Orders have a valid
CustomerID |
A valid customer places an order |
FK_ProductOrder | Ensures that all rows in Orders have a valid
ProductID |
An order was placed for a valid product |
Spanner fails any transaction that attempts to insert or update a
row in the Orders
table that has a CustomerID
or ProductID
not found in
the Customers
and Products
tables. It also fails transactions that attempt
to update or delete rows in the Customers
and Products
tables that would
invalidate the IDs in the Orders
table. For more details about how
Spanner validates constraints, refer to the Transaction constraint
validation section.
Foreign key characteristics
The following is a list of characteristics of foreign keys in Spanner.
The table that defines the foreign key is the referencing table, and the foreign key columns are the referencing columns.
The foreign key references the referenced columns of the referenced table.
As in the example, you can name each foreign key constraint. If you don't specify a name, Spanner generates a name for you. You can query the generated name from Spanner's
INFORMATION_SCHEMA
. Constraint names are scoped to the schema, along with the names for tables and indexes, and must be unique within the schema.The number of referencing and referenced columns must be the same. Order is important. The first referencing column refers to the first referenced column, the second to the second, etc.
A referencing column and its referenced counterpart must be the same type. You must be able to index the columns.
You can't create foreign keys on columns with the
allow_commit_timestamp=true
option.Array columns are not supported.
JSON columns are not supported.
A foreign key can reference columns of the same table (a self-referencing foreign key). An example is an
Employee
table with aManagerId
column that references the table'sEmployeeId
column.Foreign keys can also form circular relationships between tables where two tables reference each other, either directly or indirectly. The referenced table must exist before creating a foreign key. This means that at least one of the foreign keys must be added using the
ALTER TABLE
statement.The referenced keys must be unique. Spanner uses the
PRIMARY KEY
of the referenced table if the foreign key's referenced columns match the referenced table's primary key columns. If Spanner can't use the referenced table's primary key, it creates aUNIQUE NULL_FILTERED INDEX
over the referenced columns.Spanner can also use the primary key of the referencing table, although this is less common. If not, Spanner creates a
NULL_FILTERED INDEX
over the referencing columns.Foreign keys don't use secondary indexes that you have created. Instead, they create their own backing indexes. Backing indexes are usable in query evaluations, including in explicit
force_index
directives. You can query the names of the backing indexes from Spanner'sINFORMATION_SCHEMA
. For more information, see Backing indexes.
Define foreign keys
You create and remove foreign keys from your Spanner database
using DDL. You add foreign keys to a new table with the CREATE TABLE
statement. Similarly, you add a foreign key to, or remove a foreign key from, an
existing table with the ALTER TABLE
statement. The following is an
example of creating a new table with a foreign key.
GoogleSQL
CREATE TABLE Orders (
OrderID INT64 NOT NULL,
CustomerID INT64 NOT NULL,
Quantity INT64 NOT NULL,
ProductID INT64 NOT NULL,
CONSTRAINT FK_CustomerOrder FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)
) PRIMARY KEY (OrderID);
PostgreSQL
CREATE TABLE Orders (
OrderID BIGINT NOT NULL,
CustomerID BIGINT NOT NULL,
Quantity BIGINT NOT NULL,
ProductID BIGINT NOT NULL,
CONSTRAINT FK_CustomerOrder FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID),
PRIMARY KEY (OrderID)
);
For more examples of how to create and manage foreign keys, see Create and manage foreign key relationships.
Foreign key actions
Foreign key actions control what happens to the constrained column when the
column it references is deleted or updated. Spanner supports the
use of the ON DELETE CASCADE
action. With the foreign key ON DELETE CASCADE
action, when you delete a row that contains a referenced foreign key, all rows
that reference that key are also deleted in the same transaction.
You can add a foreign key with an action when you create your database using
DDL. Use the CREATE TABLE
statement to add foreign keys with an action to
a new table. Similarly, you can use the ALTER TABLE
statement to add a
foreign key action to an existing table or to remove a foreign key action. The
following is an example of how to create a new table with a foreign key action.
GoogleSQL
CREATE TABLE Customers (
CustomerId INT64 NOT NULL,
CustomerName STRING(MAX) NOT NULL,
) PRIMARY KEY(CustomerId);
CREATE TABLE ShoppingCarts (
CartId INT64 NOT NULL,
CustomerId INT64 NOT NULL,
CustomerName STRING(MAX) NOT NULL,
CONSTRAINT FKShoppingCartsCustomers FOREIGN KEY(CustomerId, CustomerName)
REFERENCES Customers(CustomerId, CustomerName) ON DELETE CASCADE,
) PRIMARY KEY(CartId);
PostgreSQL
CREATE TABLE Customers (
CustomerId bigint NOT NULL,
CustomerName character varying(1024) NOT NULL,
PRIMARY KEY(CustomerId)
);
CREATE TABLE ShoppingCarts (
CartId bigint NOT NULL,
CustomerId bigint NOT NULL,
CustomerName character varying(1024) NOT NULL,
PRIMARY KEY(CartId),
CONSTRAINT fkshoppingcartscustomers FOREIGN KEY (CustomerId, CustomerName)
REFERENCES Customers(CustomerId, CustomerName) ON DELETE CASCADE
);
The following is a list of characteristics of foreign key actions in Spanner.
Foreign key actions are either
ON DELETE CASCADE
orON DELETE NO ACTION
.You can query the
INFORMATION_SCHEMA
to find foreign key constraints that have an action.Adding a foreign key action on an existing foreign key constraint isn't supported. You must add a new foreign key constraint with an action.
Long-running schema changes
Adding a foreign key to an existing table, or creating a new table with a foreign key, can lead to long-running operations. In the case of a new table, the table isn't writable until the long-running operation is complete.
For a new table with a foreign key, Spanner needs to backfill the referenced indexes as needed for each foreign key.
For an existing table with a foreign key, Spanner needs to backfill the referencing and referenced indexes as needed. In addition, Spanner validates existing data in the table to ensure that it complies with the foreign key's referential integrity constraint. The schema change fails if any data is invalid.
Adding a foreign key action on an existing constraint isn't supported. You should:
- Add a new constraint with the required action.
- Drop the older constraint that doesn't have the action.
This avoids a Long-running Alter Constraint Operation
issue. After you create
the new foreign key with the ON DELETE CASCADE
action, the net effect of both
constraints is DELETE CASCADE
. Dropping a constraint might lead to dropping
the foreign key backing indexes if the indexes aren't in use in other foreign
key constraints. Later, if the user adds the same foreign key constraint with
action, it might require long-running operations, such as backfilling indexes,
validating unique index constraints, and validating foreign key referential
constraints.
Either of the previous schema changes can fail if the referenced index can't be
created due to a UNIQUE
constraint violation.
You can query INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS.SPANNER_STATE
to
check foreign key creation state.
Constraint validation
Spanner validates foreign key constraints as a transaction is committed, or as the effects of writes are made visible to subsequent operations in the transaction.
A value inserted into the referencing column(s) is matched against the values of
the referenced table and referenced columns. Rows with NULL
referencing values
aren't checked, which means that you can add them to the referencing table.
Spanner validates all applicable foreign key referential constraints when attempting to update data using either DML statements or an API. All pending changes are rolled back if any constraints are invalid.
Validation occurs immediately after each DML statement. For example, you must insert the referenced row before inserting its referencing rows. When using a mutation API, mutations are buffered until the transaction is committed. Foreign key validation is deferred until the transaction is committed. In this case, it is permissible to insert the referencing rows first.
Each transaction is evaluated for modifications that affect foreign key constraints. These evaluations might require additional requests to the server. Backing indexes also require additional processing time to evaluate transaction modifications and to maintain the indexes. Additional storage is also required for each index.
Backing indexes
Foreign keys don't use user-created indexes. They create their own backing indexes.
Spanner can create up to two secondary backing indexes for each foreign key, one for the referencing columns, and a second for the referenced columns. However, a foreign key usually references the primary keys of the referenced table, so the second index on the referenced table is typically not needed.
The backing index for the referenced table is a UNIQUE NULL_FILTERED
index.
The creation of the foreign key fails if any existing data violates the index's
uniqueness constraint. The backing index for the referencing table is
NULL_FILTERED
.
If two or more foreign keys require the same backing index, Spanner creates a single index for all of them. The backing indexes are dropped when the foreign keys using them are dropped. Users can't alter or drop the backing indexes.
Spanner uses each database's information schema to store
metadata about backing indexes. Rows within INFORMATION_SCHEMA.INDEXES
that have a SPANNER_IS_MANAGED
value of true
describe backing indexes.
Outside of SQL queries that directly invoke the information schema, the Google Cloud console doesn't display any information about a database's backing indexes.
Long-running delete cascade action
When you delete a row from a referenced table, Spanner must delete all rows in the referencing tables that reference the deleted row. This can lead to a cascading effect, where a single delete operation results in thousands of other delete operations. Adding a foreign key constraint with delete cascade action to a table or creating a table with foreign key constraints with delete cascade action can slow down delete operations.
Mutation limit exceeded for foreign key delete cascade
Deleting a large number of records using a foreign key delete cascade can impact performance. This is because each deleted record invokes the deletion of all records related to it. If you need to delete a large number of records using a foreign key delete cascade, you should explicitly delete the rows from the child tables before deleting the row from the parent tables. This prevents the transaction from failing due to the mutation limit.
Comparison of foreign keys and table interleaving
Spanner's table interleaving is a good choice for many parent-child relationships where the child table's primary key includes the parent table's primary key columns. The co-location of child rows with their parent rows can significantly improve performance.
Foreign keys are a more general parent-child solution and address additional use cases. They're not limited to primary key columns, and tables can have multiple foreign key relationships, both as a parent in some relationships and a child in others. However, a foreign key relation does not imply co-location of the tables in the storage layer.
Consider an example that uses an Orders
table that's defined as follows:
Figure 3. Diagram of our database schema with foreign keys
The design in Figure 3 has some limitations. For example, each order can contain only one order item.
Imagine that your customers want to be able to order more than one product per
order. You can enhance your design by introducing an OrderItems
table that
contains an entry for each product the customer ordered. You can introduce
another foreign key to represent this new one-to-many relationship between
Orders
and OrderItems
. However, you also know that you often want to run
queries across orders and their respective order items. Because co-location of
this data boosts performance, you would want to create the parent-child
relationship using Spanner's table interleaving capability.
Here's how you define the OrderItems
table, interleaved with Orders
.
GoogleSQL
CREATE TABLE OrderItems (
OrderID INT64 NOT NULL,
ProductID INT64 NOT NULL,
Quantity INT64 NOT NULL,
FOREIGN KEY (ProductID) REFERENCES Products (ProductID)
) PRIMARY KEY (OrderID, ProductID),
INTERLEAVE IN PARENT Orders ON DELETE CASCADE;
PostgreSQL
CREATE TABLE OrderItems (
OrderID BIGINT NOT NULL,
ProductID BIGINT NOT NULL,
Quantity BIGINT NOT NULL,
FOREIGN KEY (ProductID) REFERENCES Products (ProductID),
PRIMARY KEY (OrderID, ProductID)
) INTERLEAVE IN PARENT Orders ON DELETE CASCADE;
Figure 4 is a visual representation of the updated database schema as a result
of introducing this new table, OrderItems
, interleaved with Orders
. Here you
can also see the one-to-many relationship between those two tables.
Figure 4. Addition of an interleaved OrderItems table
In this configuration, you can have multiple OrderItems
entries in each order,
and the OrderItems
entries for each order are interleaved, and therefore
co-located with the orders. Physically interleaving Orders
and OrderItems
in
this way can improve performance, effectively pre-joining the tables and letting
you access related rows together while minimizing disk accesses. For example,
Spanner can perform joins by primary key locally, minimizing disk
access and network traffic.
If the number of mutations in a transaction exceeds 80,000, the transaction fails. Such large cascading deletes work well for tables with an "interleaved in parent" relationship, but not for tables with a foreign key relationship. If you have a foreign key relationship and you need to delete a large number of rows, you should explicitly delete the rows from the child tables first.
If you have a user table with a foreign key relationship to another table, and deleting a row from the referenced table triggers the deletion of millions of rows, you should design your schema with a delete cascade action with "interleaved in parent".
Comparison table
The following table summarizes how foreign keys and table interleaving compare. You can use this information to decide what is right for your design.
Parent-child relationship type | Table Interleaving | Foreign Keys |
---|---|---|
Can use primary keys | Yes | Yes |
Can use non-primary-key columns | No | Yes |
Number of parents supported | 0 .. 1 | 0 .. N |
Stores parent and child data together | Yes | No |
Supports cascade delete | Yes | Yes |
Null matching mode | Passes if all referencing values are not distinct from the referenced values. Null values are not distinct from null values; null values are distinct from non-null values. |
Passes if any referencing values are null. Passes if all referencing values are non-null, and the referenced table has a row with values equal to the referencing values. Fails if no matching row was found. |
Enforcement Timing | Per operation when using the mutation API. Per statement when using DML. |
Per transaction when using the mutation API. Per statement when using DML. |
Can be removed | No. Table interleaving can't be removed after it's created, unless you delete the whole child table. | Yes |
What's next
Learn about Creating and managing foreign key relationships.
Learn more about the information schema.