This document describes foreign keys in Spanner, and how you can use them to enforce referential integrity in your database. The following topics help you learn about foreign keys and how to use them:
- Overview of foreign keys in Spanner
- Types of foreign keys
- Comparison of foreign key types
- Choose which foreign key type to use
- Use enforced foreign keys
- Use informational foreign keys
- Backing indexes
- Long-running schema changes
Overview of foreign keys in Spanner
Foreign keys define relationships between tables. You can use foreign keys to make sure that the data integrity of these relationships in Spanner is maintained.
Imagine you're a lead developer for an ecommerce business. You are designing a database to process customer orders. The database must store information about each order, customer, and product. Figure 1 illustrates the basic database structure for the application.
Figure 1. Diagram of an order processing database
You define a Customers
table to store customer information, an Orders
table
to track all orders, and a Products
table to store information about each
product.
Figure 1 also shows links between the tables that map to the following real-world relationships:
A customer places an order.
An order is placed for a product.
You decide that your database enforces the following rules to ensure that orders in your system are valid.
You can't create an order for a customer that doesn't exist.
A customer can't place an order for a product you don't carry.
When you enforce these rules, or constraints, you're maintaining the referential integrity of your data. When a database maintains referential integrity, all attempts to add invalid data, which would result in invalid links or references between data, fail. Referential integrity prevents user errors. By default, Spanner uses foreign keys to enforce referential integrity.
Define referential integrity with foreign keys
The following examines the order processing example again, with more detail added to the design, as shown in Figure 2.
Figure 2. Diagram of a database schema with foreign keys
The design now shows column names and types in each table. The Orders
table
also defines two foreign key relationships. FK_CustomerOrder
expects that all
rows in Orders
have a valid CustomerId
. The FK_ProductOrder
foreign key
expects that all ProductId
values in the Orders
table are valid. The
following table maps these constraints back to the real-world rules that you
want to enforce.
Foreign Key Name | Constraint | Real-world description |
---|---|---|
FK_CustomerOrder | Expects that all rows in Orders have a valid
CustomerId |
A valid customer places an order |
FK_ProductOrder | Expects that all rows in Orders have a valid
ProductId |
An order was placed for a valid product |
Spanner enforces constraints that are specified using
enforced foreign keys. This means that
Spanner fails any transaction that attempts to insert or update a
row in the Orders
table that has a CustomerId
or ProductId
not found in
the Customers
and Products
tables. It also fails transactions that attempt
to update or delete rows in the Customers
and Products
tables that would
invalidate the IDs in the Orders
table. For more details about how
Spanner validates constraints, refer to the Transaction
constraint validation section.
Unlike enforced foreign keys, Spanner doesn't validate
constraints on informational foreign keys. This
means that if you use an informational foreign key in this scenario, then a
transaction that attempts to insert or update a row in the Orders
table that
has a CustomerId
or ProductId
not found in the Customers
and Products
tables isn't validated and the transaction doesn't fail. Also unlike enforced
foreign keys, informational foreign keys are supported by
GoogleSQL only, and not by PostgreSQL.
Foreign key characteristics
The following is a list of characteristics of foreign keys in Spanner.
The table that defines the foreign key is the referencing table, and the foreign key columns are the referencing columns.
The foreign key references the referenced columns of the referenced table.
As in the example, you can name each foreign key constraint. If you don't specify a name, Spanner generates a name for you. You can query the generated name from Spanner's
INFORMATION_SCHEMA
. Constraint names are scoped to the schema, along with the names for tables and indexes, and must be unique within the schema.The number of referencing and referenced columns must be the same. Order is important. For example, the first referencing column refers to the first referenced column and the second referencing column refers to the second referenced column.
A referencing column and its referenced counterpart must be the same type. You must be able to index the columns.
You can't create foreign keys on columns with the
allow_commit_timestamp=true
option.Array columns are not supported.
JSON columns are not supported.
A foreign key can reference columns of the same table (a self-referencing foreign key). An example is an
Employee
table with aManagerId
column that references the table'sEmployeeId
column.Foreign keys can also form circular relationships between tables where two tables reference each other, either directly or indirectly. The referenced table must exist before creating a foreign key. This means that at least one of the foreign keys must be added using the
ALTER TABLE
statement.The referenced keys must be unique. Spanner uses the
PRIMARY KEY
of the referenced table if the referenced columns for a foreign key match the referenced table's primary key columns. If Spanner can't use the referenced table's primary key, it creates aUNIQUE NULL_FILTERED INDEX
over the referenced columns.Foreign keys don't use secondary indexes that you have created. Instead, they create their own backing indexes. Backing indexes are usable in query evaluations, including in explicit
force_index
directives. You can query the names of the backing indexes from Spanner'sINFORMATION_SCHEMA
. For more information, see Backing indexes.
Types of foreign keys
There are two types of foreign keys, enforced and informational. Enforced foreign keys are the default and enforce referential integrity. Informational foreign keys don't enforce referential integrity and are best used to declare the intended logical data model for query optimization. For more details, see the following enforced and informational foreign keys sections and the comparison of foreign key types table.
Enforced foreign keys
Enforced foreign keys, the default foreign key type in Spanner, enforce referential integrity. Because enforced foreign keys enforce referential integrity, they cause attempts to do the following to fail:
Adding a row to a referencing table that has a foreign key value that doesn't exist in the referenced table fails.
Deleting a row from a referenced table that's referenced by rows in the referencing table fails.
All PostgreSQL foreign keys are enforced. GoogleSQL foreign
keys are enforced by default. Because foreign keys are enforced by default,
using the ENFORCED
keyword to specify that a GoogleSQL foreign key
is enforced is optional.
Informational foreign keys
Informational foreign keys are used to declare the intended logical data model for query optimization. While referenced table keys must be unique for informational foreign keys, the referential integrity isn't enforced. If you want to selectively validate referential integrity when you use informational foreign keys, then you need to manage validation logic on the client side. For more information, see Use informational foreign keys.
Use the NOT ENFORCED
keyword to specify that a GoogleSQL foreign key
is informational. PostgreSQL doesn't support informational foreign
keys.
Comparison of foreign key types
Both enforced and informational have benefits. The following sections compare the two types of foreign keys and include some best practices.
High-level foreign key differences
At a high level, the following are some of the differences between enforced and informational foreign keys:
Enforcement. Enforced foreign keys validate and guarantee referential integrity on writes. Informational foreign keys don't validate or guarantee referential integrity.
Storage. Enforced foreign keys might require additional storage for the backing index on the constrained table.
Write throughput. Enforced foreign keys might incur more overhead in the write path than informational foreign keys.
Query optimization. Both types of foreign keys can be used for query optimization. When the optimizer is allowed to use informational foreign keys, query results might not reflect the actual data if the data doesn't match the informational foreign key relationships (for example, if some constrained keys don't have matching referenced keys in the referenced table).
Foreign key differences table
The following table lists detailed differences between enforced and informational foreign keys:
Enforced foreign keys | Informational foreign keys | |
---|---|---|
Keywords | ENFORCED |
NOT ENFORCED |
Supported by GoogleSQL | Yes. Foreign keys in GoogleSQL are enforced by default. | Yes. |
Supported by PostgreSQL | Yes. Foreign keys in PostgreSQL can only be enforced. | No. |
Storage | Enforced foreign keys require storage for up to two backing indexes. | Informational foreign keys require storage for up to one backing index. |
Creates backing indexes on referenced table columns when needed | Yes. | Yes. |
Creates backing indexes on referencing table columns when needed | Yes. | No. |
Foreign key actions support | Yes. | No. |
Validates and enforces referential integrity | Yes. | No. Having no validation improves write performance, but can impact query results when informational foreign keys are used for query optimization. You can use client-side validation or an enforced foreign key to ensure referential integrity. |
Choose which foreign key type to use
You can use the following guidelines to decide which foreign key type to use:
We recommend that you start with enforced foreign keys. Enforced foreign keys keep the data and the logical model consistent at all times. Enforced foreign keys are the recommended option unless they don't work for your use case.
We recommend that you consider informational foreign keys if each of the following is true:
You want to use the logical data model described by informational foreign key in query optimization.
Maintaining strict referential integrity is impractical or impacts performance significantly. The following are examples of when you might want to consider using an informational foreign key:
Your upstream data source follows an eventual-consistency model. In this case, updates made in the source system might not be reflected immediately in Spanner. Because updates might not be immediate, brief inconsistencies in foreign key relationships might occur.
Your data contains referenced rows that have a large number of referencing relationships. Updates to these rows can use a lot of resources because Spanner must validate or, in some cases, delete all rows that are related to maintaining referential integrity. In this scenario, updates might impact Spanner performance and slow down concurrent transactions.
Your application can handle potential data inconsistencies and their impact on query results.
Use informational foreign keys
The following topics are for informational foreign keys only. For topics that apply to both informational and enforced foreign keys, see the following:
Create a new table with an informational foreign key
You create and remove and
informational foreign keys from your
Spanner database using DDL statements. You add foreign keys to a
new table with the CREATE TABLE
statement. Similarly, you can add or
remove foreign keys from an existing table with the ALTER TABLE
statement.
The following example creates a new table with an informational foreign key using GoogleSQL. Informational foreign keys aren't supported by PostgreSQL.
CREATE TABLE Customers (
CustomerId INT64 NOT NULL,
CustomerName STRING(MAX) NOT NULL,
) PRIMARY KEY(CustomerId);
CREATE TABLE Orders (
OrderId INT64 NOT NULL,
CustomerId INT64 NOT NULL,
Quantity INT64 NOT NULL,
ProductId INT64 NOT NULL,
CONSTRAINT FK_CustomerOrder FOREIGN KEY (CustomerId)
REFERENCES Customers (CustomerId) NOT ENFORCED
) PRIMARY KEY (OrderId);
Not Supported
For more examples of how to create and manage foreign keys, see Create and manage foreign key relationships. For more information about DDL statements, see the DDL reference.
Use informational foreign keys for query optimization
Both enforced foreign keys and informational foreign keys can be used by the query optimizer to improve query performance. Using informational foreign keys lets you to take advantage of optimized query plans without the overhead of strict referential integrity enforcement.
If you enable the query optimizer to utilize informational foreign keys information, it's important to understand that the correctness of the optimization depends on having data that's consistent with the logical model described by the informational foreign keys. If inconsistencies exist, then query results might not reflect the actual data. An example of an inconsistency is when a value in constrained column doesn't have a matching value in a referenced column.
By default, the query optimizer uses NOT ENFORCED
foreign keys. To change
this, set the database option
use_unenforced_foreign_key_for_query_optimization
to false. The following is a
GoogleSQL example that demonstrates this (informational foreign keys
aren't available in PostgreSQL):
SET DATABASE OPTIONS (
use_unenforced_foreign_key_for_query_optimization = false
);
The boolean query statement hint @{use_unenforced_foreign_key}
overrides the
database option on a per-query basis that controls whether the optimizer uses
NOT ENFORCED
foreign keys. Disabling this hint or the database option can be
useful when troubleshooting unexpected query results. The following shows how
to use @{use_unenforced_foreign_key}
:
@{use_unenforced_foreign_key=false} SELECT Orders.CustomerId
FROM Orders
INNER JOIN Customers ON Customers.CustomerId = Orders.CustomerId;
Use enforced foreign keys
The following topics are for enforced foreign keys only. For topics that apply to both informational and enforced foreign keys, see the following:
Create a new table with an enforced foreign key
You create and remove and enforced foreign keys from
your Spanner database using DDL. You add foreign keys to a new
table with the CREATE TABLE
statement. Similarly, you add a foreign key
to, or remove a foreign key from, an existing table with the ALTER TABLE
statement.
You create and remove foreign keys from your Spanner database
using DDL. You add foreign keys to a new table with the CREATE TABLE
statement. Similarly, you add a foreign key to, or remove a foreign key from, an
existing table with the ALTER TABLE
statement.
The following is an example of creating a new table with an enforced foreign key.
CREATE TABLE Customers (
CustomerId INT64 NOT NULL,
CustomerName STRING(MAX) NOT NULL,
) PRIMARY KEY(CustomerId);
CREATE TABLE Orders (
OrderId INT64 NOT NULL,
CustomerId INT64 NOT NULL,
Quantity INT64 NOT NULL,
ProductId INT64 NOT NULL,
CONSTRAINT FK_CustomerOrder FOREIGN KEY (CustomerId)
REFERENCES Customers (CustomerId) ENFORCED
) PRIMARY KEY (OrderId);
CREATE TABLE Customers (
CustomerId bigint NOT NULL,
CustomerName character varying(1024) NOT NULL,
PRIMARY KEY(CustomerId)
);
CREATE TABLE Orders (
OrderId BIGINT NOT NULL,
CustomerId BIGINT NOT NULL,
Quantity BIGINT NOT NULL,
ProductId BIGINT NOT NULL,
CONSTRAINT FK_CustomerOrder FOREIGN KEY (CustomerId)
REFERENCES Customers (CustomerId),
PRIMARY KEY (OrderId)
);
For more examples of how to create and manage foreign keys, see Create and manage foreign key relationships.
Foreign key actions
Foreign key actions can be defined on enforced foreign keys only.
Foreign key actions control what happens to the constrained column when the
column it references is deleted or updated. Spanner supports the
use of the ON DELETE CASCADE
action. With the foreign key ON DELETE CASCADE
action, when you delete a row that contains a referenced foreign key, all rows
that reference that key are also deleted in the same transaction.
You can add a foreign key with an action when you create your database using
DDL. Use the CREATE TABLE
statement to add foreign keys with an action to
a new table. Similarly, you can use the ALTER TABLE
statement to add a
foreign key action to an existing table or to remove a foreign key action. The
following is an example of how to create a new table with a foreign key action.
CREATE TABLE ShoppingCarts (
CartId INT64 NOT NULL,
CustomerId INT64 NOT NULL,
CustomerName STRING(MAX) NOT NULL,
CONSTRAINT FKShoppingCartsCustomers FOREIGN KEY(CustomerId, CustomerName)
REFERENCES Customers(CustomerId, CustomerName) ON DELETE CASCADE,
) PRIMARY KEY(CartId);
CREATE TABLE ShoppingCarts (
CartId bigint NOT NULL,
CustomerId bigint NOT NULL,
CustomerName character varying(1024) NOT NULL,
PRIMARY KEY(CartId),
CONSTRAINT fkshoppingcartscustomers FOREIGN KEY (CustomerId, CustomerName)
REFERENCES Customers(CustomerId, CustomerName) ON DELETE CASCADE
);
The following is a list of characteristics of foreign key actions in Spanner.
Foreign key actions are either
ON DELETE CASCADE
orON DELETE NO ACTION
.You can query the
INFORMATION_SCHEMA
to find foreign key constraints that have an action.Adding a foreign key action on an existing foreign key constraint isn't supported. You must add a new foreign key constraint with an action.
Constraint validation
Constraint validation applies to enforced foreign keys only.
Spanner validates enforced foreign key constraints as a transaction is committed, or as the effects of writes are made visible to subsequent operations in the transaction.
A value inserted into the referencing column is matched against the values of
the referenced table and referenced columns. Rows with NULL
referencing values
aren't checked, which means that you can add them to the referencing table.
Spanner validates all applicable enforced foreign key referential constraints when attempting to update data using either DML statements or an API. All pending changes are rolled back if any constraints are invalid.
Validation occurs immediately after each DML statement. For example, you must insert the referenced row before inserting its referencing rows. When using a mutation API, mutations are buffered until the transaction is committed. Enforced foreign key validation is deferred until the transaction is committed. In this case, it is permissible to insert the referencing rows first.
Each transaction is evaluated for modifications that affect enforced foreign key constraints. These evaluations might require additional requests to the server. Backing indexes also require additional processing time to evaluate transaction modifications and to maintain the indexes. Additional storage is also required for each index.
Long-running delete cascade action
When you delete a row from a referenced table, Spanner must delete all rows in the referencing tables that reference the deleted row. This can lead to a cascading effect, where a single delete operation results in thousands of other delete operations. Adding a foreign key constraint with delete cascade action to a table or creating a table with foreign key constraints with delete cascade action can slow down delete operations.
Mutation limit exceeded for foreign key delete cascade
Deleting a large number of records using a foreign key delete cascade can impact performance. This is because each deleted record invokes the deletion of all records related to it. If you need to delete a large number of records using a foreign key delete cascade, explicitly delete the rows from the child tables before deleting the row from the parent tables. This prevents the transaction from failing due to the mutation limit.
Comparison of enforced foreign keys and table interleaving
Spanner's table interleaving is a good choice for many parent-child relationships where the child table's primary key includes the parent table's primary key columns. The co-location of child rows with their parent rows can significantly improve performance.
Foreign keys are a more general parent-child solution and address additional use cases. They're not limited to primary key columns, and tables can have multiple foreign key relationships, both as a parent in some relationships and a child in others. However, a foreign key relation does not imply co-location of the tables in the storage layer.
Consider an example that uses an Orders
table that's defined as follows:
Figure 3. Diagram of the database schema with enforced foreign keys
The design in Figure 3 has some limitations. For example, each order can contain only one order item.
Imagine that your customers want to be able to order more than one product per
order. You can enhance your design by introducing an OrderItems
table that
contains an entry for each product the customer ordered. You can introduce
another enforced foreign key to represent this new one-to-many relationship
between Orders
and OrderItems
. However, you also know that you often want to
run queries across orders and their respective order items. Because co-location
of this data boosts performance, you would want to create the parent-child
relationship using Spanner's table interleaving capability.
Here's how you define the OrderItems
table, interleaved with Orders
.
CREATE TABLE Products (
ProductId INT64 NOT NULL,
Name STRING(256) NOT NULL,
Price FLOAT64
) PRIMARY KEY(ProductId);
CREATE TABLE OrderItems (
OrderId INT64 NOT NULL,
ProductId INT64 NOT NULL,
Quantity INT64 NOT NULL,
FOREIGN KEY (ProductId) REFERENCES Products (ProductId)
) PRIMARY KEY (OrderId, ProductId),
INTERLEAVE IN PARENT Orders ON DELETE CASCADE;
CREATE TABLE Products (
ProductId BIGINT NOT NULL,
Name varchar(256) NOT NULL,
Price float8,
PRIMARY KEY(ProductId)
);
CREATE TABLE OrderItems (
OrderId BIGINT NOT NULL,
ProductId BIGINT NOT NULL,
Quantity BIGINT NOT NULL,
FOREIGN KEY (ProductId) REFERENCES Products (ProductId),
PRIMARY KEY (OrderId, ProductId)
) INTERLEAVE IN PARENT Orders ON DELETE CASCADE;
Figure 4 is a visual representation of the updated database schema as a result
of introducing this new table, OrderItems
, interleaved with Orders
. Here you
can also see the one-to-many relationship between those two tables.
Figure 4. Addition of an interleaved OrderItems table
In this configuration, you can have multiple OrderItems
entries in each order,
and the OrderItems
entries for each order are interleaved, and therefore
co-located with the orders. Physically interleaving Orders
and OrderItems
in
this way can improve performance, effectively pre-joining the tables and letting
you access related rows together while minimizing disk accesses. For example,
Spanner can perform joins by primary key locally, minimizing disk
access and network traffic.
If the number of mutations in a transaction exceeds 80,000, the transaction fails. Such large cascading deletes work well for tables with an "interleaved in parent" relationship, but not for tables with a foreign key relationship. If you have a foreign key relationship and you need to delete a large number of rows, you should explicitly delete the rows from the child tables first.
If you have a user table with a foreign key relationship to another table, and deleting a row from the referenced table triggers the deletion of millions of rows, you should design your schema with a delete cascade action with "interleaved in parent".
Comparison table
The following table summarizes how enforced foreign keys and table interleaving compare. You can use this information to decide what is right for your design.
Parent-child relationship type | Table Interleaving | Enforced foreign keys |
---|---|---|
Can use primary keys | Yes | Yes |
Can use non-primary-key columns | No | Yes |
Number of parents supported | 0 .. 1 | 0 .. N |
Stores parent and child data together | Yes | No |
Supports cascade delete | Yes | Yes |
Null matching mode | Passes if all referencing values are not distinct from the referenced values. Null values are not distinct from null values; null values are distinct from non-null values. |
Passes if any referencing values are null. Passes if all referencing values are non-null, and the referenced table has a row with values equal to the referencing values. Fails if no matching row was found. |
Enforcement Timing | Per operation when using the mutation API. Per statement when using DML. |
Per transaction when using the mutation API. Per statement when using DML. |
Can be removed | No. Table interleaving can't be removed after it's created, unless you delete the whole child table. | Yes |
Backing indexes
Foreign keys don't use user-created indexes. Instead, they create their own backing indexes. Enforced and informational foreign keys create backing indexes differently in Spanner:
For enforced foreign keys, Spanner can create up to two secondary backing indexes for each foreign key, one for the referencing columns, and a second for the referenced columns.
For informational foreign keys, Spanner can create up to one backing index when needed for the referenced columns. Informational foreign keys don't create a backing index for the referencing columns.
For both enforced and informational foreign keys, a foreign key usually
references the primary keys of the referenced table, so an index for the
referenced table is typically not needed. Because of this, informational foreign
keys typically have zero backing indexes. When needed, the backing index created
for the referenced table is a UNIQUE NULL_FILTERED
index. The creation of
the foreign key fails if any existing data violates the index's uniqueness
constraint.
Informational foreign keys don't have a backing index for the referencing
table. For enforced foreign keys, the backing index for the referencing table is
NULL_FILTERED
.
If two or more foreign keys require the same backing index, Spanner creates a single index for each of them. The backing indexes are dropped when the foreign keys using them are dropped. You can't alter or drop the backing indexes.
Spanner uses the information schema of each database to
store metadata about backing indexes. Rows within
INFORMATION_SCHEMA.INDEXES
that have a SPANNER_IS_MANAGED
value of
true
describe backing indexes.
Outside of SQL queries that directly invoke the information schema, the Google Cloud console doesn't display any information about a database's backing indexes.
Long-running schema changes
Adding an enforced foreign key to an existing table, or creating a new table with a foreign key, can lead to long-running operations. In the case of a new table, the table isn't writable until the long-running operation is complete.
The following table shows what happens in Spanner when an enforced and an informational foreign key is in a new or an existing table:
Table type | Enforced foreign key | Informational foreign key |
---|---|---|
New | Spanner backfills referenced indexes as needed for each foreign key. | Spanner backfills referenced indexes as needed for each foreign key. |
Existing | Spanner backfills the referencing and referenced indexes as needed. Spanner also validates existing data in the table to ensure that it complies with the referential integrity constraint of the foreign key. The schema change fails if any data is invalid. | Spanner backfills the referenced index as needed and doesn't validate existing data in the table. |
The following aren't supported:
- Adding a foreign key action to an existing enforced foreign key constraint.
- Changing the enforcement of an existing foreign key.
For both cases, we recommend that you instead do the following:
- Add a new constraint with the required action or enforcement.
- Drop the old constraint.
Adding a new constraint and dropping the old constraint prevents a Long-running
Alter Constraint Operation issue. For example, suppose you want to add a
DELETE CASCADE
action on an existing foreign key. After you create the new
foreign key with the ON DELETE CASCADE
action, the effect of both constraints
is a DELETE CASCADE
action. Then you can drop the old constraint safely.
Dropping a constraint can lead to dropping the foreign key backing indexes if the indexes aren't used by other foreign key constraints. Because of this, if you drop the old constraint first, adding the same foreign key constraint with an action later might lead to long-running operations, such as backfilling indexes, validating unique index constraints, or validating foreign key referential constraints.
You can query INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS.SPANNER_STATE
to
check foreign key creation state.
What's next
Learn about Creating and managing foreign key relationships.
Learn more about the information schema.