Google Cloud Spanner Dialect for SQLAlchemy
We're very excited to announce the general availability of the Google Cloud Spanner dialect that enables SQLAlchemy applications to take advantage of Cloud Spanner's scale, strong consistency, and up to 99.999% availability. SQLAlchemy is a Python SQL toolkit and Object Relational Mapper, giving application developers the full power and flexibility of SQL. Its object-relational mapper (ORM) allows the object model and database schema to be developed in a cleanly decoupled way, while also being transparent to the user about the underlying SQL and object relational details.
Here, we'll show how you can get started with using the dialect and also highlight the supported Spanner-specific features.
Set up the Dialect
To set up the Cloud Spanner dialect for SQLAlchemy in your application, install the package that is available through PyPI.
You may also install the dialect from source.
Set up Cloud Spanner
Before you begin using Cloud Spanner:
Follow the Set Up guide to configure a Cloud Project, authentication and authorization.
Then, create a Cloud Spanner instance and database following the Quickstart using the Cloud Console.
Create an Engine
First, we create an Engine that can then be used to make a connection to a Cloud Spanner database. We provide the Database URL as an input which has the format of
We then create a MetaData object bound to the engine that can hold a collection of Table objects and their schemas. We declare three tables representing "Singers", "Albums" and "Tracks", then call
MetaData.create_all() to create the tables.
Notice that we use two different approaches to defining relationships between tables:
Albums reference Singers using a foreign key constraint, by including the "SingerId" in the "Albums" table. This ensures that each Album references an existing Singer record, and that a Singer cannot be deleted without also deleting all Albums of that Singer.
Tracks reference Albums by being interleaved in the parent "Albums" table, configured with the "spanner_interleave_in" keyword argument. This ensures that all Track records are stored physically together with the parent Album, which makes accessing them together more efficient. Setting "spanner_interleave_on_delete_cascade" ensures that when an Album is deleted, the interleaved Tracks are also deleted. We also call Table.add_is_dependent_on() to ensure that the Albums table is created before the Tracks table.
Data can be inserted into the created tables by calling the
Table.insert() method through
Connection.execute(). We use the uuid module to generate primary key fields to avoid creating monotonically increasing keys that may result in hotspots.
We can then query the inserted data by running a
select() statement through
Migrate an existing database
The Cloud Spanner dialect for SQLAlchemy supports migrations through Alembic.
Note that a migration script can produce a lot of DDL statements, and if each statement is executed separately, the migration will be slow. Therefore, it is highly recommended that you use the Alembic batch context feature to pack DDL statements into groups of statements.
For optimal performance, use explicit connections, and reuse the connection across multiple queries:
It's also possible to use an implicit connection but this is not recommended as the dialect will then need to establish a new connection to the database for every
Connection.execute() call. Therefore, avoid calls like the following:
By default, a connection executes all transactions in ReadWrite mode. But if the "read_only" execution option is provided when creating a connection, as in the following example, then transactions can be executed in ReadOnly mode.
The dialect supports "SERIALIZABLE" and "AUTOCOMMIT" isolation levels. "SERIALIZABLE" is the default isolation level, and transactions need to be committed explicitly. If the "AUTOCOMMIT" isolation level is selected, then each statement will be committed automatically.
The isolation level can be set as follows:
Cloud Spanner features
Besides interleaved tables and generated columns mentioned above, the provider also supports the following Cloud Spanner features.
Cloud Spanner supports various statement hints and table hints, which can be configured in the dialect through Query.with_hint(). This example shows how to set a table hint.
Cloud Spanner provides two read types. By default all read-only transactions will default to performing strong reads. You can opt into performing a stale read when querying data by using an explicit timestamp bound as shown in this example. Note that the connection must be a "read_only" connection to perform a stale read.
Differences and limitations
Cloud Spanner doesn't support direct UNIQUE constraints creation. In order to achieve column values uniqueness UNIQUE indexes should be used, as shown in this example.
DDL in Transactions
Cloud Spanner doesn't allow DDL statements to be executed in a transaction. Therefore, DDL statements will not be rolled back on transaction rollback.
The dialect and the underlying DB API driver doesn't support mutations. Therefore, only DML statements can be used for executing updates.
"WITH RECURSIVE" statements and named schemas are not supported since Spanner doesn't support these features.
"CREATE TEMPORARY TABLE" is not supported since Spanner doesn't support this feature.
The scale and precision for numeric types are fixed as Spanner doesn't support arbitrary scale or precision in numeric types.
The Cloud Spanner dialect for SQLAlchemy is an open-source project on GitHub and we welcome contributions in the form of feedback or pull requests.
We would like to thank Skylar Pottinger, Ilya Gurov, Hemang Chothani, Alex Vaksman and Knut Olav Løite for their work on this integration.