This page introduces the concepts of Cloud Spanner instances, instance configurations, and nodes. It also describes the differences and tradeoffs between regional and multi-region instances. If you are not familiar with how replication works in Cloud Spanner, read Replication first.
See Creating and Managing Instances for details on how to create, list, edit, and delete instances.
Overview of instances
To use Cloud Spanner, you must first create a Cloud Spanner instance within your Google Cloud Platform project. This instance is an allocation of resources that is used by Cloud Spanner databases created in that instance.
Instance creation includes two important choices: the instance configuration and the node count. These choices determine the location and amount of the instance's serving and storage resources. Your configuration choice is permanent for an instance, but you can change the node count later if needed.
An instance configuration defines the geographic placement and replication of the databases in that instance. When you create an instance, you must configure it as either regional (that is, all the resources are contained within a single GCP region) or multi-region (that is, the resources span more than one region). You make this choice by selecting an instance configuration, which determines where your data is stored for that instance. Regional and multi-region configurations are described in more detail below.
In addition to choosing where your data is stored when you create an instance, you must also choose the node count, or the number of nodes to allocate to that instance. Your choice of node count determines the amount of serving and storage resources that are available to the databases in that instance.
Each node provides up to 2 TiB of storage. The peak read and write throughput values that nodes can provide depend on the instance configuration, as well as on schema design and dataset characteristics. Refer to the Regional configuration performance and Multi-Region configuration performance sections for details.
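The 2 TiB per-node storage limit sets a lower bound on the node count for a given dataset size. The following is a minimal sketch of that arithmetic; the function name `min_nodes_for_storage` is hypothetical, and the result ignores throughput, which may call for more nodes than storage alone.

```python
import math

TIB_PER_NODE = 2  # per-node storage limit stated in the text above


def min_nodes_for_storage(data_tib: float) -> int:
    """Return the smallest node count whose combined storage limit covers data_tib."""
    if data_tib < 0:
        raise ValueError("data_tib must be non-negative")
    # An instance needs at least one node, even when it holds very little data.
    return max(1, math.ceil(data_tib / TIB_PER_NODE))


if __name__ == "__main__":
    for size in (0.5, 2, 5):
        print(f"{size} TiB -> at least {min_nodes_for_storage(size)} node(s)")
```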
After you create an instance, you can change its node count by using either the Cloud Spanner page in the Google Cloud Platform Console or the gcloud command-line tool.
Cloud Spanner does not have a suspend mode. Cloud Spanner nodes are dedicated resources, and even when you are not running a workload, Cloud Spanner nodes frequently perform background work to optimize and protect your data.
Nodes vs replicas
If you need to scale up the serving and storage resources in your instance, add more nodes to that instance. Note that adding a node does not increase the number of replicas (which are fixed for a given configuration), but rather increases the resources each replica has in the instance. Adding nodes gives each replica more CPU and RAM, which increases the replica's throughput (that is, more reads and writes per second can occur). Effectively, the number of Cloud Spanner servers in each of the instance's replicas is the same as the node count of the instance. Thus, the total number of servers in a Cloud Spanner instance is the number of nodes the instance has multiplied by the number of replicas in the instance.
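The relationship between nodes, replicas, and servers described above can be sketched as a one-line calculation. The function name `total_servers` is hypothetical; the replica count is fixed by the instance configuration, so only the node count varies when you scale.

```python
def total_servers(node_count: int, replica_count: int) -> int:
    """Servers in an instance: each replica has as many servers as the instance has nodes."""
    if node_count < 1 or replica_count < 1:
        raise ValueError("node_count and replica_count must be positive")
    return node_count * replica_count


if __name__ == "__main__":
    replicas = 3  # example value; the actual count depends on the configuration
    for nodes in (1, 2, 4):
        # Adding nodes grows every replica; the number of replicas stays the same.
        print(f"{nodes} node(s) x {replicas} replicas = {total_servers(nodes, replicas)} servers")
```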
If your users and services are located within a single region, choose a regional instance configuration for the lowest latency reads and writes.
Cloud Spanner offers the following regional instance configurations:
|Region Name||Region Description|
For any regional configuration, Cloud Spanner maintains 3 read-write replicas, each within a different Google Cloud Platform availability zone in that region. Each read-write replica contains a full copy of your operational database that is able to serve read-write and read-only requests. Cloud Spanner uses replicas in different zones so that if a single-zone failure occurs, your database remains available.
Regional configurations contain exactly 3 read-write replicas, each of which can vote.
As described in Cloud Spanner Replication, every Cloud Spanner mutation requires a write quorum that's composed of a majority of voting replicas. Write quorums are formed from two out of the three replicas in regional configurations.
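The majority rule above can be expressed as a short calculation. This is an illustrative sketch, not Cloud Spanner code: the function names are hypothetical, and it models only the counting of voting replicas.

```python
def write_quorum_size(voting_replicas: int) -> int:
    """Smallest strict majority of the voting replicas."""
    if voting_replicas < 1:
        raise ValueError("voting_replicas must be positive")
    return voting_replicas // 2 + 1


def can_form_write_quorum(voting_replicas: int, unavailable: int) -> bool:
    """True if enough voting replicas remain available to reach a majority."""
    return voting_replicas - unavailable >= write_quorum_size(voting_replicas)


if __name__ == "__main__":
    # Regional configuration: 3 voting replicas, one per zone.
    print(write_quorum_size(3))         # quorum size with 3 voters
    print(can_form_write_quorum(3, 1))  # one zone unavailable
    print(can_form_write_quorum(3, 2))  # two zones unavailable
```

With three voting replicas the quorum size is two, so a single-zone failure still leaves enough replicas to commit writes.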
For optimal performance, follow these best practices:
- Follow Schema Design Best Practices.
- Place critical compute resources within the same region as your Cloud Spanner instance.
- Provision enough Cloud Spanner nodes to keep overall CPU utilization under 75%.
When the best practices described above are followed, each Cloud Spanner node can provide up to 10,000 queries per second (QPS) of reads or 2,000 QPS of writes (writing single rows at 1 KB of data per row).
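As a rough illustration of how the per-node figures above might feed a first sizing estimate, the sketch below treats reads and writes as fractions of a node's peak capacity and adds them. That linear mixing model and the function name `estimate_nodes` are assumptions made for illustration, not a documented sizing formula; the figures are stated as "up to" values, so a real workload should be validated against observed CPU utilization.

```python
import math

READ_QPS_PER_NODE = 10_000   # peak reads per node, from the text above
WRITE_QPS_PER_NODE = 2_000   # peak single-row 1 KB writes per node, from the text above


def estimate_nodes(read_qps: float, write_qps: float) -> int:
    """Rough node estimate, assuming read and write load combine linearly."""
    if read_qps < 0 or write_qps < 0:
        raise ValueError("QPS values must be non-negative")
    load = read_qps / READ_QPS_PER_NODE + write_qps / WRITE_QPS_PER_NODE
    return max(1, math.ceil(load))


if __name__ == "__main__":
    # 25,000 reads/s uses 2.5 nodes' worth of capacity, 3,000 writes/s uses 1.5.
    print(estimate_nodes(25_000, 3_000))
```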
As described above, Cloud Spanner regional configurations replicate data between multiple zones within a single region. However, if your application often needs to read data from multiple geographic locations (for example, to serve data to users in both North America and Asia), or if your writes originate from a different location than your reads (for example, if you have large write workloads in North America and large read workloads in Europe), then a regional configuration might not be optimal.
Multi-region configurations allow you to replicate the database's data not just in multiple zones, but in multiple zones across multiple regions, as defined by the instance configuration. These additional replicas enable you to read data with low latency from multiple locations close to or within the regions in the configuration. There are tradeoffs though, because in a multi-region configuration, the quorum (read-write) replicas are spread across more than one region. Hence, they can incur additional network latency when these replicas communicate with each other to vote on writes. In other words, multi-region configurations enable your application to achieve faster reads in more places at the cost of a small increase in write latency.
|Configuration Name||Configuration Location||Default Leader Region||Additional Read-Write Region|
|Configuration Name||Configuration Locations||Default Leader Region||Additional Read-Write Region||Read-Only Regions|
||North America, Europe, and Asia||
(Oklahoma — private GCP region)
Multi-region instances offer these primary benefits:
- 99.999% availability, which is greater than the 99.99% availability that Cloud Spanner regional configurations provide.
- Data distribution: Cloud Spanner automatically replicates your data between regions with strong consistency guarantees. This allows your data to be stored where it’s used, which can reduce latency and improve the user experience.
- External consistency: Even though Cloud Spanner replicates across geographically distant locations, you can still use Cloud Spanner as if it were a database running on a single machine. Transactions are guaranteed to be serializable, and the order of transactions within the database is the same as the order in which clients observe the transactions to have been committed. External consistency is a stronger guarantee than "strong consistency", which is offered by some other products. Read more about this property in TrueTime and External Consistency.
Each multi-region configuration contains two regions that are designated as read-write regions, each of which contains two read-write replicas. One of these read-write regions is designated as the default leader region, which means that it contains your database's leader replicas. Cloud Spanner also places a witness replica in a third region called a witness region.
Each time a client issues a mutation to your database, a write quorum forms, consisting of one of the replicas from the default leader region and any two of the additional four voting replicas. (The quorum could be formed by replicas from two or three of the regions that make up your configuration, depending on which other replicas participate in the vote.) In addition to these 5 voting replicas, the configuration can also contain read-only replicas for serving low-latency reads. The regions that contain read-only replicas are called read-only regions.
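To see why a quorum spans either two or three regions, the following sketch enumerates every quorum that the description above allows: the leader plus any two of the other four voting replicas. The replica and region labels are hypothetical placeholders, and the code models only the counting, not how Cloud Spanner selects voters.

```python
from itertools import combinations

# Hypothetical labels: two read-write regions with two replicas each, plus one witness.
VOTERS = {
    "leader": "default-leader-region",
    "rw-a2": "default-leader-region",
    "rw-b1": "second-read-write-region",
    "rw-b2": "second-read-write-region",
    "witness": "witness-region",
}


def possible_quorums(leader: str = "leader") -> list:
    """Every quorum made of the leader plus two of the other four voting replicas."""
    others = [name for name in VOTERS if name != leader]
    return [(leader, *pair) for pair in combinations(others, 2)]


def regions_spanned(quorum) -> set:
    """The set of regions that the replicas in a quorum belong to."""
    return {VOTERS[name] for name in quorum}


if __name__ == "__main__":
    for quorum in possible_quorums():
        print(quorum, "->", len(regions_spanned(quorum)), "regions")
```

Because the leader always participates and only two more votes are needed, no quorum is confined to a single region, and only the quorums that combine the witness with a replica from the second read-write region span all three.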
In general, the voting regions in a multi-region configuration are placed geographically close—less than a thousand miles apart—to form a low-latency quorum that enables fast writes (see Why Read-Only and Witness Replicas? for more information). However, the regions are still far enough apart—typically, at least a few hundred miles—to avoid coordinated failures.
The next sections describe each of these region types in more detail and provide guidance for how to place your write and read workloads accordingly.
As described above, each multi-region configuration contains two read-write regions, each of which contains two read-write replicas. One of these read-write regions is designated the default leader region where leader replicas are placed. Under normal conditions when all replicas are available, the default leader region contains the leaders and therefore is where writes are first processed. In the event of leadership failure, other replicas in the default leader region automatically assume leadership. In fact, leaders run health checks on themselves and can preemptively give up leadership if they detect they are unhealthy.
The second read-write region contains the additional replicas that are eligible to be leaders. In the unlikely event of the loss of the default leader region, new leader replicas are chosen from the second read-write region.
Read-only regions contain read-only replicas, which can serve low-latency reads to clients that are outside of the read-write regions.
A witness region contains a witness replica, which is used to vote on writes. Witnesses become important in the rare event that the read-write regions become unavailable.
For optimal performance, follow these best practices:
- Follow Schema Design Best Practices.
- For optimal write latency, place compute resources for write-heavy workloads within or close to the default leader region.
- For optimal read performance outside of the default leader region, use staleness of at least 15 seconds.
- To avoid single-region dependency for your workloads, place critical compute resources in at least two regions.
- Provision enough Cloud Spanner nodes to keep overall CPU utilization under 45% in each region.
Each Cloud Spanner configuration has slightly different performance characteristics based on the replication topology.
When the best practices described above are followed, each Cloud Spanner node can provide the following approximate performance:
|Multi-Region Configuration||Approximate Peak Read (QPS per region)||Approximate Peak Writes (QPS total)|
Note that read guidance is given per region (because reads can be served from anywhere), while write guidance is for the entire configuration. Write guidance assumes that you're writing single rows at 1 KB of data per row.
Tradeoffs: regional vs multi-region configurations
|Configuration||Availability||Latency||Cost||Data Locality|
|Regional||99.99%||Lower write latencies within region.||Lower cost; see pricing.||Enables geographic data governance.|
|Multi-region||99.999%||Lower read latencies from multiple geographic regions.||Higher cost; see pricing.||Distributes data across multiple regions within the configuration.|
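To put the two availability figures in perspective, the sketch below converts a percentage into the downtime it would permit over a 365-day year. This is plain arithmetic with a hypothetical function name; how availability is actually measured under a service agreement may differ.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def max_downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by a given availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for label, pct in (("Regional", 99.99), ("Multi-region", 99.999)):
        minutes = max_downtime_minutes_per_year(pct)
        print(f"{label}: {pct}% -> about {minutes:.2f} minutes per year")
```

Under this reading, 99.99% corresponds to roughly 52.6 minutes per year and 99.999% to roughly 5.3 minutes per year.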