This launch checklist covers points to address before launching a production application on Cloud Spanner. It is not exhaustive, but it highlights areas that can have a large impact on production performance.
Choose a suitable instance configuration
Pick an instance configuration (Regional vs Multi-regional) to match your requirements.
If you choose a multi-regional configuration, run the application that accesses Cloud Spanner in close proximity to the leader region. You can find more detail on the instances page.
Design your schema for performance at scale
Cloud Spanner's relational data schema is similar to traditional relational databases, with some nuances that should be considered:
- Use interleaved tables instead of foreign key relationships where applicable.
- Choose a primary key that prevents hotspots.
- Ensure secondary indexes don't create hotspots (similar to primary key hotspots).
- Create secondary indexes, and store related columns in the index (the STORING clause) if required.
- Limit the row size.
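As an illustration of the hotspot point above, here is a minimal Python sketch of two common ways to keep monotonically increasing values (such as timestamps) out of the leading key position: prefixing the key with a hash-derived shard number, or using a random UUID. The names `NUM_SHARDS`, `shard_id`, `sharded_key`, and `random_key` are hypothetical, and the shard count of 16 is an arbitrary assumption, not a recommendation.

```python
import hashlib
import uuid

NUM_SHARDS = 16  # hypothetical shard count; tune for your workload


def shard_id(natural_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Derive a stable shard number from a hash of the natural key."""
    digest = hashlib.sha256(natural_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def sharded_key(timestamp_iso: str, user_id: str) -> tuple:
    """Build a composite key whose first part spreads writes across shards.

    The timestamp is no longer the leading key part, so rows written at
    about the same time do not all land next to each other.
    """
    return (shard_id(timestamp_iso + user_id), timestamp_iso, user_id)


def random_key() -> str:
    """Alternative: a version 4 UUID, which is random rather than sequential."""
    return str(uuid.uuid4())
```

The trade-off of the sharded form is that a range scan over the timestamp has to read every shard and merge the results.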
Understand performance factors
Cloud Spanner shards data automatically and stores it in splits, so the more targeted a query is, the better it performs. A query narrowed to a single interleaved parent and all its children will perform better than queries or operations that touch rows scattered across multiple splits.
We highly recommend benchmarking and testing at scale to ensure issues and bottlenecks are uncovered prior to launch. Cloud Spanner provides query execution plans that can be used with tables during schema design to understand how queries are likely to perform.
Other performance factors to take into account:
- Prefer read-only transactions over more expensive read-write transactions when you are not writing data.
- Design your application to minimize the number of split participants in a transaction. Cloud Spanner can perform transactions across rows on different servers; however, as a rule of thumb, transactions that affect many co-located rows are faster and cheaper than transactions that affect many rows scattered throughout the database, or throughout a large table.
- Use query parameters rather than string literals to improve query performance and statistics monitoring.
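The query parameter recommendation above can be sketched in Python as follows. The table `Singers`, its columns, and the helper `build_singer_query` are hypothetical. The point of the sketch is that the SQL text stays identical across calls while only the parameter values change. With the Python client library, such a pair is typically passed to `execute_sql` through its `params` and `param_types` arguments; that call appears only in a comment here so the block runs without the library installed.

```python
from typing import Any, Dict, Tuple

# Hypothetical table and column names, used for illustration only.
SINGER_QUERY = (
    "SELECT SingerId, FirstName FROM Singers "
    "WHERE LastName = @last_name AND Age >= @min_age"
)


def build_singer_query(last_name: str, min_age: int) -> Tuple[str, Dict[str, Any]]:
    """Return constant SQL text plus the values bound to its parameters."""
    params = {"last_name": last_name, "min_age": min_age}
    return SINGER_QUERY, params


sql, params = build_singer_query("Smith", 30)
# Assumed usage with the google-cloud-spanner client (not executed here):
#   snapshot.execute_sql(sql, params=params, param_types={...})
```

Because the statement text never embeds the values, repeated executions share one query shape, which is what lets plans be reused and statistics be grouped per query.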
Understand limits and quotas
For architectural reasons, and to maintain its high performance and redundancy, Cloud Spanner has certain quotas and limits that should be considered in application design. Quotas can be increased, but allow lead time for the request.
For example, there is a limit of 20,000 mutations per commit, and a maximum of 15 joins per query.
These limits, along with schema design and hotspot prevention, affect bulk loading, so make sure you follow bulk loading best practices.
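To show how the mutation limit shapes bulk writes, here is a minimal Python sketch that groups rows into batches sized to stay under a per-commit budget. It assumes, as a simplification, that inserting one row costs one mutation per column and it ignores any extra mutations caused by secondary indexes. The helper name `batches_within_limit` and the 80% headroom are hypothetical choices.

```python
from typing import Iterable, Iterator, List, Sequence

MUTATION_LIMIT = 20_000  # per-commit limit quoted in this checklist


def batches_within_limit(
    rows: Iterable[Sequence],
    num_columns: int,
    limit: int = MUTATION_LIMIT,
    headroom_percent: int = 80,
) -> Iterator[List[Sequence]]:
    """Yield lists of rows whose estimated mutation count fits one commit.

    Assumption: each inserted row costs `num_columns` mutations.
    """
    if num_columns <= 0:
        raise ValueError("num_columns must be positive")
    budget = limit * headroom_percent // 100
    rows_per_batch = max(1, budget // num_columns)
    batch: List[Sequence] = []
    for row in rows:
        batch.append(row)
        if len(batch) == rows_per_batch:
            yield batch
            batch = []
    if batch:  # emit the final, partially filled batch
        yield batch
```

Each yielded batch would then be written in its own commit. The headroom leaves room for index-related mutations that this estimate does not count.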
Ensure monitoring is in place
Set up Cloud Monitoring to alert you when you are approaching your limits. The default monitoring with Cloud Monitoring requires an active workspace to be set up. This should be done before going live with your application.
Increase the number of nodes when your Cloud Spanner instance reaches the recommended performance thresholds, so that it keeps scaling linearly. We recommend keeping CPU utilization below 65% for regional instances and below 45% for multi-regional instances.
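The CPU guidance above can be turned into a rough sizing check. This Python sketch assumes that CPU load spreads evenly and that capacity scales linearly with node count, which is a simplification. The function name `suggested_nodes` and the configuration labels used as dictionary keys are hypothetical.

```python
import math

# Recommended maximum CPU utilization from this checklist, as fractions.
CPU_THRESHOLDS = {"regional": 0.65, "multi-regional": 0.45}


def suggested_nodes(current_nodes: int, cpu_utilization: float, config: str) -> int:
    """Estimate the node count that brings CPU back under the threshold.

    Assumes linear scaling: total work (nodes x utilization) stays
    constant while it is spread over more nodes.
    """
    threshold = CPU_THRESHOLDS[config]
    if cpu_utilization < threshold:
        return current_nodes  # already within the recommended range
    needed = math.ceil(current_nodes * cpu_utilization / threshold)
    return max(current_nodes + 1, needed)
```

For example, `suggested_nodes(3, 0.80, "regional")` evaluates to 4, since 3 nodes at 80% is about 2.4 nodes' worth of work and 2.4 / 0.65 rounds up to 4. Treat the result as a starting point for testing, not as a guarantee.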
Use OpenCensus and Cloud Trace with the client libraries to track and troubleshoot application latency.
Have a data migration strategy (if required)
Bulk loading of data into Cloud Spanner should take into account the distributed architecture to maintain performance:
- Partition data by primary key
- Avoid pushback and monitor CPU utilization
- Create secondary indexes after loading the data, which is generally faster
This blog post is a good example of implementing high throughput writes.
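The "partition data by primary key" step above can be sketched in Python as follows: sort the rows by key, then cut them into contiguous ranges so that each loader worker writes one range of neighboring keys. The helper name `partition_by_key` and the `num_partitions` value are hypothetical, and the sketch assumes the data set fits in memory.

```python
from typing import Any, Callable, List, Sequence


def partition_by_key(
    rows: Sequence[Any],
    key: Callable[[Any], Any],
    num_partitions: int,
) -> List[List[Any]]:
    """Split rows into contiguous, key-ordered partitions of similar size."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    ordered = sorted(rows, key=key)
    size, extra = divmod(len(ordered), num_partitions)
    partitions: List[List[Any]] = []
    start = 0
    for i in range(num_partitions):
        # The first `extra` partitions take one additional row each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty partitions when rows are scarce
            partitions.append(ordered[start:end])
        start = end
    return partitions


# Hypothetical usage: rows are (primary_key, payload) tuples.
rows = [(5, "e"), (1, "a"), (4, "d"), (2, "b"), (3, "c"), (9, "i"), (7, "g")]
parts = partition_by_key(rows, key=lambda r: r[0], num_partitions=3)
```

Each partition can then be handed to a separate worker, which keeps any one worker's writes within a narrow key range.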
Ensure security configuration is in place
Set up the relevant IAM roles to manage security at the database and instance level. Table-level security must be managed within the application.
Understand support options
Ensure that you have a strategy in place for getting support.