This page describes the latency metrics that Cloud Spanner provides. If your application experiences high latency, use these metrics to help you diagnose and resolve the issue.
Overview of latency metrics
The latency metrics for Cloud Spanner measure how long the Cloud Spanner service takes to process a request. Each metric captures the actual amount of time that elapsed (wall-clock time), not the amount of CPU time that Cloud Spanner used.
These latency metrics do not include latency that occurs outside of Cloud Spanner, such as network latency and latency within your application layer. To measure other types of latency, you can use Cloud Monitoring to instrument your application with custom metrics.
You can view charts of latency metrics in the Google Cloud console and in the Cloud Monitoring console. You can view combined latency metrics that include both reads and writes, or you can view separate metrics for reads and writes.
Based on the latency of each request, Cloud Spanner groups the requests into percentiles. You can view latency metrics for 50th percentile and 99th percentile latency:
50th percentile latency: The maximum latency, in seconds, for the fastest 50% of all requests. For example, if the 50th percentile latency is 0.5 seconds, then Cloud Spanner processed 50% of requests in 0.5 seconds or less.
This metric is sometimes called the median latency.
99th percentile latency: The maximum latency, in seconds, for the fastest 99% of requests. For example, if the 99th percentile latency is 2 seconds, then Cloud Spanner processed 99% of requests in 2 seconds or less.
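To make the percentile definitions concrete, the following minimal Python sketch computes 50th and 99th percentile latency from a list of request latencies. It assumes the nearest-rank definition of a percentile, which matches the wording above. How Cloud Spanner aggregates latency internally is not described on this page, so treat the function `percentile_latency` and the sample values as illustrative only.

```python
import math


def percentile_latency(latencies, pct):
    """Nearest-rank percentile: the maximum latency among the
    fastest pct% of requests (an assumed definition)."""
    if not latencies:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(latencies)
    # 1-based rank of the slowest request inside the fastest pct%.
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


# Hypothetical request latencies, in seconds.
samples = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.9, 2.0]

p50 = percentile_latency(samples, 50)
p99 = percentile_latency(samples, 99)
print(f"p50={p50}s p99={p99}s")
```

With these sample values, half of the requests finished in 0.3 seconds or less, while the 99th percentile is set entirely by the slowest request.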
Latency and operations per second
When an instance processes a small number of requests during a period of time, the 50th and 99th percentile latencies during that time are not meaningful indicators of the instance's overall performance. Under these conditions, a very small number of outliers can drastically change the latency metrics.
For example, suppose that an instance processes 100 requests during an hour. In this case, the 99th percentile latency for the instance during that hour is the amount of time it took to process the slowest request. A latency measurement based on a single request is not meaningful.
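The effect of sample size can be sketched in a few lines of Python. The example compares a hypothetical low-traffic hour with a hypothetical high-traffic hour that each contain the same two slow requests. It assumes the nearest-rank percentile definition. Under that definition it takes two slow requests out of 100 to move the 99th percentile, while interpolating definitions can react to a single one. All latency values are made up for illustration.

```python
def p99_nearest_rank(latencies):
    """99th percentile latency using the nearest-rank definition (assumed)."""
    ordered = sorted(latencies)
    # Integer form of ceil(0.99 * n), which avoids floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


FAST = 0.05  # hypothetical typical latency, in seconds
SLOW = 3.0   # hypothetical outlier latency, in seconds

# Both hours contain exactly two slow requests.
low_traffic = [FAST] * 98 + [SLOW] * 2       # 100 requests
high_traffic = [FAST] * 9998 + [SLOW] * 2    # 10,000 requests

low_p99 = p99_nearest_rank(low_traffic)
high_p99 = p99_nearest_rank(high_traffic)
print(f"100 requests:    p99 = {low_p99}s")
print(f"10,000 requests: p99 = {high_p99}s")
```

The two outliers dominate the 99th percentile for the low-traffic hour but leave it unchanged for the high-traffic hour, which is why the metric is more trustworthy when the request count is high.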
How to diagnose latency issues
The following sections describe how to diagnose several common issues that could cause your application to experience high end-to-end latency.
High total latency, low Cloud Spanner latency
If your application experiences latency that is higher than expected, but the latency metrics for Cloud Spanner are significantly lower than the total end-to-end latency, there might be an issue in your application code. If your application has a performance issue that causes some code paths to be slow, the total end-to-end latency for each request might increase.
To check for this issue, benchmark your application to identify code paths that are slower than expected.
You can also comment out the code that communicates with Cloud Spanner, then measure the total latency again. If the total latency doesn't change very much, then Cloud Spanner is unlikely to be the cause of the high latency.
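As a rough illustration of this kind of benchmarking, the following Python sketch times the Cloud Spanner portion of a request separately from the rest of the request. The functions `call_spanner_stub` and `application_logic` are hypothetical placeholders that only sleep. In a real application you would wrap your own database calls and business logic instead.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per labeled code path.
timings = defaultdict(float)


@contextmanager
def timed(label):
    """Add the elapsed wall-clock time of the enclosed block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] += time.perf_counter() - start


def call_spanner_stub():
    """Hypothetical stand-in for the code that communicates with Cloud Spanner."""
    time.sleep(0.01)


def application_logic():
    """Hypothetical stand-in for the rest of the request handling."""
    time.sleep(0.05)


def handle_request():
    with timed("total"):
        with timed("spanner"):
            call_spanner_stub()
        with timed("application"):
            application_logic()


handle_request()
non_spanner_share = 1 - timings["spanner"] / timings["total"]
print(f"time spent outside the database call: {non_spanner_share:.0%}")
```

If most of the total time falls outside the database call, as it does with these placeholder functions, the application code is the more likely place to look. Note that the time measured around a real client call also includes network latency, which the Cloud Spanner latency metrics exclude.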
High total latency, high Cloud Spanner latency
If your application experiences latency that is higher than expected, and the Cloud Spanner latency metrics are also high, there are a few likely causes:
Your instance needs more compute capacity. If your instance does not have enough CPU resources, and its CPU utilization exceeds the recommended maximum, then Cloud Spanner might not be able to process your requests quickly and efficiently.
Some of your queries cause high CPU utilization. If your queries do not take advantage of Cloud Spanner features that improve efficiency, such as query parameters and secondary indexes, or if they include a large number of joins or other CPU-intensive operations, the queries can use a large portion of the CPU resources for your instance.
To check for these issues, use the Cloud Monitoring console to look for a correlation between high CPU utilization and high latency. Also, check the query statistics for your instance to identify any CPU-intensive queries during the same time period.
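One way to quantify such a correlation, once you have both time series side by side, is the Pearson correlation coefficient. The sketch below is a minimal Python example that assumes you have already exported CPU utilization and 99th percentile latency for the same time intervals. The sample values are hypothetical, and the export step itself is not shown.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples for the same seven time intervals.
cpu_utilization = [0.35, 0.40, 0.55, 0.80, 0.92, 0.88, 0.50]  # fraction of CPU
p99_latency_s = [0.12, 0.13, 0.18, 0.45, 0.90, 0.75, 0.16]    # seconds

r = pearson(cpu_utilization, p99_latency_s)
print(f"correlation between CPU utilization and p99 latency: {r:.2f}")
```

A coefficient close to 1 suggests that latency rises and falls together with CPU utilization. It does not prove that CPU pressure is the cause, so it is still worth checking the query statistics for the same period.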
If you find that CPU utilization and latency are both high at the same time, take action to address the issue:
If you did not find many CPU-intensive queries, add compute capacity to the instance.
Adding compute capacity provides more CPU resources and enables Cloud Spanner to handle a larger workload.
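If you found CPU-intensive queries, revise them to take advantage of features that improve efficiency, such as query parameters and secondary indexes. You might also need to review the schema design for the database and update the schema to allow for more efficient queries.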
What's next
- Monitor your instance with the Google Cloud console or the Cloud Monitoring console.
- Learn how to find correlations between high latency and other metrics.
- Understand how to reduce read latency by following SQL best practices and using timestamp bounds.
- Find out about latency metrics in query statistics tables, which you can retrieve using SQL statements.
- Understand how instance configuration affects latency.