This topic describes how to examine a Spanner component to find the source of latency and visualize that latency using OpenTelemetry. For a high-level overview of the components in this topic, see Latency points in a Spanner request.
OpenTelemetry is an open source observability framework and toolkit that lets you create and manage telemetry data such as traces, metrics, and logs. It is the result of a merger between OpenTracing and OpenCensus. For more information, see What is OpenTelemetry?
Spanner client libraries provide metrics and traces through the OpenTelemetry observability framework. Follow the procedure in Identify the latency point to find the component or components that are showing latency in Spanner.
Before you begin
Before you begin capturing latency metrics, familiarize yourself with manual instrumentation with OpenTelemetry. You must configure the OpenTelemetry SDK with appropriate options for exporting your telemetry data. There are multiple OpenTelemetry exporter options available. We recommend using the OpenTelemetry Protocol (OTLP) exporter. Other options include using an OTel Collector with Google Cloud Exporter or Google Managed Service for Prometheus Exporter.
If you are running your application on Compute Engine, you can use the Ops Agent to collect OpenTelemetry Protocol metrics and traces. For more information, see Collect OTLP metrics and traces.
Add dependencies
To configure the OpenTelemetry SDK and OTLP exporter, add the following dependencies to your application.
Java
Go
Inject the OpenTelemetry object
Then, create an OpenTelemetry object with the OTLP exporter and inject the OpenTelemetry object using SpannerOptions.
Java
Go
Capture and visualize client round-trip latency
Client round-trip latency is the length of time (in milliseconds) between when the client sends the first byte of the Spanner API request to the database (through both the GFE and the Spanner API frontend) and when the client receives the last byte of the response from the database.
Capture client round-trip latency
The Spanner client round-trip latency metric is not supported using OpenTelemetry. You can instrument the metric using OpenCensus with a bridge and migrate the data to OpenTelemetry.
Visualize client round-trip latency
After retrieving the metrics, you can visualize client round-trip latency in Cloud Monitoring.
Here's an example of a graph that illustrates the 5th percentile latency for the client round-trip latency metric. To change the percentile latency to either the 50th or the 99th percentile, use the Aggregator menu.
The program creates a view called roundtrip_latency. This string becomes part of the name of the metric when it's exported to Cloud Monitoring.
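To make the percentile choices in the Aggregator menu more concrete, the sketch below computes the 5th, 50th, and 99th percentiles of a small set of latency samples with the nearest-rank method. The sample values are made up for illustration, and the percentile function is a hypothetical helper. Cloud Monitoring typically estimates percentiles from distribution buckets rather than from raw samples, so treat this as an illustration of the idea, not of how the chart is computed.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of samples using the
// nearest-rank method. It returns 0 for an empty slice and does not modify
// the input.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...) // work on a copy
	sort.Float64s(sorted)

	// Nearest rank: the smallest rank r such that at least p percent of the
	// samples are at or below sorted[r-1].
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical round-trip latencies in milliseconds; two slow outliers.
	latenciesMs := []float64{12, 15, 11, 90, 14, 13, 16, 18, 17, 250}

	for _, p := range []float64{5, 50, 99} {
		fmt.Printf("p%.0f = %.0f ms\n", p, percentile(latenciesMs, p))
	}
}
```

In this data set the median stays close to the typical request, while the 99th percentile is dominated by the slowest request. That is why comparing a low and a high percentile is useful when you look for tail latency.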
Capture and visualize GFE latency
Google Front End (GFE) latency is the length of time (in milliseconds) between when the Google network receives a remote procedure call from the client and when the GFE receives the first byte of the response.
Capture GFE latency
You can capture GFE latency metrics by enabling the following options using the Spanner client library.
Java
Go
Visualize GFE latency
After retrieving the metrics, you can visualize GFE latency in Cloud Monitoring.
Here's an example of a graph that illustrates the distribution aggregation for the GFE latency metric. To change the percentile latency to the 5th, 50th, 95th, or 99th percentile, use the Aggregator menu.
The program creates a view called spanner/gfe_latency. This string becomes part of the name of the metric when it's exported to Cloud Monitoring.
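As a rough illustration of how a view name can end up inside a metric name, the sketch below joins an exporter prefix and a view name, and then builds a Cloud Monitoring filter expression for it. The prefix is an assumption: custom.googleapis.com/opencensus is used here as an example of an exporter default, but the actual prefix depends on which exporter you use and how it is configured, so confirm the full metric name in Metrics Explorer. The helper names metricType and monitoringFilter are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// metricType joins an exporter prefix and a view name into a metric type,
// making sure exactly one slash separates them.
// The prefix is an assumption; it depends on your exporter configuration.
func metricType(prefix, viewName string) string {
	return strings.TrimSuffix(prefix, "/") + "/" + strings.TrimPrefix(viewName, "/")
}

// monitoringFilter builds a Cloud Monitoring filter that selects one metric
// type, for example to paste into a query or pass to an API call.
func monitoringFilter(mt string) string {
	return fmt.Sprintf("metric.type = %q", mt)
}

func main() {
	// Assumed prefix; replace it with the one your exporter actually uses.
	const assumedPrefix = "custom.googleapis.com/opencensus"

	mt := metricType(assumedPrefix, "spanner/gfe_latency")
	fmt.Println(mt)
	fmt.Println(monitoringFilter(mt))
}
```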
Capture and visualize Spanner API request latency
Spanner API request latency is the length of time (in seconds) between the first byte of the request that the Spanner API frontend receives and the last byte of the response that the Spanner API frontend sends.
Capture Spanner API request latency
By default, this latency is available as part of Cloud Monitoring metrics. You don't have to do anything to capture and export it.
Visualize Spanner API request latency
You can use the Metrics Explorer charting tool to visualize the graph for the spanner.googleapis.com/api/request_latencies metric in Cloud Monitoring.
Here's an example of a graph that illustrates the 5th percentile latency for the Spanner API request latency metric. To change the percentile latency to either the 50th or the 99th percentile, use the Aggregator menu.
Capture and visualize query latency
Query latency is the length of time (in milliseconds) that it takes to run SQL queries in the Spanner database.
Capture query latency
You can capture query latency for the following languages:
Java
Go
Visualize query latency
After retrieving the metrics, you can visualize query latency in Cloud Monitoring.
Here's an example of a graph that illustrates the distribution aggregation for the query latency metric. To change the percentile latency to the 5th, 50th, 95th, or 99th percentile, use the Aggregator menu.
The program creates an OpenCensus view called query_stats_elapsed. This string becomes part of the name of the metric when it's exported to Cloud Monitoring.
What's next
Learn more about OpenTelemetry.
Learn how to migrate to OpenTelemetry.
Learn how to use metrics to diagnose latency.