Examine latency in a Spanner component with OpenCensus
This topic describes how to examine a Spanner component
to find the source of latency and visualize that latency using OpenCensus.
For a high-level overview of the components in this topic, see
Latency points in a Spanner request.
Spanner client libraries provide statistics and traces through the
OpenCensus observability framework. The framework gives insight into the
internals of the client and helps you troubleshoot end-to-end
(round-trip) latency. By default, the framework is disabled.
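The samples later in this topic enable the framework per metric, but the overall pattern is the same: register the OpenCensus views you want, then create and start an exporter to Cloud Monitoring. Here's a minimal Go sketch of that pattern (this is not one of the official samples; the project ID is a placeholder):
import (
    "contrib.go.opencensus.io/exporter/stackdriver"
    "go.opencensus.io/plugin/ocgrpc"
    "go.opencensus.io/stats/view"
)
// enableOpenCensusMetrics registers the default gRPC client views and starts a
// Cloud Monitoring (Stackdriver) exporter. Callers should defer
// exporter.StopMetricsExporter() and exporter.Flush() before exiting.
// "my-project" is a placeholder project ID.
func enableOpenCensusMetrics() (*stackdriver.Exporter, error) {
    if err := view.Register(ocgrpc.DefaultClientViews...); err != nil {
        return nil, err
    }
    sd, err := stackdriver.NewExporter(stackdriver.Options{ProjectID: "my-project"})
    if err != nil {
        return nil, err
    }
    if err := sd.StartMetricsExporter(); err != nil {
        return nil, err
    }
    return sd, nil
}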
Before you begin
Use the procedure in Identify the latency point
to find the component or components that are showing latency.
Capture and visualize client round-trip latency
Client round-trip latency is the length of time (in milliseconds) between the
first byte of the Spanner API request that the client sends to
the database (through both the Google Front End (GFE) and the
Spanner API front end) and the last byte of the response that the
client receives from the database.
Capture client round-trip latency
You can capture client round-trip latency for the following languages:
Java
static void captureGrpcMetric(DatabaseClient dbClient) {
// Add io.grpc:grpc-census and io.opencensus:opencensus-exporter-stats-stackdriver
// dependencies to enable gRPC metrics.
// Register basic gRPC views.
RpcViews.registerClientGrpcBasicViews();
// Enable OpenCensus exporters to export metrics to Stackdriver Monitoring.
// Exporters use Application Default Credentials to authenticate.
// See https://developers.google.com/identity/protocols/application-default-credentials
// for more details.
try {
StackdriverStatsExporter.createAndRegister();
} catch (IOException | IllegalStateException e) {
System.out.println("Error during StackdriverStatsExporter");
}
try (ResultSet resultSet =
dbClient
.singleUse() // Execute a single read or query against Cloud Spanner.
.executeQuery(Statement.of("SELECT SingerId, AlbumId, AlbumTitle FROM Albums"))) {
while (resultSet.next()) {
System.out.printf(
"%d %d %s", resultSet.getLong(0), resultSet.getLong(1), resultSet.getString(2));
}
}
}
Go
import (
"context"
"fmt"
"io"
"regexp"
"cloud.google.com/go/spanner"
"google.golang.org/api/iterator"
"contrib.go.opencensus.io/exporter/stackdriver"
"go.opencensus.io/plugin/ocgrpc"
"go.opencensus.io/stats/view"
)
var validDatabasePattern = regexp.MustCompile("^projects/(?P<project>[^/]+)/instances/(?P<instance>[^/]+)/databases/(?P<database>[^/]+)$")
func queryWithGRPCMetric(w io.Writer, db string) error {
projectID, _, _, err := parseDatabaseName(db)
if err != nil {
return err
}
ctx := context.Background()
client, err := spanner.NewClient(ctx, db)
if err != nil {
return err
}
defer client.Close()
// Register OpenCensus views.
if err := view.Register(ocgrpc.DefaultClientViews...); err != nil {
return err
}
// Create OpenCensus Stackdriver exporter.
sd, err := stackdriver.NewExporter(stackdriver.Options{
ProjectID: projectID,
})
if err != nil {
return err
}
// It is imperative to invoke flush before your main function exits
defer sd.Flush()
// Start the metrics exporter
sd.StartMetricsExporter()
defer sd.StopMetricsExporter()
stmt := spanner.Statement{SQL: `SELECT SingerId, AlbumId, AlbumTitle FROM Albums`}
iter := client.Single().Query(ctx, stmt)
defer iter.Stop()
for {
row, err := iter.Next()
if err == iterator.Done {
return nil
}
if err != nil {
return err
}
var singerID, albumID int64
var albumTitle string
if err := row.Columns(&singerID, &albumID, &albumTitle); err != nil {
return err
}
fmt.Fprintf(w, "%d %d %s\n", singerID, albumID, albumTitle)
}
}
func parseDatabaseName(databaseUri string) (project, instance, database string, err error) {
matches := validDatabasePattern.FindStringSubmatch(databaseUri)
if len(matches) == 0 {
return "", "", "", fmt.Errorf("failed to parse database name from %q according to pattern %q",
databaseUri, validDatabasePattern.String())
}
return matches[1], matches[2], matches[3], nil
}
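With the functions above in a package main, a hypothetical caller might look like the following sketch. The database URI is a placeholder, and the sketch assumes "log" and "os" are imported:
func main() {
    // Placeholder database URI in the form the regular expression above expects.
    db := "projects/my-project/instances/my-instance/databases/my-database"
    if err := queryWithGRPCMetric(os.Stdout, db); err != nil {
        log.Fatal(err)
    }
}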
Visualize client round-trip latency
After retrieving the metrics, you can visualize client round-trip latency in
Cloud Monitoring.
Here's an example of a graph that illustrates the 5th percentile latency for the
client round-trip latency metric. To change the percentile latency to either
the 50th or the 99th percentile, use the Aggregator menu.
The program creates an OpenCensus view called roundtrip_latency.
This string becomes part of the name of the metric when it's exported to Cloud Monitoring.
Capture and visualize GFE latency
Google Front End (GFE) latency is the length of time (in milliseconds) between
when the Google network receives a remote procedure call from the client and
when the GFE receives the first byte of the response.
Capture GFE latency
You can capture GFE latency for the following languages:
Java
static void captureGfeMetric(DatabaseClient dbClient) {
// Capture GFE Latency.
SpannerRpcViews.registerGfeLatencyView();
// Capture GFE Latency and GFE Header missing count.
// SpannerRpcViews.registerGfeLatencyAndHeaderMissingCountViews();
// Capture only GFE Header missing count.
// SpannerRpcViews.registerGfeHeaderMissingCountView();
// Enable OpenCensus exporters to export metrics to Stackdriver Monitoring.
// Exporters use Application Default Credentials to authenticate.
// See https://developers.google.com/identity/protocols/application-default-credentials
// for more details.
try {
StackdriverStatsExporter.createAndRegister();
} catch (IOException | IllegalStateException e) {
System.out.println("Error during StackdriverStatsExporter");
}
try (ResultSet resultSet =
dbClient
.singleUse() // Execute a single read or query against Cloud Spanner.
.executeQuery(Statement.of("SELECT SingerId, AlbumId, AlbumTitle FROM Albums"))) {
while (resultSet.next()) {
System.out.printf(
"%d %d %s", resultSet.getLong(0), resultSet.getLong(1), resultSet.getString(2));
}
}
}
Go
// We are in the process of adding support in the Cloud Spanner Go Client Library
// to capture the gfe_latency metric.
import (
"context"
"fmt"
"io"
"strconv"
"strings"
spanner "cloud.google.com/go/spanner/apiv1"
sppb "cloud.google.com/go/spanner/apiv1/spannerpb"
gax "github.com/googleapis/gax-go/v2"
"google.golang.org/grpc"
"google.golang.org/grpc/metadata"
"contrib.go.opencensus.io/exporter/stackdriver"
"go.opencensus.io/stats"
"go.opencensus.io/stats/view"
"go.opencensus.io/tag"
)
// OpenCensus Tag, Measure and View.
var (
KeyMethod = tag.MustNewKey("grpc_client_method")
GFELatencyMs = stats.Int64("cloud.google.com/go/spanner/gfe_latency",
"Latency between Google's network receives an RPC and reads back the first byte of the response", "ms")
GFELatencyView = view.View{
Name: "cloud.google.com/go/spanner/gfe_latency",
Measure: GFELatencyMs,
Description: "Latency between Google's network receives an RPC and reads back the first byte of the response",
Aggregation: view.Distribution(0.0, 0.01, 0.05, 0.1, 0.3, 0.6, 0.8, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 13.0,
16.0, 20.0, 25.0, 30.0, 40.0, 50.0, 65.0, 80.0, 100.0, 130.0, 160.0, 200.0, 250.0,
300.0, 400.0, 500.0, 650.0, 800.0, 1000.0, 2000.0, 5000.0, 10000.0, 20000.0, 50000.0,
100000.0),
TagKeys: []tag.Key{KeyMethod}}
)
func queryWithGFELatency(w io.Writer, db string) error {
projectID, _, _, err := parseDatabaseName(db)
if err != nil {
return err
}
ctx := context.Background()
client, err := spanner.NewClient(ctx)
if err != nil {
return err
}
defer client.Close()
// Register OpenCensus views.
err = view.Register(&GFELatencyView)
if err != nil {
return err
}
// Create OpenCensus Stackdriver exporter.
sd, err := stackdriver.NewExporter(stackdriver.Options{
ProjectID: projectID,
})
if err != nil {
return err
}
// It is imperative to invoke flush before your main function exits
defer sd.Flush()
// Start the metrics exporter
sd.StartMetricsExporter()
defer sd.StopMetricsExporter()
// Create a session.
req := &sppb.CreateSessionRequest{Database: db}
session, err := client.CreateSession(ctx, req)
if err != nil {
return err
}
// Execute a SQL query and retrieve the GFE server-timing header in gRPC metadata.
req2 := &sppb.ExecuteSqlRequest{
Session: session.Name,
Sql: `SELECT SingerId, AlbumId, AlbumTitle FROM Albums`,
}
var md metadata.MD
resultSet, err := client.ExecuteSql(ctx, req2, gax.WithGRPCOptions(grpc.Header(&md)))
if err != nil {
return err
}
for _, row := range resultSet.GetRows() {
for _, value := range row.GetValues() {
fmt.Fprintf(w, "%s ", value.GetStringValue())
}
fmt.Fprintf(w, "\n")
}
// The format is: "server-timing: gfet4t7; dur=[GFE latency in ms]"
srvTiming := md.Get("server-timing")[0]
gfeLtcy, err := strconv.Atoi(strings.TrimPrefix(srvTiming, "gfet4t7; dur="))
if err != nil {
return err
}
// Record GFE t4t7 latency with OpenCensus.
ctx, err = tag.New(ctx, tag.Insert(KeyMethod, "ExecuteSql"))
if err != nil {
return err
}
stats.Record(ctx, GFELatencyMs.M(int64(gfeLtcy)))
return nil
}
Visualize GFE latency
After retrieving the metrics, you can visualize GFE latency in
Cloud Monitoring.
Here's an example of a graph that illustrates the 5th percentile latency for the
GFE latency metric. To change the percentile latency to either the 50th or the
99th percentile, use the Aggregator menu.
The program creates an OpenCensus view called gfe_latency. This
string becomes part of the name of the metric when it's exported to
Cloud Monitoring.
Capture and visualize Spanner API request latency
Spanner API request latency is the length of time (in seconds)
between the first byte of a request that the Spanner API frontend
receives and the last byte of a response that the Spanner API
frontend sends.
Capture Spanner API request latency
By default, this latency is available as part of
Cloud Monitoring metrics.
You don't have to do anything to capture and export it.
Visualize Spanner API request latency
You can use the Metrics Explorer
charting tool to visualize the graph for the spanner.googleapis.com/api/request_latencies
metric in Cloud Monitoring.
Here's an example of a graph that illustrates the 5th percentile latency for the
Spanner API request latency metric. To change the percentile
latency to either the 50th or the 99th percentile, use the Aggregator
menu.
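If you want to read this metric programmatically rather than in Metrics Explorer, a sketch along the following lines with the Cloud Monitoring Go client should work. The project ID and the ten-minute window are placeholders, and the import paths assume a recent version of the cloud.google.com/go/monitoring module:
import (
    "context"
    "fmt"
    "io"
    "time"
    monitoring "cloud.google.com/go/monitoring/apiv3/v2"
    "cloud.google.com/go/monitoring/apiv3/v2/monitoringpb"
    "google.golang.org/api/iterator"
    "google.golang.org/protobuf/types/known/timestamppb"
)
// listRequestLatencies prints recent time series for the
// spanner.googleapis.com/api/request_latencies metric.
// "my-project" is a placeholder project ID.
func listRequestLatencies(w io.Writer) error {
    ctx := context.Background()
    client, err := monitoring.NewMetricClient(ctx)
    if err != nil {
        return err
    }
    defer client.Close()
    now := time.Now()
    req := &monitoringpb.ListTimeSeriesRequest{
        Name:   "projects/my-project",
        Filter: `metric.type="spanner.googleapis.com/api/request_latencies"`,
        Interval: &monitoringpb.TimeInterval{
            StartTime: timestamppb.New(now.Add(-10 * time.Minute)),
            EndTime:   timestamppb.New(now),
        },
        View: monitoringpb.ListTimeSeriesRequest_FULL,
    }
    it := client.ListTimeSeries(ctx, req)
    for {
        ts, err := it.Next()
        if err == iterator.Done {
            return nil
        }
        if err != nil {
            return err
        }
        // Print the metric labels (for example, method and status) for each series.
        fmt.Fprintf(w, "%v\n", ts.GetMetric().GetLabels())
    }
}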
Capture and visualize query latency
Query latency is the length of time (in milliseconds) that it takes to run SQL
queries in the Spanner database.
Capture query latency
You can capture query latency for the following languages:
Java
private static final String MILLISECOND = "ms";
static final List<Double> RPC_MILLIS_BUCKET_BOUNDARIES =
Collections.unmodifiableList(
Arrays.asList(
0.0, 0.01, 0.05, 0.1, 0.3, 0.6, 0.8, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 13.0,
16.0, 20.0, 25.0, 30.0, 40.0, 50.0, 65.0, 80.0, 100.0, 130.0, 160.0, 200.0, 250.0,
300.0, 400.0, 500.0, 650.0, 800.0, 1000.0, 2000.0, 5000.0, 10000.0, 20000.0, 50000.0,
100000.0));
static final Aggregation AGGREGATION_WITH_MILLIS_HISTOGRAM =
Distribution.create(BucketBoundaries.create(RPC_MILLIS_BUCKET_BOUNDARIES));
static MeasureDouble QUERY_STATS_ELAPSED =
MeasureDouble.create(
"cloud.google.com/java/spanner/query_stats_elapsed",
"The execution of the query",
MILLISECOND);
// Register the view. It is imperative that this step exists,
// otherwise recorded metrics will be dropped and never exported.
static View QUERY_STATS_LATENCY_VIEW = View
.create(Name.create("cloud.google.com/java/spanner/query_stats_elapsed"),
"The execution of the query",
QUERY_STATS_ELAPSED,
AGGREGATION_WITH_MILLIS_HISTOGRAM,
Collections.emptyList());
static ViewManager manager = Stats.getViewManager();
private static final StatsRecorder STATS_RECORDER = Stats.getStatsRecorder();
static void captureQueryStatsMetric(DatabaseClient dbClient) {
manager.registerView(QUERY_STATS_LATENCY_VIEW);
// Enable OpenCensus exporters to export metrics to Cloud Monitoring.
// Exporters use Application Default Credentials to authenticate.
// See https://developers.google.com/identity/protocols/application-default-credentials
// for more details.
try {
StackdriverStatsExporter.createAndRegister();
} catch (IOException | IllegalStateException e) {
System.out.println("Error during StackdriverStatsExporter");
}
try (ResultSet resultSet = dbClient.singleUse()
.analyzeQuery(Statement.of("SELECT SingerId, AlbumId, AlbumTitle FROM Albums"),
QueryAnalyzeMode.PROFILE)) {
while (resultSet.next()) {
System.out.printf(
"%d %d %s", resultSet.getLong(0), resultSet.getLong(1), resultSet.getString(2));
}
Value value = resultSet.getStats().getQueryStats()
.getFieldsOrDefault("elapsed_time", Value.newBuilder().setStringValue("0 msecs").build());
double elapsedTime = Double.parseDouble(value.getStringValue().replaceAll(" msecs", ""));
STATS_RECORDER.newMeasureMap()
.put(QUERY_STATS_ELAPSED, elapsedTime)
.record();
}
}
Go
import (
"context"
"fmt"
"io"
"strconv"
"strings"
"cloud.google.com/go/spanner"
"google.golang.org/api/iterator"
"contrib.go.opencensus.io/exporter/stackdriver"
"go.opencensus.io/stats"
"go.opencensus.io/stats/view"
"go.opencensus.io/tag"
)
// OpenCensus Tag, Measure and View.
var (
QueryStatsElapsed = stats.Float64("cloud.google.com/go/spanner/query_stats_elapsed",
"The execution of the query", "ms")
QueryStatsLatencyView = view.View{
Name: "cloud.google.com/go/spanner/query_stats_elapsed",
Measure: QueryStatsElapsed,
Description: "The execution of the query",
Aggregation: view.Distribution(0.0, 0.01, 0.05, 0.1, 0.3, 0.6, 0.8, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 13.0,
16.0, 20.0, 25.0, 30.0, 40.0, 50.0, 65.0, 80.0, 100.0, 130.0, 160.0, 200.0, 250.0,
300.0, 400.0, 500.0, 650.0, 800.0, 1000.0, 2000.0, 5000.0, 10000.0, 20000.0, 50000.0,
100000.0),
TagKeys: []tag.Key{}}
)
func queryWithQueryStats(w io.Writer, db string) error {
projectID, _, _, err := parseDatabaseName(db)
if err != nil {
return err
}
ctx := context.Background()
client, err := spanner.NewClient(ctx, db)
if err != nil {
return err
}
defer client.Close()
// Register OpenCensus views.
err = view.Register(&QueryStatsLatencyView)
if err != nil {
return err
}
// Create OpenCensus Stackdriver exporter.
sd, err := stackdriver.NewExporter(stackdriver.Options{
ProjectID: projectID,
})
if err != nil {
return err
}
// It is imperative to invoke flush before your main function exits
defer sd.Flush()
// Start the metrics exporter
sd.StartMetricsExporter()
defer sd.StopMetricsExporter()
// Execute a SQL query and get the query stats.
stmt := spanner.Statement{SQL: `SELECT SingerId, AlbumId, AlbumTitle FROM Albums`}
iter := client.Single().QueryWithStats(ctx, stmt)
defer iter.Stop()
for {
row, err := iter.Next()
if err == iterator.Done {
// Record query execution time with OpenCensus.
elapsedTime := iter.QueryStats["elapsed_time"].(string)
elapsedTimeMs, err := strconv.ParseFloat(strings.TrimSuffix(elapsedTime, " msecs"), 64)
if err != nil {
return err
}
stats.Record(ctx, QueryStatsElapsed.M(elapsedTimeMs))
return nil
}
if err != nil {
return err
}
var singerID, albumID int64
var albumTitle string
if err := row.Columns(&singerID, &albumID, &albumTitle); err != nil {
return err
}
fmt.Fprintf(w, "%d %d %s\n", singerID, albumID, albumTitle)
}
}
Visualize query latency
After retrieving the metrics, you can visualize query latency in
Cloud Monitoring.
Here's an example of a graph that illustrates the 5th percentile latency for the
query latency metric. To change the percentile latency to either the 50th or
the 99th percentile, use the Aggregator menu.
The program creates an OpenCensus view called query_stats_elapsed.
This string becomes part of the name of the metric when it's exported to
Cloud Monitoring.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2024-10-10 UTC."],[],[]]