Observability with proxyless gRPC applications
Microservices observability tools provide you with the ability to instrument your applications to collect and present telemetry data in Cloud Monitoring, Cloud Logging, and Cloud Trace from gRPC workloads deployed on Google Cloud, including gRPC workloads in Cloud Service Mesh.
gRPC clients and servers are integrated with OpenCensus to export metrics and traces to various backends, including Trace and Monitoring. You can do this with the following gRPC languages:
- C++
- Go
- Java
Read the Microservices observability overview, then use the instructions in Set up Microservices observability to instrument your gRPC workloads for the following:
- Cloud Monitoring and viewing metrics.
- Cloud Logging and viewing logs.
- Cloud Trace and viewing traces.
Use the instructions in this document for the following tasks:
- Viewing traces.
- Exposing the admin interface.
- Using the grpcdebug tool and other tools to debug your applications.
View traces on Trace
After you complete the setup process, your instrumented gRPC clients and servers send traces to Trace. The Trace Overview page in the Google Cloud console shows you a list of recent traces. You can select an individual trace to see a breakdown of your traffic, similar to what's described in the following section.
Trace compatibility with the Envoy proxy
Exporting traces to Trace with Cloud Service Mesh and the Envoy proxy, as described in Observability with Envoy, uses Envoy's OpenCensus tracer configuration, which allows traces exported from proxyless gRPC applications and Envoy proxies to be fully compatible within a service mesh. For compatibility with proxyless gRPC, the Envoy bootstrap needs to configure the trace context to include the GRPC_TRACE_BIN trace format in its OpenCensusConfig. For example:
tracing:
  http:
    name: envoy.tracers.opencensus
    typed_config:
      "@type": type.googleapis.com/envoy.config.trace.v2.OpenCensusConfig
      stackdriver_exporter_enabled: "true"
      stackdriver_project_id: "PROJECT_ID"
      incoming_trace_context: ["CLOUD_TRACE_CONTEXT", "GRPC_TRACE_BIN"]
      outgoing_trace_context: ["CLOUD_TRACE_CONTEXT", "GRPC_TRACE_BIN"]
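To make the two trace context formats in this configuration more concrete, the following Python sketch decodes a GRPC_TRACE_BIN payload and renders the same context in the CLOUD_TRACE_CONTEXT header style. It assumes the 29-byte OpenCensus binary layout (a version byte, then tagged trace ID, span ID, and options fields) with every field present, and it takes the raw bytes rather than the Base64 text carried on the wire. The function names and the sample payload are hypothetical and are not part of gRPC or Envoy.

```python
def decode_grpc_trace_bin(payload: bytes) -> dict:
    """Decode an OpenCensus binary trace context (assumed 29-byte layout)."""
    # Assumed layout: version(1) | 0x00 trace_id(16) | 0x01 span_id(8) | 0x02 options(1)
    if len(payload) != 29 or payload[0] != 0:
        raise ValueError("unsupported trace context payload")
    if payload[1] != 0 or payload[18] != 1 or payload[27] != 2:
        raise ValueError("unexpected field identifiers")
    return {
        "trace_id": payload[2:18].hex(),                   # 32 hex characters
        "span_id": int.from_bytes(payload[19:27], "big"),  # unsigned 64-bit
        "sampled": bool(payload[28] & 1),                  # lowest options bit
    }


def to_cloud_trace_header(ctx: dict) -> str:
    """Render the context as TRACE_ID/SPAN_ID;o=OPTIONS (span ID in decimal)."""
    return f"{ctx['trace_id']}/{ctx['span_id']};o={int(ctx['sampled'])}"


# Hypothetical payload assembled by hand for illustration only.
sample = (
    bytes([0, 0]) + bytes(range(16))
    + bytes([1]) + (123).to_bytes(8, "big")
    + bytes([2, 1])
)
print(to_cloud_trace_header(decode_grpc_trace_bin(sample)))
```

Both formats carry the same trace ID, span ID, and sampling decision, which is why listing both in incoming_trace_context and outgoing_trace_context lets Envoy and proxyless gRPC participate in the same trace.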
Expose the admin interface
Sometimes, metrics and tracing data are not sufficient for resolving an issue. You might need to look at the configuration or the runtime state of the gRPC library in your application. This information includes resolver information, the state of connectivity to peers, RPC statistics on a channel, and the configuration received from Cloud Service Mesh.
To obtain such information, gRPC applications can expose the admin interface on a particular port. You can then query the application to understand how the services are configured and how they are running. In this section, you can find instructions about how to configure the admin interface for applications written in each supported language.
We recommend that you build a separate gRPC server in your application that listens on a port reserved for this purpose. This lets you access your gRPC applications even when the data ports are inaccessible because of misconfiguration or network issues. We recommend that you expose the admin interface only on localhost or on a Unix domain socket.
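One way to enforce this recommendation is to validate the admin listen address before starting the server. The following Python sketch is a minimal check under stated assumptions: the helper name is hypothetical, the address uses a host:port form or a unix: prefix, and only the literal name localhost is treated as local without a DNS lookup.

```python
import ipaddress


def is_local_only(listen_address: str) -> bool:
    """Return True if the address is a Unix socket or a loopback host:port."""
    if listen_address.startswith("unix:"):
        return True
    host, _, _port = listen_address.rpartition(":")
    host = host.strip("[]")  # drop the brackets around IPv6 literals
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # An empty host (":50051") binds all interfaces; other hostnames
        # are not resolved here and are treated as non-local.
        return False


for address in ("localhost:28881", "[::1]:28881", ":50051", "[::]:50051"):
    print(address, is_local_only(address))
```

The check is deliberately conservative: anything it can't prove to be loopback is reported as non-local.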
The following code snippets show how to create an admin interface.
C++
In C++, use this code to create an admin interface:
#include <grpcpp/ext/admin_services.h>

grpc::ServerBuilder builder;
grpc::AddAdminServices(&builder);
builder.AddListeningPort(":50051", grpc::ServerCredentials(...));
std::unique_ptr<grpc::Server> server(builder.BuildAndStart());
Go
In Go, use this code to create an admin interface:
import "google.golang.org/grpc/admin"

lis, err := net.Listen("tcp", ":50051")
if err != nil {
        log.Fatalf("failed to listen: %v", err)
}
defer lis.Close()
grpcServer := grpc.NewServer(opts...)
cleanup, err := admin.Register(grpcServer)
if err != nil {
        log.Fatalf("failed to register admin services: %v", err)
}
defer cleanup()

if err := grpcServer.Serve(lis); err != nil {
        log.Fatalf("failed to serve: %v", err)
}
Java
In Java, use this code to create an admin interface:
import io.grpc.services.AdminInterface;

server = ServerBuilder.forPort(50051)
    .useTransportSecurity(certChainFile, privateKeyFile)
    .addServices(AdminInterface.getStandardServices())
    .build()
    .start();
server.awaitTermination();
Python
In Python, use this code to create an admin interface:
from concurrent import futures

import grpc
import grpc_admin

server = grpc.server(futures.ThreadPoolExecutor())
grpc_admin.add_admin_servicers(server)
server.add_insecure_port('[::]:50051')
server.start()
server.wait_for_termination()
Use SSH to connect to a VM
The gRPC Wallet example already enables the admin interface. You can change the admin interface port by providing the following flag:
--admin-port=PORT
The default admin port is localhost:28881.
To debug your gRPC application, you can use SSH to connect to one of the VMs that serves the wallet-service. This gives you access to localhost on that VM.
# List the Wallet VMs
$ gcloud compute instances list --filter="zone:(us-central1-a)" --filter="name~'grpcwallet-wallet-v2'"
NAME                                       ZONE           MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP   EXTERNAL_IP    STATUS
grpcwallet-wallet-v2-mig-us-central1-ccl1  us-central1-a  n1-standard-1               10.240.0.38   35.223.42.98   RUNNING
grpcwallet-wallet-v2-mig-us-central1-k623  us-central1-a  n1-standard-1               10.240.0.112  35.188.133.75  RUNNING

# Pick one of the Wallet VMs to debug
$ gcloud compute ssh grpcwallet-wallet-v2-mig-us-central1-ccl1 --zone=us-central1-a
Install the grpcdebug tool
To access the admin interface, you need a gRPC client that can communicate with the admin services in your gRPC application. In the following examples, you use a tool called grpcdebug that you can download and install on the VM or Pod where your gRPC application is running. The repository for grpcdebug is located at grpc-ecosystem/grpcdebug.
The minimum supported Golang version is 1.12. The official Golang installation guide is at the Golang site.
If you are following the guide to create a Linux VM for the wallet-service, you can install Golang 1.16 by using these commands:
sudo apt update && sudo apt install -y wget
wget https://golang.org/dl/go1.16.3.linux-amd64.tar.gz
sudo rm -rf /usr/local/go
sudo tar -C /usr/local -xzf go1.16.3.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
sudo ln -sf /usr/local/go/bin/go /usr/bin/go
go version
# go version go1.16.3 linux/amd64
You install the grpcdebug tool with the following commands:
go install -v github.com/grpc-ecosystem/grpcdebug@latest
export PATH=$PATH:$(go env GOPATH)/bin
You now have access to the grpcdebug command-line interface. The help output contains information about supported commands:
$ grpcdebug -h
grpcdebug is a gRPC service admin command-line interface

Usage:
  grpcdebug <target address> [flags] <command>

Available Commands:
  channelz    Display gRPC states in human readable way.
  health      Check health status of the target service (default "").
  help        Help about any command
  xds         Fetch xDS related information.

Flags:
      --credential_file string        Sets the path of the credential file; used in [tls] mode
  -h, --help                          Help for grpcdebug
      --security string               Defines the type of credentials to use [tls, google-default, insecure] (default "insecure")
      --server_name_override string   Overrides the peer server name if non empty; used in [tls] mode
  -t, --timestamp                     Print timestamp as RFC 3339 instead of human readable strings
  -v, --verbose                       Print verbose information for debugging
To obtain more information about a particular command, use the following:
grpcdebug <target address> [command] --help
Use the grpcdebug tool to debug your applications
You can use the grpcdebug tool to debug your applications.
The grpcdebug tool provides an ssh_config-like configuration that supports aliasing, hostname rewriting, and connection security setting (insecure/TLS). For more information about this advanced feature, see grpcdebug/Connect&Security.
The following sections describe the services exposed by the admin interface and how to access them.
Use Channelz
The Channelz service provides access to runtime information about the connections at different levels in the gRPC library of your application. You can use this for live analysis of applications that might have configuration- or network-related issues. The following examples assume that you deployed the gRPC Wallet example by using the instructions in Configure advanced traffic management with proxyless gRPC services and that you provided the following flag:
--admin-port=PORT
After you send some RPCs from a test client, as shown in Verifying the configuration, use the following commands to access the Channelz data for gRPC services:
- Use SSH to connect to a VM that is running the wallet-service.
- Set up grpcdebug to connect to the running gRPC application.
The default output of grpcdebug is in a console-friendly table format. If you supply the --json flag, the output is encoded as JSON.
The grpcdebug channelz command is used to fetch and present debugging information from the Channelz service. The command works for both gRPC clients and gRPC servers.
For gRPC clients, the command grpcdebug channelz channels provides a list of existing channels and some basic information:
$ grpcdebug localhost:28881 channelz channels
Channel ID   Target                               State   Calls(Started/Succeeded/Failed)   Created Time
1            xds:///account.grpcwallet.io:10080   READY   0/0/0                             59 seconds ago
2            trafficdirector.googleapis.com:443   READY   2/0/0                             59 seconds ago
4            xds:///stats.grpcwallet.io:10080     READY   0/0/0                             59 seconds ago
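When an application has many channels, it can help to filter this table for channels that are not in the READY state. The following Python sketch does that under the assumption that the table has one header line and whitespace-separated columns in the order shown in the example output. The helper name is hypothetical, and the TRANSIENT_FAILURE row in the sample is invented for illustration.

```python
def channels_not_ready(table: str) -> list:
    """Return (channel_id, target, state) for every channel that is not READY."""
    problems = []
    for line in table.strip().splitlines()[1:]:  # skip the header line
        fields = line.split(maxsplit=4)
        if len(fields) < 3:
            continue  # ignore blank or truncated lines
        channel_id, target, state = fields[:3]
        if state != "READY":
            problems.append((channel_id, target, state))
    return problems


# The last row is a hypothetical failing channel added for illustration.
sample_table = """\
Channel ID   Target                               State               Calls(Started/Succeeded/Failed)   Created Time
1            xds:///account.grpcwallet.io:10080   READY               0/0/0                             59 seconds ago
2            trafficdirector.googleapis.com:443   READY               2/0/0                             59 seconds ago
7            xds:///example.grpcwallet.io:10080   TRANSIENT_FAILURE   3/0/3                             2 minutes ago
"""
print(channels_not_ready(sample_table))
```

In practice, you would pass the text captured from the grpcdebug command instead of the sample string.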
If you need additional information about a particular channel, you can use grpcdebug channelz channel [CHANNEL_ID] to inspect detailed information for that channel. The channel identifier can be the channel ID or the target address, if there is only one target address. A gRPC channel can contain multiple subchannels; a subchannel is gRPC's abstraction on top of a TCP connection.
$ grpcdebug localhost:28881 channelz channel 2
Channel ID:        2
Target:            trafficdirector.googleapis.com:443
State:             READY
Calls Started:     2
Calls Succeeded:   0
Calls Failed:      0
Created Time:      10 minutes ago
---
Subchannel ID   Target                               State   Calls(Started/Succeeded/Failed)   CreatedTime
3               trafficdirector.googleapis.com:443   READY   2/0/0                             10 minutes ago
---
Severity   Time             Child Ref                      Description
CT_INFO    10 minutes ago                                  Channel Created
CT_INFO    10 minutes ago                                  parsed scheme: ""
CT_INFO    10 minutes ago                                  scheme "" not registered, fallback to default scheme
CT_INFO    10 minutes ago                                  ccResolverWrapper: sending update to cc: {[{trafficdirector.googleapis.com:443 <nil> 0 <nil>}] <nil> <nil>}
CT_INFO    10 minutes ago                                  Resolver state updated: {Addresses:[{Addr:trafficdirector.googleapis.com:443 ServerName: Attributes:<nil> Type:0 Metadata:<nil>}] ServiceConfig:<nil> Attributes:<nil>} (resolver returned new addresses)
CT_INFO    10 minutes ago                                  ClientConn switching balancer to "pick_first"
CT_INFO    10 minutes ago                                  Channel switches to new LB policy "pick_first"
CT_INFO    10 minutes ago   subchannel(subchannel_id:3 )   Subchannel(id:3) created
CT_INFO    10 minutes ago                                  Channel Connectivity change to CONNECTING
CT_INFO    10 minutes ago                                  Channel Connectivity change to READY
You can also inspect detailed information for a subchannel:
$ grpcdebug localhost:28881 channelz subchannel 3
Subchannel ID:     3
Target:            trafficdirector.googleapis.com:443
State:             READY
Calls Started:     2
Calls Succeeded:   0
Calls Failed:      0
Created Time:      12 minutes ago
---
Socket ID   Local->Remote                           Streams(Started/Succeeded/Failed)   Messages(Sent/Received)
9           10.240.0.38:60338->142.250.125.95:443   2/0/0                               214/132
You can retrieve information about TCP sockets:
$ grpcdebug localhost:28881 channelz socket 9
Socket ID:                       9
Address:                         10.240.0.38:60338->142.250.125.95:443
Streams Started:                 2
Streams Succeeded:               0
Streams Failed:                  0
Messages Sent:                   226
Messages Received:               141
Keep Alives Sent:                0
Last Local Stream Created:       12 minutes ago
Last Remote Stream Created:      a long while ago
Last Message Sent Created:       8 seconds ago
Last Message Received Created:   8 seconds ago
Local Flow Control Window:       65535
Remote Flow Control Window:      966515
---
Socket Options Name   Value
SO_LINGER             [type.googleapis.com/grpc.channelz.v1.SocketOptionLinger]:{duration:{}}
SO_RCVTIMEO           [type.googleapis.com/grpc.channelz.v1.SocketOptionTimeout]:{duration:{}}
SO_SNDTIMEO           [type.googleapis.com/grpc.channelz.v1.SocketOptionTimeout]:{duration:{}}
TCP_INFO              [type.googleapis.com/grpc.channelz.v1.SocketOptionTcpInfo]:{tcpi_state:1 tcpi_options:7 tcpi_rto:204000 tcpi_ato:40000 tcpi_snd_mss:1408 tcpi_rcv_mss:1408 tcpi_last_data_sent:8212 tcpi_last_data_recv:8212 tcpi_last_ack_recv:8212 tcpi_pmtu:1460 tcpi_rcv_ssthresh:88288 tcpi_rtt:2400 tcpi_rttvar:3012 tcpi_snd_ssthresh:2147483647 tcpi_snd_cwnd:10 tcpi_advmss:1408 tcpi_reordering:3}
---
Security Model:   TLS
Standard Name:    TLS_AES_128_GCM_SHA256
On the server side, you can use Channelz to inspect your server application's status. For example, you can get the list of servers by using the grpcdebug channelz servers command:
$ grpcdebug localhost:28881 channelz servers
Server ID   Listen Addresses    Calls(Started/Succeeded/Failed)   Last Call Started
5           [127.0.0.1:28881]   9/8/0                             now
6           [[::]:50051]        159/159/0                         4 seconds ago
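The Calls(Started/Succeeded/Failed) column can be read as a rough indicator of calls that have started but not yet finished. The following Python sketch extracts the three counters from a table row and computes that difference. It assumes that the row contains exactly one Started/Succeeded/Failed triple; the helper name is hypothetical.

```python
import re

# Matches a Started/Succeeded/Failed triple such as "159/159/0".
CALL_COUNTERS = re.compile(r"\b(\d+)/(\d+)/(\d+)\b")


def in_flight_calls(row: str) -> int:
    """Return started - succeeded - failed for one row of the servers table."""
    match = CALL_COUNTERS.search(row)
    if match is None:
        raise ValueError("no call counters found in row")
    started, succeeded, failed = (int(group) for group in match.groups())
    return started - succeeded - failed


print(in_flight_calls("5 [127.0.0.1:28881] 9/8/0 now"))
print(in_flight_calls("6 [[::]:50051] 159/159/0 4 seconds ago"))
```

A value that stays high across repeated queries might point to slow or stuck RPCs. A small nonzero value on the admin server is likely to include the Channelz query itself.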
To obtain more information about a specific server, use the grpcdebug channelz server command. You can inspect server sockets the same way that you inspect client sockets.
$ grpcdebug localhost:28881 channelz server 6
Server Id:           6
Listen Addresses:    [[::]:50051]
Calls Started:       174
Calls Succeeded:     174
Calls Failed:        0
Last Call Started:   now
---
Socket ID   Local->Remote                             Streams(Started/Succeeded/Failed)   Messages(Sent/Received)
25          10.240.0.38:50051->130.211.1.39:44904     68/68/0                             68/68
26          10.240.0.38:50051->130.211.0.167:32768    54/54/0                             54/54
27          10.240.0.38:50051->130.211.0.22:32768     52/52/0                             52/52
Use the Client Status Discovery Service
The Client Status Discovery Service (CSDS) API is part of the xDS APIs. In a gRPC application, the CSDS service provides access to the configuration (also called the xDS configuration) that it receives from Cloud Service Mesh. This lets you identify and resolve configuration-related issues in your mesh.
The following examples assume that you deployed the gRPC Wallet example by using the instructions in Configure advanced traffic management with proxyless gRPC services.
To use CSDS to examine the configuration:
- Use SSH to connect to a VM that is running the wallet-service. Use the instructions in Use SSH to connect to a VM.
- Run the grpcdebug client.
To get an overview of configuration status, run the following command:
grpcdebug localhost:28881 xds status
You see results similar to the following:
Name                                                                    Status   Version               Type                                                                 LastUpdated
account.grpcwallet.io:10080                                             ACKED    1618529574783547920   type.googleapis.com/envoy.config.listener.v3.Listener                3 seconds ago
stats.grpcwallet.io:10080                                               ACKED    1618529574783547920   type.googleapis.com/envoy.config.listener.v3.Listener                3 seconds ago
URL_MAP/830293263384_grpcwallet-url-map_0_account.grpcwallet.io:10080   ACKED    1618529574783547920   type.googleapis.com/envoy.config.route.v3.RouteConfiguration         3 seconds ago
URL_MAP/830293263384_grpcwallet-url-map_1_stats.grpcwallet.io:10080     ACKED    1618529574783547920   type.googleapis.com/envoy.config.route.v3.RouteConfiguration         3 seconds ago
cloud-internal-istio:cloud_mp_830293263384_3566964729007423588          ACKED    1618529574783547920   type.googleapis.com/envoy.config.cluster.v3.Cluster                  3 seconds ago
cloud-internal-istio:cloud_mp_830293263384_7383783194368524341          ACKED    1618529574783547920   type.googleapis.com/envoy.config.cluster.v3.Cluster                  3 seconds ago
cloud-internal-istio:cloud_mp_830293263384_3363366193797120473          ACKED    1618529574783547920   type.googleapis.com/envoy.config.cluster.v3.Cluster                  3 seconds ago
cloud-internal-istio:cloud_mp_830293263384_3566964729007423588          ACKED    86                    type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment   2 seconds ago
cloud-internal-istio:cloud_mp_830293263384_3363366193797120473          ACKED    86                    type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment   2 seconds ago
cloud-internal-istio:cloud_mp_830293263384_7383783194368524341          ACKED    86                    type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment   2 seconds ago
You can find the definition of configuration status in documentation for the Envoy proxy. Briefly, the status of an xDS resource is one of REQUESTED, DOES_NOT_EXIST, ACKED, or NACKED.
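Because any status other than ACKED usually deserves a closer look, you might want to group the resources in the status output by their non-ACKED status. The following Python sketch does this by reading only the first two whitespace-separated tokens of each row (name and status), which assumes that resource names contain no spaces and that the output has one header line. The helper name is hypothetical, and the NACKED row in the sample is invented for illustration.

```python
def unacked_resources(status_table: str) -> dict:
    """Group resource names by status, skipping resources that are ACKED."""
    grouped = {}
    for line in status_table.strip().splitlines()[1:]:  # skip the header line
        tokens = line.split()
        if len(tokens) < 2:
            continue  # ignore blank or truncated lines
        name, status = tokens[0], tokens[1]
        if status != "ACKED":
            grouped.setdefault(status, []).append(name)
    return grouped


# The last row is a hypothetical rejected resource added for illustration.
sample_status = """\
Name                          Status   Version               Type                                                    LastUpdated
account.grpcwallet.io:10080   ACKED    1618529574783547920   type.googleapis.com/envoy.config.listener.v3.Listener   3 seconds ago
stats.grpcwallet.io:10080     ACKED    1618529574783547920   type.googleapis.com/envoy.config.listener.v3.Listener   3 seconds ago
example-listener:10080        NACKED   1618529574783547920   type.googleapis.com/envoy.config.listener.v3.Listener   5 seconds ago
"""
print(unacked_resources(sample_status))
```

Reading only the first two tokens keeps the helper usable for rows whose later columns are empty, which could be the case for a resource that has been requested but not yet received.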
To obtain a raw xDS configuration dump, run the following command:
grpcdebug localhost:28881 xds config
You see a JSON list of the PerXdsConfig object:
{
  "config": [
    {
      "node": {
        "id": "projects/830293263384/networks/default/nodes/6e98b038-6d75-4a4c-8d35-b0c7a8c9cdde",
        "cluster": "cluster",
        "metadata": {
          "INSTANCE_IP": "10.240.0.38",
          "TRAFFICDIRECTOR_GCP_PROJECT_NUMBER": "830293263384",
          "TRAFFICDIRECTOR_NETWORK_NAME": "default"
        },
        "locality": {
          "zone": "us-central1-a"
        },
        "userAgentName": "gRPC Go",
        "userAgentVersion": "1.37.0",
        "clientFeatures": [
          "envoy.lb.does_not_support_overprovisioning"
        ]
      },
      "xdsConfig": [
        {
          "listenerConfig": {
            "versionInfo": "1618529930989701137",
            "dynamicListeners": [
              {
...
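If you save the full dump to a file, a short script can summarize which configuration sections are present and which version each one reports. The following Python sketch assumes the top-level shape shown in the preceding output (a config list whose entries contain an xdsConfig list of single-key objects). The helper name is hypothetical, and the sample document is a heavily abbreviated stand-in for a real dump.

```python
import json


def config_versions(dump: str) -> list:
    """Return (section, versionInfo) pairs found in a full xDS config dump."""
    document = json.loads(dump)
    versions = []
    for client in document.get("config", []):
        for entry in client.get("xdsConfig", []):
            for section, body in entry.items():
                # Some sections might not carry a top-level versionInfo.
                versions.append((section, body.get("versionInfo", "")))
    return versions


# Abbreviated, hypothetical dump that mirrors the structure shown above.
sample_dump = json.dumps({
    "config": [{
        "node": {"userAgentName": "gRPC Go"},
        "xdsConfig": [
            {"listenerConfig": {"versionInfo": "1618529930989701137", "dynamicListeners": []}},
            {"endpointConfig": {"dynamicEndpointConfigs": []}},
        ],
    }]
})
print(config_versions(sample_dump))
```

Comparing these versions across clients is one way to spot an application that has not yet received a recent configuration update.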
If the raw configuration output is too verbose, grpcdebug lets you filter based on specific xDS types. For example:
$ grpcdebug localhost:28881 xds config --type=cds
{
  "versionInfo": "1618530076226619310",
  "dynamicActiveClusters": [
    {
      "versionInfo": "1618530076226619310",
      "cluster": {
        "@type": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
        "name": "cloud-internal-istio:cloud_mp_830293263384_7383783194368524341",
        "altStatName": "/projects/830293263384/global/backendServices/grpcwallet-stats-service",
        "type": "EDS",
        "edsClusterConfig": {
          "edsConfig": {
            "ads": {},
            "initialFetchTimeout": "15s",
...
You can also dump the configuration of several xDS types at the same time:
$ grpcdebug localhost:28881 xds config --type=lds,eds
{
  "versionInfo": "1618530076226619310",
  "dynamicListeners": [...]
}
{
  "dynamicEndpointConfigs": [...]
}
What's next
- To find related information, see Observability with Envoy.
- To resolve configuration issues when you deploy proxyless gRPC services, see Troubleshooting deployments that use proxyless gRPC.