Using distributed tracing to observe microservice latency with OpenCensus and Cloud Trace

This tutorial shows you how to capture trace information on microservice apps using OpenCensus and Cloud Trace. Distributed tracing helps you understand how requests traverse a complex microservice architecture and where latency is incurred at each step, which is a key capability of observability. As apps transition to microservices, the number of components and endpoints involved in a single user transaction increases, so observability becomes critical to operating user services reliably.

For services running on Kubernetes, using a service mesh like Istio enables distributed tracing of service-to-service traffic without the need for dedicated instrumentation. However, you might want to have more control over the traces, you might need to capture app internals in the trace information, or you might need to trace code that's not running on Kubernetes. OpenCensus is an open source library that enables instrumentation of distributed microservice apps to collect traces and metrics across a wide variety of languages, platforms, and environments.

This tutorial is intended for developers, SREs, and DevOps engineers who are looking to understand the fundamentals of distributed tracing and apply them to their services to improve service observability.

This tutorial assumes you're familiar with the following:

The State of DevOps reports identified capabilities that drive software delivery performance. This tutorial will help you with the following capabilities:

Objectives

  • Create a GKE cluster and deploy a sample app.
  • Review OpenCensus instrumentation code.
  • Review traces and logs generated by the instrumentation.

Reference architecture

The following diagram shows the architecture you deploy in this tutorial.

Architecture of tutorial with two GKE clusters.

As illustrated in the preceding diagram, you create two GKE clusters and deploy an app to each of the clusters. User traffic is sent to the frontend app on the frontend cluster. The frontend pod on the frontend cluster communicates to the backend pod on the backend cluster. The backend pod calls an external API endpoint.

You use Cloud Build—a fully managed continuous integration, delivery, and deployment platform—to build container images from the sample code and store them in Container Registry. The GKE clusters pull the images from Container Registry at deployment time.

Sample app

The sample app in this tutorial is composed of two microservices written in Go.

The frontend service accepts HTTP requests on the / URL and calls the backend service. The address of the backend service is defined by the BACKEND environment variable. The value of that variable is set in a ConfigMap object that you create.

The backend service accepts HTTP requests on the / URL and makes an outbound call to an external URL as defined in the DESTINATION_URL environment variable. The value of that variable is set through a ConfigMap object. After the external call completes, the backend service returns the HTTP status code (for example, 200) to the caller.
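
A minimal sketch of how a service might read these ConfigMap-injected values at startup follows. The mustGetenv helper name and the fatal fallback are illustrative assumptions, not code copied from the sample repository.

    package main

    import (
    	"log"
    	"os"
    )

    // mustGetenv reads a value that a ConfigMap injects as an environment
    // variable and stops the process if the value is missing. The helper name
    // and the fatal fallback are illustrative, not code from the sample repo.
    func mustGetenv(name string) string {
    	v := os.Getenv(name)
    	if v == "" {
    		log.Fatalf("environment variable %s must be set", name)
    	}
    	return v
    }

    func main() {
    	// The frontend service reads BACKEND and the backend service reads
    	// DESTINATION_URL; reading both here is only for illustration.
    	backendAddr := mustGetenv("BACKEND")
    	destURL := mustGetenv("DESTINATION_URL")
    	log.Printf("backend address: %s, destination URL: %s", backendAddr, destURL)
    }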

Understanding traces, spans, and context

The concept of distributed tracing is best described in the Dapper research paper published by Google. In the paper, the following diagram shows five spans in a trace.

Distributed tracing with five spans in a trace.

The diagram shows a single frontend request that makes two backend requests. The second backend call requires two helper calls to complete. Each call is labeled with its span ID and the ID of the parent span.

A trace is the complete set of information that describes how a distributed system responds to a user request. Traces are composed of spans, where each span represents a specific request and response pair involved in serving the user request. The root (parent) span describes the latency as observed by the end user. Each of the child spans describes how a particular service in the distributed system was called and responded to, with latency information captured for each.

A challenge with tracing in distributed systems is that information about the original frontend request isn't automatically or inherently carried forward when subsequent requests are made to various backend services. In Go, outgoing requests can carry a context, and OpenCensus extends this concept with span context: information about the parent span that is propagated in HTTP headers with each subsequent request. Each service can then append child spans to that parent to compose the overall trace, so you can see how the user request traversed the system and was eventually served back to the user.

Context isn't specific to Go. For more information about the languages where OpenCensus supports the SpanContext feature, see the OpenCensus feature matrix.
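
To make span context propagation concrete, the following minimal sketch (not part of the sample app) starts a parent and a child span, serializes the child's span context into an outgoing request header, and then reads it back the way a receiving service would. The http://example.com destination is a placeholder.

    package main

    import (
    	"context"
    	"fmt"
    	"net/http"

    	"go.opencensus.io/plugin/ochttp/propagation/tracecontext"
    	"go.opencensus.io/trace"
    )

    func main() {
    	// Start a parent span and a child span; the child records the trace ID
    	// and its own span ID in its SpanContext.
    	ctx, parent := trace.StartSpan(context.Background(), "incoming call")
    	_, child := trace.StartSpan(ctx, "call to backend")

    	// Serialize the child's span context into the outgoing request headers
    	// using the same trace-context format that the ochttp plugin uses in
    	// this tutorial. example.com is a placeholder destination.
    	req, _ := http.NewRequest("GET", "http://example.com", nil)
    	format := &tracecontext.HTTPFormat{}
    	format.SpanContextToRequest(child.SpanContext(), req)
    	fmt.Println("traceparent header:", req.Header.Get("traceparent"))

    	// A receiving service reverses the process: it extracts the span
    	// context from the request and starts its own span with that remote
    	// parent, which links its work into the same trace.
    	if sc, ok := format.SpanContextFromRequest(req); ok {
    		_, remote := trace.StartSpanWithRemoteParent(context.Background(),
    			"work in the next service", sc)
    		remote.End()
    	}

    	child.End()
    	parent.End()
    }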

Costs

This tutorial uses the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Cleaning up.

Before you begin

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. In the Cloud Console, on the project selector page, select or create a Cloud project.

    Go to the project selector page

  3. Make sure that billing is enabled for your Google Cloud project. Learn how to confirm billing is enabled for your project.

  4. Enable the GKE, Cloud Trace, Cloud Build, Cloud Storage, and Container Registry APIs.

    Enable the APIs

Setting up your environment

In this section, you set up your environment with the tools you use throughout this tutorial. You run all the terminal commands in this tutorial from Cloud Shell.

  1. In the Cloud Console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Cloud Console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Cloud SDK already installed, including the gcloud command-line tool, and with values already set for your current project. It can take a few seconds for the session to initialize.

  2. Set environment variables:
    export PROJECT_ID=$(gcloud config list --format 'value(core.project)' 2>/dev/null)
    
  3. Download the required files for this tutorial by cloning the following Git repository:
    git clone https://github.com/GoogleCloudPlatform/gke-observability-tutorials.git
    cd $HOME/gke-observability-tutorials/gke-opencensus-stackdriver/go/http
    WORKDIR=$(pwd)
    

    You make the repository folder your $WORKDIR, from which you do all of the tasks related to this tutorial. That way, you can delete the folder when you finish the tutorial.

Install tools

  1. In Cloud Shell, install kubectx and kubens.

    git clone https://github.com/ahmetb/kubectx $WORKDIR/kubectx
    export PATH=$PATH:$WORKDIR/kubectx
    

    You use these tools to work with multiple Kubernetes clusters, contexts, and namespaces.

  2. In Cloud Shell, install Apache Bench, an open source load-generation tool:

    sudo apt-get install apache2-utils
    

Creating GKE clusters

In this section, you create two GKE clusters where you deploy the sample app. GKE clusters are created with write-only access to the Cloud Trace API by default, so you don't need to define access when you create the clusters.

  1. In Cloud Shell, create the clusters:

    gcloud container clusters create backend-cluster \
        --zone=us-west1-a --enable-stackdriver-kubernetes \
        --verbosity=none --async
    gcloud container clusters create frontend-cluster \
        --zone=us-west1-a --enable-stackdriver-kubernetes \
        --verbosity=none
    

    In this tutorial, the clusters are in the us-west1-a zone. For more information, see Geography and regions.

  2. Get your cluster credentials and store them locally:

    gcloud container clusters get-credentials backend-cluster --zone=us-west1-a
    gcloud container clusters get-credentials frontend-cluster --zone=us-west1-a
    
  3. Rename the contexts of your clusters to make it easier to access them later in the tutorial:

    kubectx backend=gke_${PROJECT_ID}_us-west1-a_backend-cluster
    kubectx frontend=gke_${PROJECT_ID}_us-west1-a_frontend-cluster
    

Review OpenCensus instrumentation

In the following sections, you review the code of the sample app to learn how to use context propagation to allow spans from multiple requests to be appended to a single parent trace.

Review the frontend code

  • The frontend service code has the following imports. Three of them are worth noting:

    import (
    	"context"
    	"fmt"
    	"log"
    	"net/http"
    	"os"
    	"time"
    
    	"contrib.go.opencensus.io/exporter/stackdriver"
    	"go.opencensus.io/trace"
    
    	"go.opencensus.io/plugin/ochttp"
    	"go.opencensus.io/plugin/ochttp/propagation/tracecontext"
    
    	"github.com/gorilla/mux"
    )
    
    • The go.opencensus.io/plugin/ochttp and go.opencensus.io/plugin/ochttp/propagation/tracecontext imports contain the ochttp plugin, which propagates context across requests. The context carries information about the trace to which subsequent spans are appended.
    • The contrib.go.opencensus.io/exporter/stackdriver import exports traces to Trace. For the list of backends that OpenCensus supports, see Exporters.
    • The github.com/gorilla/mux import is the library that the sample app uses for request handling.
  • The main() function sets up exporting traces to Trace and uses a mux router to handle requests made to the / URL:

    func main() {
    	// set up Stackdriver exporter
    	exporter, err := stackdriver.NewExporter(stackdriver.Options{ProjectID: projectID, Location: location})
    	if err != nil {
    		log.Fatal(err)
    	}
    	trace.RegisterExporter(exporter)
    	trace.ApplyConfig(trace.Config{
    		DefaultSampler: trace.AlwaysSample(),
    	})
    
    	// handle root request
    	r := mux.NewRouter()
    	r.HandleFunc("/", mainHandler)
    	var handler http.Handler = r
    	handler = &ochttp.Handler{
    		Handler:     handler,
    		Propagation: &tracecontext.HTTPFormat{}}
    
    	log.Fatal(http.ListenAndServe(":8081", handler))
    }
    
    • The handler uses the trace-context HTTP propagation format, so the span context travels in the headers of incoming and outgoing requests.
    • In this tutorial, the sampler is set to AlwaysSample, so every request is traced. By default, OpenCensus samples traces at a predetermined rate, which you control through the DefaultSampler setting (see the sampler sketch after this list).
  • Review the mainHandler() function:

    func mainHandler(w http.ResponseWriter, r *http.Request) {
    	// create root span
    	ctx, rootspan := trace.StartSpan(context.Background(), "incoming call")
    	defer rootspan.End()
    
    	// create child span for backend call
    	ctx, childspan := trace.StartSpan(ctx, "call to backend")
    	defer childspan.End()
    
    	// create request for backend call
    	req, err := http.NewRequest("GET", backendAddr, nil)
    	if err != nil {
    		log.Fatalf("%v", err)
    	}
    
    	childCtx, cancel := context.WithTimeout(ctx, 1000*time.Millisecond)
    	defer cancel()
    	req = req.WithContext(childCtx)
    
    	// add span context to backend call and make request
    	format := &tracecontext.HTTPFormat{}
    	format.SpanContextToRequest(childspan.SpanContext(), req)
    	//format.SpanContextToRequest(rootspan.SpanContext(), req)
    	client := http.DefaultClient
    	res, err := client.Do(req)
    	if err != nil {
    		log.Fatalf("%v", err)
    	}
    
    	fmt.Printf("%v\n", res.StatusCode)
    }
    
    • To capture overall request latency, this function creates the root span. You append subsequent spans to the root span.

      // create root span
      ctx, rootspan := trace.StartSpan(context.Background(), "incoming call")
      defer rootspan.End()
      
    • To time the backend call, the function creates a child span.

      // create child span for backend call
      ctx, childspan := trace.StartSpan(ctx, "call to backend")
      defer childspan.End()
      
    • The function creates a request to the backend.

      // create request for backend call
      req, err := http.NewRequest("GET", backendAddr, nil)
      if err != nil {
        log.Fatalf("%v", err)
      }
      childCtx, cancel := context.WithTimeout(ctx, 1000*time.Millisecond)
      defer cancel()
      req = req.WithContext(childCtx)
      
    • The function adds the child span's context to that request so that the backend can attach its own spans to the same trace.

      // add span context to backend call and make request
      format := &tracecontext.HTTPFormat{}
      format.SpanContextToRequest(childspan.SpanContext(), req)
      
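AlwaysSample is convenient for a tutorial, but tracing every request can be expensive for high-traffic services. If you want to trace only a fraction of requests instead, you can swap in a probability sampler, as in the following sketch. The sketch isn't part of the sample app, and the my-project ID, us-west1-a location, and 10% rate are placeholder values.

    package main

    import (
    	"log"

    	"contrib.go.opencensus.io/exporter/stackdriver"
    	"go.opencensus.io/trace"
    )

    func main() {
    	// Export traces to Cloud Trace. Replace my-project and us-west1-a with
    	// your own project ID and location.
    	exporter, err := stackdriver.NewExporter(stackdriver.Options{
    		ProjectID: "my-project",
    		Location:  "us-west1-a",
    	})
    	if err != nil {
    		log.Fatal(err)
    	}
    	trace.RegisterExporter(exporter)

    	// Sample roughly 10% of new traces instead of all of them. The
    	// probability sampler also samples spans whose remote parent is
    	// sampled, so downstream services follow the frontend's decision.
    	trace.ApplyConfig(trace.Config{
    		DefaultSampler: trace.ProbabilitySampler(0.1),
    	})
    }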

Review the backend code

  • The main() function for the backend service is almost identical to the frontend's main() function: it sets up the Trace exporter and uses a mux router to route requests made to the / URL to the mainHandler() function. The only difference is that the backend listens on port 8080 instead of 8081.

    func main() {
    	// set up Stackdriver exporter
    	exporter, err := stackdriver.NewExporter(stackdriver.Options{ProjectID: projectID, Location: location})
    	if err != nil {
    		log.Fatal(err)
    	}
    	trace.RegisterExporter(exporter)
    	trace.ApplyConfig(trace.Config{
    		DefaultSampler: trace.AlwaysSample(),
    	})
    
    	// handle incoming request
    	r := mux.NewRouter()
    	r.HandleFunc("/", mainHandler)
    	var handler http.Handler = r
    	// handler = &logHandler{log: log, next: handler}
    
    	handler = &ochttp.Handler{
    		Handler:     handler,
    		Propagation: &tracecontext.HTTPFormat{}}
    
    	log.Fatal(http.ListenAndServe(":8080", handler))
    }
    
  • The mainHandler() function uses the incoming request to get both the HTTP context provided by Go and the trace context provided by OpenCensus. From there, the function creates a child span and appends it to the trace context. Finally, it calls the callRemoteEndpoint() function to make another call whose latency needs to be captured.

    func mainHandler(w http.ResponseWriter, r *http.Request) {
    	// get context from incoming request
    	ctx := r.Context()
    	// get span context from incoming request
    	HTTPFormat := &tracecontext.HTTPFormat{}
    	if spanContext, ok := HTTPFormat.SpanContextFromRequest(r); ok {
    		_, span := trace.StartSpanWithRemoteParent(ctx, "call remote endpoint", spanContext)
    		defer span.End()
    		returnCode := callRemoteEndpoint()
    		fmt.Fprintf(w, returnCode)
    	}
    }
    
  • The callRemoteEndpoint() function uses the HTTP library to make a remote endpoint call. Because the tracing for this call is handled in the parent function, it doesn't need to consume the trace context or create more child spans. (For an alternative that instruments the HTTP client itself, see the sketch after this code listing.)

    // make an outbound call
    func callRemoteEndpoint() string {
    	resp, err := http.Get(destURL)
    	if err != nil {
    		log.Fatal("could not fetch remote endpoint")
    	}
    	defer resp.Body.Close()
    	body, err := ioutil.ReadAll(resp.Body)
    	if err != nil {
    		log.Fatal("could not read response from Google")
    		log.Fatal(body)
    	}
    
    	return strconv.Itoa(resp.StatusCode)
    }
    
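If you do want the outbound call to appear as its own client span, an alternative (not used by the sample app) is to instrument the HTTP client with the ochttp plugin, which creates the client span and injects the trace context into the request headers automatically. The following sketch illustrates the idea; the callRemoteEndpointTraced name and the https://example.com destination are placeholders.

    package main

    import (
    	"context"
    	"io"
    	"log"
    	"net/http"

    	"go.opencensus.io/plugin/ochttp"
    	"go.opencensus.io/plugin/ochttp/propagation/tracecontext"
    )

    // callRemoteEndpointTraced is an illustrative alternative to
    // callRemoteEndpoint: the ochttp.Transport creates a client span for the
    // outbound request and injects the trace context into its headers, so no
    // manual StartSpan call is needed here.
    func callRemoteEndpointTraced(ctx context.Context, destURL string) (int, error) {
    	client := &http.Client{
    		Transport: &ochttp.Transport{
    			Propagation: &tracecontext.HTTPFormat{},
    		},
    	}

    	req, err := http.NewRequest("GET", destURL, nil)
    	if err != nil {
    		return 0, err
    	}
    	// The request context must carry the current span so that the
    	// transport attaches the client span to the right trace.
    	req = req.WithContext(ctx)

    	resp, err := client.Do(req)
    	if err != nil {
    		return 0, err
    	}
    	defer resp.Body.Close()
    	io.Copy(io.Discard, resp.Body) // drain the body so the connection can be reused
    	return resp.StatusCode, nil
    }

    func main() {
    	code, err := callRemoteEndpointTraced(context.Background(), "https://example.com")
    	if err != nil {
    		log.Fatal(err)
    	}
    	log.Printf("status: %d", code)
    }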

Deploy the app

In this section, you use Cloud Build to build container images for the backend and frontend services, and you deploy them to their GKE clusters.

Build the backend service

  1. In Cloud Shell, change to the backend directory:

    cd $WORKDIR/backend/
    
  2. Submit the build:

    gcloud builds submit . --tag=gcr.io/$PROJECT_ID/backend:latest
    
  3. Confirm that the container image was successfully created and is available in Container Registry:

    gcloud container images list
    

    The container image was successfully created when the output is similar to the following, where project-id is your Cloud project ID:

    NAME
    gcr.io/project-id/backend
    

Deploy the backend service

  1. In Cloud Shell, set your kubectx context to the backend cluster:

    kubectx backend
    
  2. Deploy the configmap file:

    export PROJECT_ID=$(gcloud info --format='value(config.project)')
    envsubst < backend-configmap.yaml | kubectl apply -f -
    
  3. Create the backend deployment file:

    envsubst < backend-deployment.yaml | kubectl apply -f -
    
  4. Confirm that the pods are running:

    kubectl get pods
    

    The output displays a Status of Running:

    NAME                       READY   STATUS    RESTARTS   AGE
    backend-645859d95b-7mx95   1/1     Running   0          52s
    backend-645859d95b-qfdnc   1/1     Running   0          52s
    backend-645859d95b-zsj5m   1/1     Running   0          52s
    
  5. Expose the backend deployment using a load balancer:

    kubectl expose deployment backend --type=LoadBalancer
    
  6. Get the IP address of the backend service:

    kubectl get services backend
    

    The output is similar to the following:

    NAME      TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)          AGE
    backend   LoadBalancer   10.11.247.58   34.83.88.143   8080:30714/TCP   70s
    

    Repeat this command until the EXTERNAL-IP field of your service changes from <pending> to an IP address.

  7. Capture the IP address from the previous step as a variable:

    export BACKEND_IP=$(kubectl get svc backend -ojson | jq -r '.status.loadBalancer.ingress[].ip')
    

Build the frontend service

  1. In Cloud Shell, change to the frontend directory:

    cd $WORKDIR/frontend/
    
  2. Submit the build:

    gcloud builds submit . --tag=gcr.io/$PROJECT_ID/frontend:latest
    
  3. Confirm that the container image was successfully created and is available in Container Registry:

    gcloud container images list
    

     The container image was successfully created when the output is similar to the following, where project-id is your Cloud project ID:

     NAME
     gcr.io/project-id/backend
     gcr.io/project-id/frontend
    

Deploy the frontend service

  1. In Cloud Shell, set your kubectx context to the frontend cluster:

    kubectx frontend
    
  2. Deploy the configmap file:

    export PROJECT_ID=$(gcloud info --format='value(config.project)')
    envsubst < frontend-configmap.yaml | kubectl apply -f -
    
  3. Create the frontend deployment file:

    envsubst < frontend-deployment.yaml | kubectl apply -f -
    
  4. Confirm that the pods are running:

    kubectl get pods
    

    The output displays a Status of Running:

    NAME                        READY   STATUS    RESTARTS   AGE
    frontend-747b445499-v7x2w   1/1     Running   0          57s
    frontend-747b445499-vwtmg   1/1     Running   0          57s
    frontend-747b445499-w47pf   1/1     Running   0          57s
    
  5. Expose the frontend deployment using a load balancer:

    kubectl expose deployment frontend --type=LoadBalancer
    
  6. Get the IP address of the frontend service:

    kubectl get services frontend
    

    The output is similar to the following:

    NAME       TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)        AGE
    frontend   LoadBalancer   10.27.241.93   34.83.111.232   8081:31382/TCP   70s
    

    Repeat this command until the EXTERNAL-IP field of your service changes from <pending> to an IP address.

  7. Capture the IP address from the previous step as a variable:

    export FRONTEND_IP=$(kubectl get svc frontend -ojson | jq -r '.status.loadBalancer.ingress[].ip')
    

Load the app and review traces

In this section, you use the Apache Bench utility to create requests for your app and review the resulting traces in Trace.

  1. In Cloud Shell, use Apache Bench to send 1000 requests with a concurrency level of 3:

    ab -c 3 -n 1000 http://${FRONTEND_IP}:8081/
    
  2. In the Cloud Console, go to the Trace List page.

    Go to Trace List page

  3. To review the timeline, click one of the URIs labeled as incoming call.

    Scatter plot graph of traces.

    This trace contains three spans with the following names:

    • The incoming call span captures the end-to-end latency observed by the client.
    • The call to backend span captures the latency of the backend call observed by the frontend.
    • The call remote endpoint span captures the latency of the external endpoint call observed by the backend.

    Bar graph of spans.

Cleaning up

The easiest way to eliminate billing is to delete the Cloud project you created for the tutorial. Alternatively, you can delete the individual resources.

Delete the project

  1. In the Cloud Console, go to the Manage resources page.

    Go to the Manage resources page

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID and then click Shut down to delete the project.

What's next