Deploy distributed tracing to observe microservice latency

Last reviewed 2023-08-11 UTC

This document shows how to deploy the reference architecture that's described in Use distributed tracing to observe microservice latency. The deployment that's illustrated in this document captures trace information on microservice applications using OpenTelemetry and Cloud Trace.

The sample application in this deployment is composed of two microservices that are written in Go.

This document assumes you're familiar with the following:

Objectives

  • Create a GKE cluster and deploy a sample application.
  • Review OpenTelemetry instrumentation code.
  • Review traces and logs generated by the instrumentation.

Architecture

The following diagram shows the architecture that you deploy.

Architecture of deployment with two GKE clusters.

You use Cloud Build—a fully managed continuous integration, delivery, and deployment platform—to build container images from the sample code and store them in Artifact Registry. The GKE clusters pull the images from Artifact Registry at deployment time.

The frontend service accepts HTTP requests on the / URL and calls the backend service. The address of the backend service is defined by an environment variable.

The backend service accepts HTTP requests on the / URL and makes an outbound call to an external URL that's defined in an environment variable. After the external call is completed, the backend service returns the HTTP status code (for example, 200) to the caller.
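
Both services follow the same pattern: read a destination from an environment variable, call it over HTTP, and return the resulting status code. The following listing is a minimal, uninstrumented sketch of that pattern (it isn't the sample code itself; the instrumented main.go from the sample is reviewed later in this document):

package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	// DESTINATION_URL points the frontend at the backend service, and points
	// the backend at the external URL.
	destination := os.Getenv("DESTINATION_URL")

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		resp, err := http.Get(destination)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		// Return the status code of the outbound call to the caller.
		fmt.Fprint(w, resp.StatusCode)
	})

	// PORT is the port that the HTTP server listens on.
	log.Fatal(http.ListenAndServe(fmt.Sprintf(":%v", os.Getenv("PORT")), nil))
}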

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Enable the GKE, Cloud Trace, Cloud Build, Cloud Storage, and Artifact Registry APIs.

    Enable the APIs

Set up your environment

In this section, you set up your environment with the tools that you use throughout the deployment. You run all the terminal commands in this deployment from Cloud Shell.

  1. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  2. Set an environment variable to the ID of your Google Cloud project:
    export PROJECT_ID=$(gcloud config list --format 'value(core.project)' 2>/dev/null)
    
  3. Download the required files for this deployment by cloning the associated Git repository:
    git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples.git
    cd kubernetes-engine-samples/observability/distributed-tracing
    WORKDIR=$(pwd)
    

    You make the repository folder the working directory ($WORKDIR) from which you do all of the tasks that are related to this deployment. That way, if you don't want to keep the resources, you can delete the folder when you finish the deployment.

Install tools

  1. In Cloud Shell, install kubectx and kubens:

    git clone https://github.com/ahmetb/kubectx $WORKDIR/kubectx
    export PATH=$PATH:$WORKDIR/kubectx
    

    You use these tools to work with multiple Kubernetes clusters, contexts, and namespaces.

  2. In Cloud Shell, install Apache Bench, an open source load-generation tool:

    sudo apt-get install apache2-utils
    

Create a Docker repository

Create a Docker repository to store the sample image for this deployment.

Console

  1. In the Google Cloud console, open the Repositories page.

    Open the Repositories page

  2. Click Create Repository.

  3. Specify distributed-tracing-docker-repo as the repository name.

  4. Choose Docker as the format and Standard as the mode.

  5. Under Location Type, select Region and then choose the location us-west1.

  6. Click Create.

The repository is added to the repository list.

gcloud

  1. In Cloud Shell, create a new Docker repository named distributed-tracing-docker-repo in the us-west1 location, and give it a description:

    gcloud artifacts repositories create distributed-tracing-docker-repo --repository-format=docker \
    --location=us-west1 --description="Docker repository for distributed tracing deployment"
    
  2. Verify that the repository was created:

    gcloud artifacts repositories list
    

Create GKE clusters

In this section, you create two GKE clusters where you deploy the sample application. GKE clusters are created with write-only access to the Cloud Trace API by default, so you don't need to define access when you create the clusters.

  1. In Cloud Shell, create the clusters:

    gcloud container clusters create backend-cluster \
        --zone=us-west1-a \
        --verbosity=none --async
    
    gcloud container clusters create frontend-cluster \
        --zone=us-west1-a \
        --verbosity=none
    

    In this example, the clusters are in the us-west1-a zone. For more information, see Geography and regions.

  2. Get the cluster credentials and store them locally:

    gcloud container clusters get-credentials backend-cluster --zone=us-west1-a
    gcloud container clusters get-credentials frontend-cluster --zone=us-west1-a
    
  3. Rename the contexts of the clusters to make it easier to access them later in the deployment:

    kubectx backend=gke_${PROJECT_ID}_us-west1-a_backend-cluster
    kubectx frontend=gke_${PROJECT_ID}_us-west1-a_frontend-cluster
    

Review OpenTelemetry instrumentation

In the following sections, you review the code from the main.go file in the sample application. This helps you learn how to use context propagation to allow spans from multiple requests to be appended to a single parent trace.

Review the imports in the application code

import (
	"context"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"strconv"

	cloudtrace "github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace"
	"github.com/gorilla/mux"
	"go.opentelemetry.io/contrib/detectors/gcp"
	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/contrib/propagators/autoprop"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/sdk/resource"
	"go.opentelemetry.io/otel/sdk/trace"
)

Note the following about the imports:

  • The go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp package contains the otelhttp plugin, which can instrument an HTTP server or HTTP client. Server instrumentation retrieves the span context from the HTTP request and records a span for the server's handling of the request. Client instrumentation injects the span context into the outgoing HTTP request and records a span for the time that's spent waiting for a response.
  • The go.opentelemetry.io/contrib/propagators/autoprop package provides an implementation of the OpenTelemetry TextMapPropagator interface, which is used by otelhttp to handle propagation. Propagators determine the format and keys that are used to store the trace context in transports like HTTP. Specifically, otelhttp passes HTTP headers to the propagator. The propagator extracts a span context from the headers into a Go context, or it encodes and injects the span context in the Go context into headers (depending on whether it's acting as the client or the server). By default, the autoprop package injects and extracts the span context using the W3C trace context propagation format. A sketch that follows this list illustrates this round trip.
  • The github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace import exports traces to Cloud Trace.
  • The github.com/gorilla/mux import is the library that the sample application uses for request handling.
  • The go.opentelemetry.io/contrib/detectors/gcp import adds attributes to spans, such as cloud.availability_zone, which identify where your application is running inside Google Cloud.
  • The go.opentelemetry.io/otel, go.opentelemetry.io/otel/sdk/trace, and go.opentelemetry.io/otel/sdk/resource imports are used to set up OpenTelemetry.
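
To make the propagation behavior concrete, the following listing is a small, self-contained sketch (it isn't part of the sample application) that extracts a span context from incoming HTTP headers and injects it into outgoing headers, which is the round trip that otelhttp performs on the server and client sides. The traceparent value is an example value in the W3C format, used here only for illustration:

package main

import (
	"context"
	"fmt"
	"net/http"

	"go.opentelemetry.io/contrib/propagators/autoprop"
	"go.opentelemetry.io/otel/propagation"
)

func main() {
	prop := autoprop.NewTextMapPropagator()

	// Server side: extract the span context from the incoming request headers
	// into a Go context, as the otelhttp server handler does.
	in := http.Header{}
	in.Set("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	ctx := prop.Extract(context.Background(), propagation.HeaderCarrier(in))

	// Client side: inject the span context from the Go context into the
	// headers of the outgoing request, as the otelhttp client does.
	out := http.Header{}
	prop.Inject(ctx, propagation.HeaderCarrier(out))
	fmt.Println(out.Get("traceparent"))
}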

Review the main function

The main function sets up trace export to Cloud Trace and uses a mux router to handle requests that are made to the / URL.

func main() {
	ctx := context.Background()
	// Set up the Cloud Trace exporter.
	exporter, err := cloudtrace.New()
	if err != nil {
		log.Fatalf("cloudtrace.New: %v", err)
	}
	// Identify your application using resource detection.
	res, err := resource.New(ctx,
		// Use the GCP resource detector to detect information about the GKE Cluster.
		resource.WithDetectors(gcp.NewDetector()),
		resource.WithTelemetrySDK(),
	)
	if err != nil {
		log.Fatalf("resource.New: %v", err)
	}
	tp := trace.NewTracerProvider(
		trace.WithBatcher(exporter),
		trace.WithResource(res),
	)
	// Set the global TracerProvider which is used by otelhttp to record spans.
	otel.SetTracerProvider(tp)
	// Flush any pending spans on shutdown.
	defer tp.ForceFlush(ctx)

	// Set the global Propagators which is used by otelhttp to propagate
	// context using the w3c traceparent and baggage formats.
	otel.SetTextMapPropagator(autoprop.NewTextMapPropagator())

	// Handle incoming request.
	r := mux.NewRouter()
	r.HandleFunc("/", mainHandler)
	var handler http.Handler = r

	// Use otelhttp to create spans and extract context for incoming http
	// requests.
	handler = otelhttp.NewHandler(handler, "server")
	log.Fatal(http.ListenAndServe(fmt.Sprintf(":%v", os.Getenv("PORT")), handler))
}

Note the following about this code:

  • You configure an OpenTelemetry TracerProvider, which detects attributes when it runs on Google Cloud, and which exports traces to Cloud Trace.
  • You use the otel.SetTracerProvider and otel.SetTextMapPropagator functions to set the global TracerProvider and Propagator settings. By default, instrumentation libraries such as otelhttp use the globally registered TracerProvider to create spans and the Propagator to propagate context. A sketch after this list shows how other code can use the global TracerProvider to create its own spans.
  • You wrap the HTTP server with otelhttp.NewHandler to instrument the HTTP server.
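
Because the TracerProvider and Propagator are registered globally, other code in the same process can create its own spans that join the same traces. The following listing is a sketch of that pattern (it isn't part of the sample application, and the tracer and span names are illustrative):

package main

import (
	"context"

	"go.opentelemetry.io/otel"
)

// doWork creates a child span under whatever span is already in ctx, using the
// global TracerProvider that main registers with otel.SetTracerProvider.
func doWork(ctx context.Context) {
	ctx, span := otel.Tracer("example.com/worker").Start(ctx, "doWork")
	defer span.End()

	_ = ctx // pass ctx to downstream calls so that their spans nest under "doWork"
}

func main() {
	// Until a TracerProvider is registered, these calls record no-op spans.
	doWork(context.Background())
}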

Review the mainHandler function

func mainHandler(w http.ResponseWriter, r *http.Request) {
	// Use otelhttp to record a span for the outgoing call, and propagate
	// context to the destination.
	destination := os.Getenv("DESTINATION_URL")
	resp, err := otelhttp.Get(r.Context(), destination)
	if err != nil {
		log.Fatal("could not fetch remote endpoint")
	}
	defer resp.Body.Close()
	_, err = ioutil.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("could not read response from %v", destination)
	}

	fmt.Fprint(w, strconv.Itoa(resp.StatusCode))
}

To capture the latency of outbound requests that are made to the destination, you use the otelhttp plugin to make an HTTP request. You also use the r.Context function to link the incoming request with the outgoing request, as shown in the following listing:

// Use otelhttp to record a span for the outgoing call, and propagate
// context to the destination.
resp, err := otelhttp.Get(r.Context(), destination)
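
If the service made many outbound calls, an alternative to calling otelhttp.Get for each one is to wrap a reusable http.Client with otelhttp.NewTransport, which records a client span and propagates context for every request that the client sends. The following listing is a sketch of that variant (it's not how the sample is written), and it uses only packages that the sample already imports:

// tracedClient records a client span and injects the trace context for every
// request that it sends, which is what otelhttp.Get does for a single call.
var tracedClient = &http.Client{Transport: otelhttp.NewTransport(http.DefaultTransport)}

func mainHandler(w http.ResponseWriter, r *http.Request) {
	destination := os.Getenv("DESTINATION_URL")

	// Build the request with r.Context() so that the client span becomes a
	// child of the server span that otelhttp.NewHandler records.
	req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, destination, nil)
	if err != nil {
		log.Fatalf("could not create request: %v", err)
	}
	resp, err := tracedClient.Do(req)
	if err != nil {
		log.Fatal("could not fetch remote endpoint")
	}
	defer resp.Body.Close()
	if _, err := ioutil.ReadAll(resp.Body); err != nil {
		log.Fatalf("could not read response from %v", destination)
	}

	fmt.Fprint(w, strconv.Itoa(resp.StatusCode))
}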

Deploy the application

In this section, you use Cloud Build to build a container image for the backend and frontend services, and then deploy the services to their GKE clusters.

Build the Docker container

  1. In Cloud Shell, submit the build from the working directory:

    cd $WORKDIR
    gcloud builds submit . --tag us-west1-docker.pkg.dev/$PROJECT_ID/distributed-tracing-docker-repo/backend:latest
    
  2. Confirm that the container image was successfully created and is available in Artifact Registry:

    gcloud artifacts docker images list us-west1-docker.pkg.dev/$PROJECT_ID/distributed-tracing-docker-repo
    

    The container image was successfully created if the output is similar to the following, where PROJECT_ID is the ID of your Google Cloud project:

    NAME
    us-west1-docker.pkg.dev/PROJECT_ID/distributed-tracing-docker-repo/backend
    

Deploy the backend service

  1. In Cloud Shell, set the kubectx context to the backend cluster:

    kubectx backend
    
  2. Create the YAML file for the backend deployment:

    export PROJECT_ID=$(gcloud info --format='value(config.project)')
    envsubst < backend-deployment.yaml | kubectl apply -f -
    
  3. Confirm that the pods are running:

    kubectl get pods
    

    The output displays a Status value of Running:

    NAME                       READY   STATUS    RESTARTS   AGE
    backend-645859d95b-7mx95   1/1     Running   0          52s
    backend-645859d95b-qfdnc   1/1     Running   0          52s
    backend-645859d95b-zsj5m   1/1     Running   0          52s
    
  4. Expose the backend deployment using a load balancer:

    kubectl expose deployment backend --type=LoadBalancer
    
  5. Get the IP address of the backend service:

    kubectl get services backend
    

    The output is similar to the following:

    NAME      TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)          AGE
    backend   LoadBalancer   10.11.247.58   34.83.88.143   8080:30714/TCP   70s
    

    If the value of the EXTERNAL-IP field is <pending>, repeat the command until the value is an IP address.

  6. Capture the IP address from the previous step in a variable:

    export BACKEND_IP=$(kubectl get svc backend -ojson | jq -r '.status.loadBalancer.ingress[].ip')
    

Deploy the frontend service

  1. In Cloud Shell, set the kubectx context to the frontend cluster:

    kubectx frontend
    
  2. Create the YAML file for the frontend deployment:

    export PROJECT_ID=$(gcloud info --format='value(config.project)')
    envsubst < frontend-deployment.yaml | kubectl apply -f -
    
  3. Confirm that the pods are running:

    kubectl get pods
    

    The output displays a Status value of Running:

    NAME                        READY   STATUS    RESTARTS   AGE
    frontend-747b445499-v7x2w   1/1     Running   0          57s
    frontend-747b445499-vwtmg   1/1     Running   0          57s
    frontend-747b445499-w47pf   1/1     Running   0          57s
    
  4. Expose the frontend deployment using a load balancer:

    kubectl expose deployment frontend --type=LoadBalancer
    
  5. Get the IP address of the frontend service:

    kubectl get services frontend
    

    The output is similar to the following:

    NAME       TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)         AGE
    frontend   LoadBalancer   10.27.241.93   34.83.111.232   8081:31382/TCP  70s
    

    If the value of the EXTERNAL-IP field is <pending>, repeat the command until the value is an IP address.

  6. Capture the IP address from the previous step in a variable:

    export FRONTEND_IP=$(kubectl get svc frontend -ojson | jq -r '.status.loadBalancer.ingress[].ip')
    

Load the application and review traces

In this section, you use the Apache Bench utility to create requests for your application. You then review the resulting traces in Cloud Trace.

  1. In Cloud Shell, use Apache Bench to generate 1,000 requests using 3 concurrent threads:

    ab -c 3 -n 1000 http://${FRONTEND_IP}:8081/
    
  2. In the Google Cloud console, go to the Trace List page.

    Go to Trace List

  3. To review the timeline, click one of the URIs that's labeled as server.

    Scatter plot graph of traces.

    This trace contains four spans that have the following names:

    • The first server span captures the end-to-end latency of handling the HTTP request in the frontend server.
    • The first HTTP GET span captures the latency of the GET call that's made by the frontend's client to the backend.
    • The second server span captures the end-to-end latency of handling the HTTP request in the backend server.
    • The second HTTP GET span captures the latency of the GET call that's made by the backend's client to google.com.

    Bar graph of spans.

Clean up

The easiest way to eliminate billing is to delete the Google Cloud project that you created for the deployment. Alternatively, you can delete the individual resources.

Delete the project

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete individual resources

To delete individual resources instead of deleting the whole project, run the following commands in Cloud Shell:

gcloud container clusters delete frontend-cluster --zone=us-west1-a
gcloud container clusters delete backend-cluster --zone=us-west1-a
gcloud artifacts repositories delete distributed-tracing-docker-repo --location us-west1

What's next