Observability with Envoy
This document demonstrates how to generate tracing and logging for the Envoy proxy. It also shows you how to export the information to Cloud Trace and Cloud Logging.
Using a service mesh gives you the ability to observe traffic to and from services, which allows for richer monitoring and debugging without code changes in the service itself. In the sidecar proxy architecture that Cloud Service Mesh uses, the proxy is the component that processes requests and provides the necessary telemetry information. Telemetry information must be collected and stored in a centralized location for further use, such as data analysis, alerting, and troubleshooting.
Demonstration setup
This document uses the following configuration to demonstrate tracing and logging:
- A single application that listens on the HTTP port and returns the hostname of the virtual machine (VM) instance that served the request. In the diagram, this application is in the upper-right corner, labeled HTTP service(s) (10.10.10.10:80). One or more VMs can provide this service.
- A single Compute Engine VM running a consumer of this service. In the diagram, this is labeled Demo Compute Engine VM.
- An Envoy sidecar proxy installed and configured by Cloud Service Mesh. In the diagram, this is labeled envoy.
- A service consumer application, shown in the box on the left, that consumes the HTTP service running on 10.10.10.10:80.
The following steps correspond to the numbered labels in the diagram:
1. Cloud Service Mesh configures the Envoy proxy to do the following:
   - Load balance traffic for the 10.10.10.10:80 service.
   - Store access log information for each request issued for this service.
   - Generate tracing information for the service.
2. After the consumer sends a request to 10.10.10.10, the sidecar proxy routes the request to the correct destination.
3. The sidecar proxy also generates the necessary telemetry information:
   - Adds an entry to the access log on the local disk with additional information about the request.
   - Generates a trace entry and sends it to Trace by using OpenCensus Envoy tracing.
4. The Logging agent exports this data to the Cloud Logging API so that the data becomes available in the Cloud Logging interface.
Prerequisites
Before you complete the setup steps, ensure that the following is done:
- The Traffic Director API is enabled and other prerequisites are met, as described in Prepare to set up with VM and proxyless workloads.
- The Cloud Trace API is enabled.
- The service account that the Compute Engine VM uses has the following Identity and Access Management (IAM) roles configured:
  - Cloud Trace Agent role (roles/cloudtrace.agent)
  - Logs Writer role (roles/logging.logWriter)
- The firewall rules allow traffic to the VM that you configure as part of this setup.
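The role prerequisite can be checked against an exported IAM policy. The following is a minimal sketch, assuming the policy was saved as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`) and that the roles are granted directly to the service account at the project level. Roles inherited from a folder or organization, or granted through a group, would not appear in this check. The service account email and the sample policy are hypothetical.

```python
# Roles that the prerequisites list for the VM's service account.
REQUIRED_ROLES = {"roles/cloudtrace.agent", "roles/logging.logWriter"}


def missing_roles(policy, service_account_email):
    """Return the required roles that are not bound directly to the account."""
    member = f"serviceAccount:{service_account_email}"
    granted = {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }
    return REQUIRED_ROLES - granted


# Hypothetical policy, shaped like the JSON form of a project IAM policy.
sample_policy = {
    "bindings": [
        {
            "role": "roles/cloudtrace.agent",
            "members": ["serviceAccount:demo-vm@example-project.iam.gserviceaccount.com"],
        },
        {
            "role": "roles/logging.logWriter",
            "members": ["user:someone@example.com"],
        },
    ]
}

print(sorted(missing_roles(
    sample_policy, "demo-vm@example-project.iam.gserviceaccount.com")))
```

An empty result means both roles are bound directly to the account. A non-empty result names the roles to grant or to look for at a higher level of the resource hierarchy.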
Set up the demonstration service and Cloud Service Mesh
This document uses several shell scripts to perform the steps required to configure the demonstration service. Review the scripts to understand the specific steps that they perform.
1. Start a Compute Engine VM and configure the HTTP service on the VM:

    curl -sSO https://storage.googleapis.com/traffic-director/demo/observability/setup_demo_service.sh
    chmod 755 setup_demo_service.sh && ./setup_demo_service.sh

   The setup_demo_service.sh script creates a VM template that launches apache2 when a VM starts and a managed instance group that uses this template. The script launches a single instance without autoscaling enabled.

2. Use Cloud Service Mesh to configure routing for the 10.10.10.10 service:

    curl -sSO https://storage.googleapis.com/traffic-director/demo/observability/setup_demo_trafficdirector.sh
    chmod 755 setup_demo_trafficdirector.sh && ./setup_demo_trafficdirector.sh

   The setup_demo_trafficdirector.sh script configures the necessary parameters for the Cloud Service Mesh managed service.

3. Start a Compute Engine VM that runs a consumer of the HTTP service, with the sidecar proxy installed and configured on the VM. In the following command, replace PROJECT_ID with the project ID to which Trace information should be sent. This is typically the same Google Cloud project to which your VM belongs.

    curl -sSO https://storage.googleapis.com/traffic-director/demo/observability/setup_demo_client.sh
    chmod 755 setup_demo_client.sh && ./setup_demo_client.sh PROJECT_ID

   The setup_demo_client.sh script creates a Compute Engine VM that has an Envoy proxy preconfigured to use Cloud Service Mesh.
The following additional configuration settings enable tracing and logging:
- The TRAFFICDIRECTOR_ACCESS_LOG_PATH and TRAFFICDIRECTOR_ENABLE_TRACING bootstrap node metadata variables enable logging and tracing, as described in Configure Envoy bootstrap attributes for Cloud Service Mesh.
- Static bootstrap configuration enables export of trace information to Trace by using OpenCensus.
After running these scripts, you can sign in to the td-observability-demo-client VM and access the HTTP service available at 10.10.10.10:
curl http://10.10.10.10
At this point, Envoy generates access logging and tracing information. The following section describes how to export logs and tracing information.
Set up trace export to Cloud Trace
The Envoy bootstrap configuration that you created when you ran the setup_demo_client.sh script is sufficient to generate tracing information. All other configuration is optional. If you want to configure additional parameters, see the OpenCensus Envoy configuration page and modify the tracing options in the Envoy bootstrap configuration.
After you issue a sample request to the demonstration server (curl 10.10.10.10), in the Google Cloud console, go to the Trace interface (Trace > Trace list).
You see a trace record that corresponds to the request that you issued.
For more information about how to use Trace, see the Cloud Trace documentation.
Set up access log export to Logging
At this stage, Envoy should be recording access log information to the local disk of the VM where it is running. To export these records to Logging, you must install and configure the Logging agent on that VM.
Install the Logging agent
Install the Logging agent on the VM from which logging information is exported. For this example configuration, the VM is td-observability-demo-vm.

curl -sSO https://dl.google.com/cloudagents/add-logging-agent-repo.sh
sudo bash add-logging-agent-repo.sh --also-install
For more information, see Install the Cloud Logging agent on a single VM.
Configure the Logging agent
You can export the Envoy logs as either unstructured or structured text.
Export the Envoy logs as unstructured text
This option exports log records from the access log to Cloud Logging as raw text. Each entry in the access log is exported as a single entry to Logging. This configuration is easier to install because it relies on a parser that is distributed with the current version of the Logging agent. However, it is more difficult to filter and process raw text log entries when using this option.
Download and install the Envoy access log unstructured export configuration file:
curl -sSO https://storage.googleapis.com/traffic-director/demo/observability/envoy_access_fluentd_unstructured.conf
sudo cp envoy_access_fluentd_unstructured.conf /etc/google-fluentd/config.d/envoy_access.conf
Restart the agent; the changes take effect when the agent starts up:
sudo service google-fluentd restart
Export the Envoy logs as structured text
Install the Envoy access log parser from GitHub:
sudo /opt/google-fluentd/embedded/bin/gem install fluent-plugin-envoy-parser
Download and install the configuration file for exporting Envoy access logs in a structured format:
curl -sSO https://storage.googleapis.com/traffic-director/demo/observability/envoy_access_fluentd_structured.conf
sudo cp envoy_access_fluentd_structured.conf /etc/google-fluentd/config.d/envoy_access.conf
Restart the agent; the changes take effect when the agent starts up:
sudo service google-fluentd restart
For more information, see Configure the Logging agent.
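To illustrate the difference between the two export options, the following sketch turns one raw access log line into named fields, which is roughly what a structured export makes available for filtering. It is not the fluent-plugin-envoy-parser implementation. It assumes that Envoy writes its default access log format, which the demonstration setup might override, and the sample line is made up for illustration.

```python
import re

# Leading fields of Envoy's default access log format (assumed here):
# [START_TIME] "METHOD PATH PROTOCOL" CODE FLAGS BYTES_RECEIVED BYTES_SENT DURATION ...
_ACCESS_LOG_RE = re.compile(
    r'\[(?P<start_time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<response_code>\d+) (?P<response_flags>\S+) '
    r'(?P<bytes_received>\d+) (?P<bytes_sent>\d+) (?P<duration_ms>\d+)'
)
_INT_FIELDS = ("response_code", "bytes_received", "bytes_sent", "duration_ms")


def parse_access_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = _ACCESS_LOG_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    for field in _INT_FIELDS:
        entry[field] = int(entry[field])
    return entry


# Made-up sample line in the assumed default format.
sample_line = (
    '[2020-01-15T10:00:00.000Z] "GET / HTTP/1.1" 200 - 0 58 3 2 "-" '
    '"curl/7.64.0" "00000000-0000-0000-0000-000000000000" '
    '"10.10.10.10" "10.128.0.2:80"'
)

entry = parse_access_log_line(sample_line)
print(entry["method"], entry["path"], entry["response_code"])
```

With the unstructured option, the whole line arrives in Logging as a single text payload, so a filter on the response code has to match on substrings. With the structured option, each field can be queried by name.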
Verify the configuration
- From the sidecar proxy VM, generate a request to the demonstration service. This creates a new local log record. For example, you can run curl 10.10.10.10.
- In the Google Cloud console, go to Logging > Logs Explorer. In the drop-down menu, select the envoy-access log type. You see a log entry for the most recent request in the unstructured or structured format, depending on the configuration type that you chose earlier.
Troubleshooting
Read the following sections for information about how to troubleshoot different issues with observability with Envoy.
Configure tracing across multiple projects
If you would like to trace requests across Envoys deployed in multiple projects, note the following:
- Each Envoy must be configured with the credentials of the project where it is running.
- Each Envoy sends trace data to the project that corresponds to the credentials it is running with.
- You can see tracing spans for cross-project requests if your applications preserve the value of the X-Cloud-Trace-Context HTTP header when requests are made.
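Preserving the header means that an application copies the incoming X-Cloud-Trace-Context value onto any outgoing request that it makes while handling that request. The following is a minimal sketch, assuming headers are available as plain dictionaries. The function names are hypothetical and are not part of any Google Cloud library. The header value has the form TRACE_ID/SPAN_ID;o=OPTIONS, where TRACE_ID is a 32-character hexadecimal string.

```python
import re

TRACE_HEADER = "X-Cloud-Trace-Context"

# TRACE_ID is required; "/SPAN_ID" and ";o=OPTIONS" are optional.
_TRACE_RE = re.compile(
    r"^(?P<trace_id>[0-9a-fA-F]{32})"
    r"(?:/(?P<span_id>\d+))?"
    r"(?:;o=(?P<options>\d+))?$"
)


def parse_trace_context(value):
    """Split a header value into its parts, or return None if malformed."""
    match = _TRACE_RE.match(value.strip())
    if match is None:
        return None
    return {
        "trace_id": match.group("trace_id"),
        "span_id": match.group("span_id"),
        "sampled": match.group("options") == "1",
    }


def propagate_trace_header(incoming_headers, outgoing_headers=None):
    """Return outgoing headers with the incoming trace header copied over."""
    outgoing = dict(outgoing_headers or {})
    for name, value in incoming_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == TRACE_HEADER.lower():
            outgoing[TRACE_HEADER] = value
    return outgoing


incoming = {
    "host": "10.10.10.10",
    "x-cloud-trace-context": "105445aa7843bc8bf206b12000100000/1;o=1",
}
outgoing = propagate_trace_header(incoming, {"Accept": "text/plain"})
print(outgoing[TRACE_HEADER])
print(parse_trace_context(outgoing[TRACE_HEADER]))
```

Copying the value unchanged keeps the same trace ID on both sides of the project boundary, so spans written to different projects can still be correlated by trace ID.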
Trace compatibility with proxyless gRPC applications
Envoy's OpenCensus tracer configuration allows traces exported from proxyless gRPC applications and Envoy proxies to be fully compatible within a service mesh. For compatibility, the Envoy bootstrap must configure the trace context to include the GRPC_TRACE_BIN trace format in its OpenCensusConfig, as follows:

tracing:
  http:
    name: envoy.tracers.opencensus
    typed_config:
      "@type": type.googleapis.com/envoy.config.trace.v2.OpenCensusConfig
      stackdriver_exporter_enabled: "true"
      stackdriver_project_id: "PROJECT_ID"
      incoming_trace_context: ["CLOUD_TRACE_CONTEXT", "GRPC_TRACE_BIN"]
      outgoing_trace_context: ["CLOUD_TRACE_CONTEXT", "GRPC_TRACE_BIN"]

If configuration is complete, but you don't see trace or logging entries available, verify the following:
- The service accounts for the Compute Engine VM have the necessary Trace and Logging IAM permissions, as specified in the prerequisites. For information about Trace IAM permissions, see Access control. For information about Logging permissions, see Access control.
- For logging: Ensure that there are no errors in /var/log/google-fluentd/google-fluentd.log.
- For logging: Ensure that new entries appear in the local access log file when requests are issued.
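The check of the agent log can be scripted. The following is a minimal sketch, assuming the agent writes Fluentd-style lines that carry a level tag such as [info], [warn], or [error]. The sample lines are made up, and the helper names are hypothetical.

```python
import re
from pathlib import Path

# Fluentd-style log lines carry a level tag; only errors and fatals are reported.
_PROBLEM_LEVEL = re.compile(r"\[(error|fatal)\]")


def problem_lines(log_text):
    """Return the lines of log_text that are tagged [error] or [fatal]."""
    return [line for line in log_text.splitlines() if _PROBLEM_LEVEL.search(line)]


def scan_agent_log(path="/var/log/google-fluentd/google-fluentd.log"):
    """Scan the agent log on disk; return [] if the file does not exist."""
    log_file = Path(path)
    if not log_file.exists():
        return []
    return problem_lines(log_file.read_text(errors="replace"))


# Made-up sample lines for illustration.
sample_log = """\
2020-01-15 10:00:00 +0000 [info]: #0 starting the agent
2020-01-15 10:00:05 +0000 [error]: #0 unexpected error while reading a file
2020-01-15 10:00:06 +0000 [warn]: #0 retrying
"""

for line in problem_lines(sample_log):
    print(line)
```

On the VM, calling scan_agent_log() with its default path applies the same filter to the real agent log. Reading the file might require elevated permissions.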
What's next
- To find related information, see Observability with proxyless gRPC applications.
- To find general Cloud Service Mesh troubleshooting information, see Troubleshooting deployments that use Envoy.