Instrumentation and observability

Instrumentation is the code that generates or collects data about your application's runtime behavior. Within your application, instrumentation such as that provided by OpenTelemetry can collect domain-specific information about the language runtime, framework, or application logic, and then send that data to a Google Cloud project or to another destination. This data, which is also known as telemetry, includes metrics, logs, and traces.

System-level metrics such as CPU usage, memory usage, and disk usage are valuable for detecting problems with your application, but they don't provide much insight into application-level concerns. Instrumentation can help your application generate the data that you need to diagnose the root cause of a problem, because the resulting telemetry shows you what is happening inside your application. For example, logs often include context about your program, such as a specific error message, a stack trace, and the location in your source code. Similarly, distributed traces help you understand how multiple services interact when processing requests. Metrics let you detect, and be notified, when your application isn't behaving correctly.

Instrumenting your application involves generating telemetry and sending it to a destination where the data can be stored and queried. For example, your instrumentation might send telemetry to a Google Cloud project. Services in Google Cloud Observability help you collect, analyze, and correlate telemetry data. They also provide built-in defaults, such as default dashboards and alert policies, to help you get started faster. For more information about Google Cloud Observability, see Observability in Google Cloud.

The following figure illustrates how an application uses instrumentation to generate and send telemetry to a storage system:

Figure: Architecture of in-process instrumentation.

As illustrated in the previous figure, the instrumentation code exists within your application's process and interacts with the application to generate telemetry data. The instrumentation framework then exports your telemetry to a configured storage system. In the figure, the storage system is your Google Cloud project.
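
The in-process flow described above can be sketched with the Python standard library alone. This is a simplified illustration, not how OpenTelemetry or any other framework is implemented: the names `Span`, `Exporter`, `InMemoryExporter`, `Instrumentation`, and `checkout` are all hypothetical, and the exporter keeps data in memory rather than sending it to a real storage system.

```python
import time
from dataclasses import dataclass, field
from functools import wraps
from typing import Callable, Protocol


@dataclass
class Span:
    """One timed operation; a simplified stand-in for a trace span."""
    name: str
    duration_s: float
    attributes: dict = field(default_factory=dict)


class Exporter(Protocol):
    """Anything that can deliver spans to a destination."""
    def export(self, spans: list[Span]) -> None: ...


class InMemoryExporter:
    """Hypothetical exporter that stores spans in a list instead of sending them."""
    def __init__(self) -> None:
        self.received: list[Span] = []

    def export(self, spans: list[Span]) -> None:
        self.received.extend(spans)


class Instrumentation:
    """Records spans inside the application process and hands them to an exporter."""
    def __init__(self, exporter: Exporter) -> None:
        self._exporter = exporter
        self._buffer: list[Span] = []

    def traced(self, name: str) -> Callable:
        """Decorator that times each call and records its outcome."""
        def decorator(func: Callable) -> Callable:
            @wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                status = "ok"
                try:
                    return func(*args, **kwargs)
                except Exception:
                    status = "error"
                    raise
                finally:
                    elapsed = time.perf_counter() - start
                    self._buffer.append(Span(name, elapsed, {"status": status}))
            return wrapper
        return decorator

    def flush(self) -> None:
        """Send buffered spans to the configured destination and clear the buffer."""
        spans, self._buffer = self._buffer, []
        self._exporter.export(spans)


exporter = InMemoryExporter()
instrumentation = Instrumentation(exporter)


@instrumentation.traced("checkout")
def checkout(cart_total: float) -> float:
    # Application logic; the decorator observes it without changing its result.
    return round(cart_total * 1.08, 2)


checkout(100.0)
instrumentation.flush()
print([(span.name, span.attributes["status"]) for span in exporter.received])
```

Because the application code depends only on the `Exporter` interface, pointing the telemetry at a different destination in this sketch means passing a different exporter object, without touching `checkout`.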

About vendor-neutral instrumentation frameworks

Even if you plan to send telemetry only to Google Cloud, we recommend that you use a vendor-neutral open source instrumentation framework to instrument your applications. These types of frameworks have some key benefits:

No vendor lock-in
Vendor-neutral frameworks aren't tied to any particular vendor, and they provide their own data model for the generated telemetry. Therefore, you can send data to multiple vendors, and you can usually change which vendor you use without modifying your code.
Standardized procedures for collecting telemetry
Well-designed frameworks, such as OpenTelemetry, provide a standardized approach to collecting telemetry from applications. You can use the same framework for applications written in any supported language. And because the framework is standardized, you can collect and compare the telemetry from all of your services.
Interoperable libraries
Instrumentation frameworks include a rich ecosystem of libraries that gather telemetry signals, and these libraries are interoperable. For example, OpenTelemetry provides libraries to collect trace data and to collect metric data. You can use either library, or both libraries.

General recommendations

This section contains general recommendations about how to instrument your application. For guidance that is specific to Google Cloud, see Choose an instrumentation approach.

To collect metrics, we recommend that you use OpenTelemetry or Prometheus:

  • OpenTelemetry is an open source project that provides a unified framework for application instrumentation. It also provides instrumentation libraries for popular third-party libraries and frameworks. OpenTelemetry provides a standalone agent, the OpenTelemetry Collector, that can receive, transform, and export telemetry. The Collector's configuration file determines its behavior. To send telemetry to an agent or directly to a storage system, use the OpenTelemetry Protocol (OTLP).

  • Prometheus is a popular open source monitoring system. You can use the Prometheus client libraries to generate metrics from your application, and there is a third-party ecosystem of instrumentation libraries for popular frameworks. Prometheus clients expose their metrics as an HTTP endpoint that can be scraped by an agent.
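
To make the scrape model concrete, the following sketch serves counter values in the Prometheus text exposition format using only the Python standard library. In a real application you would typically use a Prometheus client library instead. The metric name, label values, counts, port, and the helper names `render_metrics`, `MetricsHandler`, and `serve` are all assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter values, keyed by their label pairs.
REQUEST_COUNTS = {
    (("method", "GET"), ("status", "200")): 42,
    (("method", "POST"), ("status", "500")): 3,
}


def render_metrics(counts: dict) -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for labels, value in sorted(counts.items()):
        label_text = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"http_requests_total{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers scrape requests on /metrics with the current counter values."""

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, serving metrics for an agent to scrape (not called here)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


print(render_metrics(REQUEST_COUNTS), end="")
```

The application only exposes its current values; the scraping agent decides how often to collect them and where to send them.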

To collect traces, we recommend that you use OpenTelemetry.

To collect logs, we recommend that you use a framework that can be configured to output JSON-structured logs for Cloud Logging. For writing log data, we recommend the following:
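
As a rough illustration of JSON-structured output in general, the following sketch configures Python's standard `logging` module to emit one JSON object per log entry. It assumes the `severity` and `message` field names that Cloud Logging's structured-logging documentation describes, so verify them against the current documentation. The `JsonFormatter` class, the `logger` and `stack_trace` field names, and the `checkout` logger name are hypothetical.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Formats each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            # Assumed field names; check them against the Cloud Logging docs.
            "severity": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
        }
        if record.exc_info:
            entry["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(entry)


# A real service would usually write to stdout or stderr; an in-memory
# stream is used here so that the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # Avoid duplicate output through the root logger.

logger.warning("payment retry %d of %d", 2, 3)

entry = json.loads(stream.getvalue())
print(entry["severity"], entry["message"])
```

Emitting one JSON object per line keeps each entry machine-parseable, so fields like the severity can be queried without parsing free-form text.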

Google Cloud solutions

Google Cloud Observability provides flexible options for collecting telemetry:

What's next

For more information about Google Cloud Observability, see Observability in Google Cloud.