Get started with the OpenTelemetry Collector

This document describes how to set up the OpenTelemetry Collector to scrape standard Prometheus metrics and report those metrics to Google Cloud Managed Service for Prometheus. The OpenTelemetry Collector is an agent that you can deploy yourself and configure to export to Managed Service for Prometheus. The set-up is similar to running Managed Service for Prometheus with self-deployed collection.

You might choose the OpenTelemetry Collector over self-deployed collection for the following reasons:

  • The OpenTelemetry Collector allows you to route your telemetry data to multiple backends by configuring different exporters in your pipeline.
  • The Collector also supports signals from metrics, logs, and traces, so by using it you can handle all three signal types in one agent.
  • OpenTelemetry's vendor-agnostic data format (the OpenTelemetry Protocol, or OTLP) supports a strong ecosystem of libraries and pluggable Collector components. This allows for a range of customizability options for receiving, processing, and exporting your data.

The trade-off for these benefits is that running an OpenTelemetry Collector requires a self-managed deployment and maintenance approach. Which approach you choose will depend on your specific needs, but in this document we offer recommended guidelines for configuring the OpenTelemetry Collector using Managed Service for Prometheus as a backend.

Before you begin

This section describes the configuration needed for the tasks described in this document.

Set up projects and tools

To use Google Cloud Managed Service for Prometheus, you need the following resources:

  • A Google Cloud project with the Cloud Monitoring API enabled.

    • If you don't have a Google Cloud project, then do the following:

      1. In the Google Cloud console, go to New Project:

        Create a New Project

      2. In the Project Name field, enter a name for your project and then click Create.

      3. Go to Billing:

        Go to Billing

      4. Select the project you just created if it isn't already selected at the top of the page.

      5. You are prompted to choose an existing payments profile or to create a new one.

      The Monitoring API is enabled by default for new projects.

    • If you already have a Google Cloud project, then ensure that the Monitoring API is enabled:

      1. Go to APIs & services:

        Go to APIs & services

      2. Select your project.

      3. Click Enable APIs and Services.

      4. Search for "Monitoring".

      5. In the search results, click through to "Cloud Monitoring API".

      6. If "API enabled" is not displayed, then click the Enable button.

  • A Kubernetes cluster. If you do not have a Kubernetes cluster, then follow the instructions in the Quickstart for GKE.

You also need the following command-line tools:

  • gcloud
  • kubectl

The gcloud and kubectl tools are part of the Google Cloud CLI. For information about installing them, see Managing Google Cloud CLI components. To see the gcloud CLI components you have installed, run the following command:

gcloud components list

Configure your environment

To avoid repeatedly entering your project ID or cluster name, perform the following configuration:

  • Configure the command-line tools as follows:

    • Configure the gcloud CLI to refer to the ID of your Google Cloud project:

      gcloud config set project PROJECT_ID
      
    • Configure the kubectl CLI to use your cluster:

      kubectl config set-cluster CLUSTER_NAME
      

    For more information about these tools, see the following:

Set up a namespace

Create the NAMESPACE_NAME Kubernetes namespace for resources you create as part of the example application:

kubectl create ns NAMESPACE_NAME

Verify service account credentials

You can skip this section if your Kubernetes cluster has Workload Identity enabled.

When running on GKE, Managed Service for Prometheus automatically retrieves credentials from the environment based on the Compute Engine default service account. The default service account has the necessary permissions, monitoring.metricWriter and monitoring.viewer, by default. If you do not use Workload Identity, and you have previously removed either of those roles from the default node service account, you will have to re-add those missing permissions before continuing.

If you are not running on GKE, see Provide credentials explicitly.

Configure a service account for Workload Identity

You can skip this section if your Kubernetes cluster does not have Workload Identity enabled.

Managed Service for Prometheus captures metric data by using the Cloud Monitoring API. If your cluster is using Workload Identity, you must grant your Kubernetes service account permission to the Monitoring API. This section describes the following:

Create and bind the service account

This step appears in several places in the Managed Service for Prometheus documentation. If you have already performed this step as part of a prior task, then you do not need to repeat it. Skip ahead to Authorize the service account.

The following command sequence creates the gmp-test-sa service account and binds it to the default Kubernetes service account in the NAMESPACE_NAME namespace:

gcloud config set project PROJECT_ID \
&&
gcloud iam service-accounts create gmp-test-sa \
&&
gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE_NAME/default]" \
  gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
&&
kubectl annotate serviceaccount \
  --namespace NAMESPACE_NAME \
  default \
  iam.gke.io/gcp-service-account=gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com

If you are using a different GKE namespace or service account, adjust the commands appropriately.

Authorize the service account

Groups of related permissions are collected into roles, and you grant the roles to a principal, in this example, the Google Cloud service account. For more information about Monitoring roles, see Access control.

The following command grants the Google Cloud service account, gmp-test-sa, the Monitoring API roles it needs to write metric data.

If you have already granted the Google Cloud service account a specific role as part of prior task, then you don't need to do it again.

gcloud projects add-iam-policy-binding PROJECT_ID\
  --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/monitoring.metricWriter

Workload Identity in production environments

The example described in this document binds the Google Cloud service account to the default Kubernetes service account and gives Google Cloud service account all necessary permission to use the Monitoring API.

In a production environment, you might want to use a finer-grained approach, with a service account for each component, each with minimal permissions. For more information on configuring service accounts for workload-identity management, see Using Workload Identity.

Set up the OpenTelemetry Collector

This section guides you through setting up and using the OpenTelemetry collector to scrape metrics from an example application and send the data to Google Cloud Managed Service for Prometheus. For detailed configuration information, see the following sections:

The OpenTelemetry Collector is analogous to the Managed Service for Prometheus agent binary. The OpenTelemetry community regularly publishes releases including source code, binaries, and container images.

You can either deploy these artifacts on VMs or Kubernetes clusters using the best-practice defaults, or you can use the collector builder to build your own collector consisting of only the components you need. To build a collector for use with Managed Service for Prometheus, you need the following components:

  • The Managed Service for Prometheus exporter, which writes your metrics to Managed Service for Prometheus.
  • A receiver to scrape your metrics. This document assumes that you are using the OpenTelemetry Prometheus receiver, but the Managed Service for Prometheus exporter is compatible with any OpenTelemetry metrics receiver.
  • Processors to batch and mark up your metrics to include important resource identifiers depending on your environment.

These components are enabled by using a configuration file that is passed to the Collector with the --config flag.

The following sections discuss how to configure each of these components in more detail. This document describes how to run the collector on GKE and elsewhere.

Configure and deploy the Collector

Whether you are running your collection on Google Cloud or in another environment, you can still configure the OpenTelemetry Collector to export to Managed Service for Prometheus. The biggest difference will be in how you configure the Collector. In non-Google Cloud environments, there may be additional formatting of metric data that is needed for it to be compatible with Managed Service for Prometheus. On Google Cloud, however, much of this formatting can be automatically detected by the Collector.

Run the OpenTelemetry Collector on GKE

You can copy the following config into a file called config.yaml to set up the OpenTelemetry Collector on GKE:

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: 'SCRAPE_JOB_NAME'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
          action: keep
          regex: prom-example
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: (.+):(?:\d+);(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

  transform:
    # "location", "cluster", "namespace", "job", "instance", and "project_id" are reserved, and
    # metrics containing these labels will be rejected.  Prefix them with exported_ to prevent this.
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_location"], attributes["location"])
      - delete_key(attributes, "location")
      - set(attributes["exported_cluster"], attributes["cluster"])
      - delete_key(attributes, "cluster")
      - set(attributes["exported_namespace"], attributes["namespace"])
      - delete_key(attributes, "namespace")
      - set(attributes["exported_job"], attributes["job"])
      - delete_key(attributes, "job")
      - set(attributes["exported_instance"], attributes["instance"])
      - delete_key(attributes, "instance")
      - set(attributes["exported_project_id"], attributes["project_id"])
      - delete_key(attributes, "project_id")

  batch:
    # batch metrics before sending to reduce API usage
    send_batch_max_size: 200
    send_batch_size: 200
    timeout: 5s

  memory_limiter:
    # drop metrics if memory usage gets too high
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20

# Note that the googlemanagedprometheus exporter block is intentionally blank
exporters:
  googlemanagedprometheus:

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch, memory_limiter, resourcedetection, transform]
      exporters: [googlemanagedprometheus]

The preceding config uses the Prometheus receiver and the Managed Service for Prometheus exporter to scrape the metrics endpoints on Kubernetes Pods and export those metrics to Managed Service for Prometheus. The pipeline processors format and batch the data.

For more details on what each part of this config does, along with configurations for different platforms, see the detailed sections below on scraping metrics and adding processors.

When running in clusters with an existing Prometheus configuration, replace any $ characters with $$ to avoid triggering environment variable substitution. For more information, see Scrape Prometheus metrics.

You can modify this config based on your environment, provider, and the metrics you want to scrape, but the example config is a recommended starting point for running on GKE.

Run the OpenTelemetry Collector outside Google Cloud

Running the OpenTelemetry Collector outside Google Cloud, such as on-premises or on other cloud providers, is similar to running the Collector on GKE. However, the metrics you scrape are less likely to automatically include data that best formats it for Managed Service for Prometheus. Therefore, you must take extra care to configure the collector to format the metrics so they are compatible with Managed Service for Prometheus.

You can the following config into a file called config.yaml to set up the OpenTelemetry Collector for deployment on a non-GKE Kubernetes cluster:

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: 'SCRAPE_JOB_NAME'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
          action: keep
          regex: prom-example
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: (.+):(?:\d+);(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)

processors:
  resource:
    attributes:
    - key: "cluster"
      value: "CLUSTER_NAME"
      action: upsert
    - key: "namespace"
      value: "NAMESPACE_NAME"
      action: upsert
    - key: "location"
      value: "REGION"
      action: upsert

  transform:
    # "location", "cluster", "namespace", "job", "instance", and "project_id" are reserved, and
    # metrics containing these labels will be rejected.  Prefix them with exported_ to prevent this.
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_location"], attributes["location"])
      - delete_key(attributes, "location")
      - set(attributes["exported_cluster"], attributes["cluster"])
      - delete_key(attributes, "cluster")
      - set(attributes["exported_namespace"], attributes["namespace"])
      - delete_key(attributes, "namespace")
      - set(attributes["exported_job"], attributes["job"])
      - delete_key(attributes, "job")
      - set(attributes["exported_instance"], attributes["instance"])
      - delete_key(attributes, "instance")
      - set(attributes["exported_project_id"], attributes["project_id"])
      - delete_key(attributes, "project_id")

  batch:
    # batch metrics before sending to reduce API usage
    send_batch_max_size: 200
    send_batch_size: 200
    timeout: 5s

  memory_limiter:
    # drop metrics if memory usage gets too high
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20

exporters:
  googlemanagedprometheus:
    project: "PROJECT_ID"

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch, memory_limiter, resource, transform]
      exporters: [googlemanagedprometheus]

This config does the following:

When running in clusters with an existing Prometheus configuration, replace any $ characters with $$ to avoid triggering environment variable substitution. For more information, see Scrape Prometheus metrics.

For information about best practices for configuring the Collector on other clouds, see Amazon EKS or Azure AKS.

Deploy the example application

The example application emits the example_requests_total counter metric and the example_random_numbers histogram metric (among others) on its metrics port. The manifest for this example defines three replicas.

To deploy the example application, run the following command:

kubectl -n NAMESPACE_NAME apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.7.0/examples/example-app.yaml

Create your collector config as a ConfigMap

After you have created your config and placed it in a file called config.yaml, use that file to create a Kubernetes ConfigMap based on your config.yaml file. When the collector is deployed, it mounts the ConfigMap and loads the file.

To create a ConfigMap named otel-config with your config, use the following command:

kubectl -n NAMESPACE_NAME create configmap otel-config --from-file config.yaml

Deploy the collector

Create a file called collector-deployment.yaml with the following content:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: NAMESPACE_NAME:prometheus-test
rules:
- apiGroups: [""]
  resources:
  - pods
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: NAMESPACE_NAME:prometheus-test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: NAMESPACE_NAME:prometheus-test
subjects:
- kind: ServiceAccount
  namespace: NAMESPACE_NAME
  name: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
      - name: otel-collector
        image: otel/opentelemetry-collector-contrib:0.74.0
        args:
        - --config
        - /etc/otel/config.yaml
        volumeMounts:
        - mountPath: /etc/otel/
          name: otel-config
      volumes:
      - name: otel-config
        configMap:
          name: otel-config

Create the Collector deployment in your Kubernetes cluster by running the following command:

kubectl -n NAMESPACE_NAME create -f collector-deployment.yaml

After the pod starts, it scrapes the sample application and reports metrics to Managed Service for Prometheus.

For information about ways to query your data, see Configure a query interface.

Provide credentials explicitly

When running on GKE, the OpenTelemetry Collector automatically retrieves credentials from the environment based on the node's service account. In non-GKE Kubernetes clusters, credentials must be explicitly provided to the OpenTelemetry Collector by using flags or the GOOGLE_APPLICATION_CREDENTIALS environment variable.

  1. Set the context to your target project:

    gcloud config set project PROJECT_ID
    
  2. Create a service account:

    gcloud iam service-accounts create gmp-test-sa
    

    This step creates the service account that you might have already created in the Workload Identity instructions.

  3. Grant the required permissions to the service account:

    gcloud projects add-iam-policy-binding PROJECT_ID\
      --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
      --role=roles/monitoring.metricWriter
    

  4. Create and download a key for the service account:

    gcloud iam service-accounts keys create gmp-test-sa-key.json \
      --iam-account=gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com
    
  5. Add the key file as a secret to your non-GKE cluster:

    kubectl -n NAMESPACE_NAME create secret generic gmp-test-sa \
      --from-file=key.json=gmp-test-sa-key.json
    

  6. Open the OpenTelemetry Deployment resource for editing:

    kubectl -n NAMESPACE_NAME edit deployment otel-collector
    
  7. Add the text shown in bold to the resource:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      namespace: NAMESPACE_NAME
      name: otel-collector
    spec:
      template
        spec:
          containers:
          - name: otel-collector
            env:
            - name: "GOOGLE_APPLICATION_CREDENTIALS"
              value: "/gmp/key.json"
    ...
            volumeMounts:
            - name: gmp-sa
              mountPath: /gmp
              readOnly: true
    ...
          volumes:
          - name: gmp-sa
            secret:
              secretName: gmp-test-sa
    ...
    

  8. Save the file and close the editor. After the change is applied, the pods are re-created and start authenticating to the metric backend with the given service account.

Scrape Prometheus metrics

This section and the subsequent section provide additional customization information for using the OpenTelemetry Collector. This information might be helpful in certain situations, but none of it is necessary to run the example described in Set up the OpenTelemetry Collector.

If your applications are already exposing Prometheus endpoints, the OpenTelemetry Collector can scrape those endpoints using the same scrape config format you would use with any standard Prometheus config. To do this, enable the Prometheus receiver in your collector config.

A simple Prometheus receiver config for Kubernetes pods might look like the following:

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: (.+):(?:\d+);(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)

service:
  pipelines:
    metrics:
      receivers: [prometheus]

This is a simple service discovery-based scrape config that you can modify as needed to scrape your applications.

When running in clusters with an existing Prometheus configuration, replace any $ characters with $$ to avoid triggering environment variable substitution. This is especially important to do for the replacement value within your relabel_configs section. For example, if you have the following relabel_config section:

- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  regex: (.+):(?:\d+);(\d+)
  replacement: $1:$2
  target_label: __address__

Then rewrite it to be:

- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  regex: (.+):(?:\d+);(\d+)
  replacement: $$1:$$2
  target_label: __address__

For more information, see the OpenTelemetry documentation.

Next, we strongly recommend that you use processors to format your metrics. In many cases, processors must be used to properly format your metrics.

Add processors

OpenTelemetry processors modify telemetry data before it is exported. You can use the processors below to ensure that your metrics are written in a format compatible with Managed Service for Prometheus.

Detect resource attributes

The Managed Service for Prometheus exporter for OpenTelemetry uses the prometheus_target monitored resource to uniquely identify time series data points. The exporter parses the required monitored-resource fields from resource attributes on the metric data points. The fields and the attributes from which the values are scraped are:

  • project_id: auto-detected by Application Default Credentials, gcp.project.id, or project in exporter config (see configuring the exporter)
  • location: location, cloud.availability_zone, cloud.region
  • cluster: cluster, k8s.cluster_name
  • namespace: namespace, k8s.namespace_name
  • job: service.name + service.namespace
  • instance: service.instance.id

Failure to set these labels to unique values can result in "duplicate timeseries" errors when exporting to Managed Service for Prometheus.

The Prometheus receiver automatically sets the service.name attribute based on the job_name in the scrape config, and service.instance.id attribute based on the scrape target's instance. The receiver also sets k8s.namespace.name when using role: pod in the scrape config.

We recommend populating the other attributes automatically using the resource detection processor. However, depending on your environment, some attributes might not be automatically detectable. In this case, you can use other processors to either manually insert these values or parse them from metric labels. The following sections illustration configurations for doing this processing on various platforms

GKE

When running OpenTelemetry on GKE, you only need to enable the resource-detection processor to fill out the resource labels. Be sure that your metrics don't already contain any of the reserved resource labels. If this is unavoidable, see Avoid resource attribute collisions by renaming attributes.

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

This section can be copied directly into your config file, replacing the processors section if it already exists.

Amazon EKS

The EKS resource detector does not automatically fill in the cluster or namespace attributes. You can provide these values manually by using the resource processor, as shown in the following example:

processors:
  resourcedetection:
    detectors: [eks]
    timeout: 10s

  resource:
    attributes:
    - key: "cluster"
      value: "my-eks-cluster"
      action: upsert
    - key: "namespace"
      value: "my-app"
      action: upsert

You can also convert these values from metric labels using the groupbyattrs processor (see move metric labels to resource labels below).

Azure AKS

The AKS resource detector does not automatically fill in the cluster or namespace attributes. You can provide these values manually by using the resource processor, as shown in the following example:

processors:
  resourcedetection:
    detectors: [aks]
    timeout: 10s

  resource:
    attributes:
    - key: "cluster"
      value: "my-eks-cluster"
      action: upsert
    - key: "namespace"
      value: "my-app"
      action: upsert

You can also convert these values from metric labels by using the groupbyattrs processor; see Move metric labels to resource labels.

On-premises and non-cloud environments

With on-premises or non-cloud environments, you probably can't detect any of the necessary resource attributes automatically. In this case, you can emit these labels in your metrics and move them to resource attributes (see Move metric labels to resource labels), or manually set all of the resource attributes as shown in the following example:

processors:
  resource:
    attributes:
    - key: "cluster"
      value: "my-on-prem-cluster"
      action: upsert
    - key: "namespace"
      value: "my-app"
      action: upsert
    - key: "location"
      value: "us-east-1"
      action: upsert

Create your collector config as a ConfigMap describes how to use the config. That section assumes you have put your config in a file called config.yaml.

The project_id resource attribute can still be automatically set when running the Collector with Application Default Credentials. If your Collector does not have access to Application Default Credentials, see Setting project_id.

Alternatively, you can manually set the resource attributes you need in an environment variable, OTEL_RESOURCE_ATTRIBUTES, with a comma-separated list of key/value pairs, for example:

export OTEL_RESOURCE_ATTRIBUTES="cluster=my-cluster,namespace=my-app,location=us-east-1"

Then use the env resource detector processor to set the resource attributes:

processors:
  resourcedetection:
    detectors: [env]

Avoid resource attribute collisions by renaming attributes

If your metrics already contain labels that collide with the required resource attributes (such as location, cluster, or namespace), rename them to avoid the collision. The Prometheus convention is to add the prefix exported_ to the label name. To add this prefix, use the transform processor.

The following processors config renames any potential collisions and resolves any conflicting keys from the metric:

processors:
  transform:
    # "location", "cluster", "namespace", "job", "instance", and "project_id" are reserved, and
    # metrics containing these labels will be rejected.  Prefix them with exported_ to prevent this.
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_location"], attributes["location"])
      - delete_key(attributes, "location")
      - set(attributes["exported_cluster"], attributes["cluster"])
      - delete_key(attributes, "cluster")
      - set(attributes["exported_namespace"], attributes["namespace"])
      - delete_key(attributes, "namespace")
      - set(attributes["exported_job"], attributes["job"])
      - delete_key(attributes, "job")
      - set(attributes["exported_instance"], attributes["instance"])
      - delete_key(attributes, "instance")
      - set(attributes["exported_project_id"], attributes["project_id"])
      - delete_key(attributes, "project_id")

Move metric labels to resource labels

In some cases, your metrics might be intentionally reporting labels such as namespace because your exporter is monitoring multiple namespaces. For example, when running the kube-state-metrics exporter.

In this scenario, these labels can be moved to resource attributes using the groupbyattrs processor:

processors:
  groupbyattrs:
    keys:
    - namespace
    - cluster
    - location

In the above example, given a metric with the labels namespace, cluster, and/or location, those labels will be converted to the matching resource attributes.

Limit API requests and memory usage

Two other processors, the batch processor and memory limiter processor allow you to limit the resource consumption of your collector.

Batch processing

Batching requests lets you define how many data points to send in a single request. Note that Cloud Monitoring has a limit of 200 time series per request. Enable the batch processor by using the following settings:

processors:
  batch:
    # batch metrics before sending to reduce API usage
    send_batch_max_size: 200
    send_batch_size: 200
    timeout: 5s

Memory limiting

We recommend enabling the memory-limiter processor to prevent your collector from crashing at times of high throughput. Enable the processing by using the following settings:

processors:
  memory_limiter:
    # drop metrics if memory usage gets too high
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20

Configure the googlemanagedprometheus exporter

By default, using the googlemanagedprometheus exporter on GKE requires no additional configuration. For many use cases you only need to enable it with an empty block in the exporters section:

exporters:
  googlemanagedprometheus:

However, the exporter does provide some optional configuration settings. The following sections describe the other configuration settings.

Setting project_id

To associate your time series with a Google Cloud project, the prometheus_target monitored resource must have project_id set.

When running OpenTelemetry on Google Cloud, the Managed Service for Prometheus exporter defaults to setting this value based on the Application Default Credentials it finds. If no credentials are available, or you want to override the default project, you have two options:

  • Set project in the exporter config
  • Add a gcp.project.id resource attribute to your metrics.

We strongly recommend using the default (unset) value for project_id rather than explicitly setting it, when possible.

Set project in the exporter config

The following config excerpt sends metrics to Managed Service for Prometheus in the Google Cloud project MY_PROJECT:

receivers:
  prometheus:
    config:
    ...

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

exporters:
  googlemanagedprometheus:
    project: MY_PROJECT

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection]
      exporters: [googlemanagedprometheus]

The only change from previous examples is the new line project: MY_PROJECT. This setting is useful if you know that every metric coming through this Collector should be sent to MY_PROJECT.

Set gcp.project.id resource attribute

You can set project association on a per-metric basis by adding a gcp.project.id resource attribute to your metrics. Set the value of the attribute to the name of the project the metric should be associated with.

For example, if your metric already has a label project, this label can be moved to a resource attribute and renamed to gcp.project.id by using processors in the Collector config, as shown in the following example:

receivers:
  prometheus:
    config:
    ...

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

  groupbyattrs:
    keys:
    - project

  resource:
    attributes:
    - key: "gcp.project.id"
      from_attribute: "project"
      action: upsert

exporters:
  googlemanagedprometheus:

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection, groupbyattrs, resource]
      exporters: [googlemanagedprometheus]

Setting client options

The googlemanagedprometheus exporter uses gRPC clients for Managed Service for Prometheus. Therefore, optional settings are available for configuring the gRPC client:

  • compression: Enables gzip compression for gRPC requests, which is useful for minimizing egress fees when sending data from other clouds to Managed Service for Prometheus (valid values: gzip).
  • user_agent: Overrides the user-agent string sent on requests to Cloud Monitoring; currently only applies to metrics. Defaults to the build and version number of your OpenTelemetry Collector, for example, opentelemetry-collector-contrib 0.74.0.
  • endpoint: Sets the endpoint to which metric data is going to be sent.
  • use_insecure: If true, uses gRPC as the communication transport. Has an effect only when the endpoint value is not "".
  • grpc_pool_size: Sets the size of the connection pool in the gRPC client.
  • prefix: Configures the prefix of metrics sent to Managed Service for Prometheus. Defaults to prometheus.googleapis.com. Don't change this prefix; doing so causes metrics to not be queryable with PromQL in the Cloud Monitoring UI.

In most cases, you don't need to change these values from their defaults. However, you can change them to accommodate special circumstances.

All of these settings are set under a metric block in the googlemanagedprometheus exporter section, as shown in the following example:

receivers:
  prometheus:
    config:
    ...

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

exporters:
  googlemanagedprometheus:
    metric:
      compression: gzip
      user_agent: opentelemetry-collector-contrib 0.74.0
      endpoint: ""
      use_insecure: false
      grpc_pool_size: 1
      prefix: prometheus.googleapis.com

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection]
      exporters: [googlemanagedprometheus]

What's next

  • Query the Prometheus metrics collected by using the service.
  • Set up the OpenTelemetry Collector as a sidecar agent in Cloud Run.
  • The Cloud Monitoring Metrics Management page provides information that can help you control the amount you spend on chargeable metrics without affecting observability. The Metrics Management page reports the following information:

    • Ingestion volumes for both byte- and sample-based billing, across metric domains and for individual metrics.
    • Data about labels and cardinality of metrics.
    • Use of metrics in alerting policies and custom dashboards.
    • Rate of metric-write errors.
    For more information about the Metrics Management page, see View metric usage and diagnostics.