Get started with the OpenTelemetry Collector

This document describes how to set up the OpenTelemetry Collector to scrape standard Prometheus metrics and report those metrics to Google Cloud Managed Service for Prometheus. The OpenTelemetry Collector is an agent that you can deploy yourself and configure to export to Managed Service for Prometheus. The setup is similar to running Managed Service for Prometheus with self-deployed collection.

You might choose the OpenTelemetry Collector over self-deployed collection for the following reasons:

- The OpenTelemetry Collector lets you route your telemetry data to multiple backends by configuring different exporters in your pipeline.
- The Collector also supports signals from metrics, logs, and traces, so a single agent can handle all three signal types.
- OpenTelemetry's vendor-agnostic data format (the OpenTelemetry Protocol, or OTLP) supports a strong ecosystem of libraries and pluggable Collector components. This gives you a range of customization options for receiving, processing, and exporting your data.

The trade-off for these benefits is that running an OpenTelemetry Collector requires a self-managed deployment and maintenance approach. Which approach you choose depends on your specific needs. This document offers recommended guidelines for configuring the OpenTelemetry Collector with Managed Service for Prometheus as a backend.

Before you begin

This section describes the configuration needed for the tasks described in this document.

Set up projects and tools

To use Google Cloud Managed Service for Prometheus, you need the following resources:

A Google Cloud project with the Cloud Monitoring API enabled.
- If you don't have a Google Cloud project, then do the following:
  1. In the Google Cloud console, go to New Project.
  2. In the Project Name field, enter a name for your project and then click Create.
  3. Go to Billing.
  4. Select the project you just created if it isn't already selected at the top of the page.
  5. You are prompted to choose an existing payments profile or to create a new one.
  The Monitoring API is enabled by default for new projects.
- If you already have a Google Cloud project, then ensure that the Monitoring API is enabled:
  1. Go to APIs & services.
  2. Select your project.
  3. Click Enable APIs and Services.
  4. Search for "Monitoring".
  5. In the search results, click Cloud Monitoring API.
  6. If "API enabled" is not displayed, then click Enable.
A Kubernetes cluster. If you don't have a Kubernetes cluster, then follow the instructions in the Quickstart for GKE.

You also need the following command-line tools:

gcloud
kubectl

The gcloud and kubectl tools are part of the Google Cloud CLI. For information about installing them, see Managing Google Cloud CLI components. To see the gcloud CLI components you have installed, run the following command:

gcloud components list

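Configure your environment

To avoid repeatedly entering your project ID or cluster name, perform the following configuration:

Configure the command-line tools as follows:
- Configure the gcloud CLI to refer to the ID of your Google Cloud project: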
```
gcloud config set project PROJECT_ID
```
- Configure the kubectl CLI to use your cluster:
```
kubectl config set-cluster CLUSTER_NAME
```
For more information about these tools, see the following:
- gcloud CLI overview
- kubectl commands

Set up a namespace

Create the NAMESPACE_NAME Kubernetes namespace for resources you create as part of the example application:

kubectl create ns NAMESPACE_NAME

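Verify service account credentials

If your Kubernetes cluster has Workload Identity Federation for GKE enabled, then you can skip this section.

When running on GKE, Managed Service for Prometheus automatically retrieves credentials from the environment based on the Compute Engine default service account. The default service account has the necessary permissions, monitoring.metricWriter and monitoring.viewer, by default. If you don't use Workload Identity Federation for GKE, and you have previously removed either of those roles from the default node service account, then you must re-add the missing permissions before continuing.

If you are not running on GKE, see Provide credentials explicitly.

Configure a service account for Workload Identity Federation for GKE

If your Kubernetes cluster doesn't have Workload Identity Federation for GKE enabled, then you can skip this section.

Managed Service for Prometheus captures metric data by using the Cloud Monitoring API. If your cluster uses Workload Identity Federation for GKE, then you must grant your Kubernetes service account permission to the Monitoring API. This section describes the following:

- Creating a dedicated Google Cloud service account, gmp-test-sa.
- Binding the Google Cloud service account to the default Kubernetes service account in a test namespace, NAMESPACE_NAME.
- Granting the necessary permission to the Google Cloud service account.

Create and bind the service account

This step appears in several places in the Managed Service for Prometheus documentation. If you have already performed this step as part of a prior task, then you don't need to repeat it. Skip ahead to Authorize the service account.

The following command sequence creates the gmp-test-sa service account and binds it to the default Kubernetes service account in the NAMESPACE_NAME namespace: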

gcloud config set project PROJECT_ID \
&&
gcloud iam service-accounts create gmp-test-sa \
&&
gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE_NAME/default]" \
  gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
&&
kubectl annotate serviceaccount \
  --namespace NAMESPACE_NAME \
  default \
  iam.gke.io/gcp-service-account=gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com

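If you are using a different GKE namespace or service account, adjust the commands appropriately.

Authorize the service account

Groups of related permissions are collected into roles, and you grant the roles to a principal, which in this example is the Google Cloud service account. For more information about Monitoring roles, see Access control.

The following command grants the Google Cloud service account, gmp-test-sa, the Monitoring API role it needs to write metric data.

If you have already granted the Google Cloud service account a specific role as part of a prior task, then you don't need to grant it again.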

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/monitoring.metricWriter

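Debug your Workload Identity Federation for GKE configuration

If you are having trouble getting Workload Identity Federation for GKE to work, see the documentation for verifying your Workload Identity Federation for GKE setup and the Workload Identity Federation for GKE troubleshooting guide.

Typos and partial copy-pastes are the most common sources of errors when configuring Workload Identity Federation for GKE. We strongly recommend using the editable variables and clickable copy-paste icons embedded in the code samples in these instructions.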
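
When the copy-paste icons aren't available, one way to reduce typos is to derive each identifier from a few inputs instead of typing every string by hand. The following is a minimal sketch under that assumption. The helper names wi_member and gsa_email are hypothetical (they are not part of gcloud or kubectl), and my-project and my-namespace are placeholder values.

```shell
# Hypothetical helpers that build the identifiers used in the binding
# commands from a few inputs. They only format strings; they don't call
# gcloud or kubectl.

# Prints the Workload Identity member string for a Kubernetes service account.
# Arguments: project ID, namespace, Kubernetes service account name.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Prints the email address of a Google Cloud service account.
# Arguments: service account name, project ID.
gsa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Example with placeholder values; substitute your own.
wi_member "my-project" "my-namespace" "default"
gsa_email "gmp-test-sa" "my-project"
```

The two printed strings have the same shape as the --member value and the service account address in the binding commands, so you can compare them against what you typed.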

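Workload Identity Federation for GKE in a production environment

The example described in this document binds the Google Cloud service account to the default Kubernetes service account and gives the Google Cloud service account all the permissions needed to use the Monitoring API.

In a production environment, you might want to use a finer-grained approach, with a service account for each component, each with minimal permissions. For more information about configuring service accounts for workload-identity management, see Using Workload Identity Federation for GKE.

Set up the OpenTelemetry Collector

This section guides you through setting up and using the OpenTelemetry Collector to scrape metrics from an example application and send the data to Google Cloud Managed Service for Prometheus. For detailed configuration information, see the following sections:

- Scrape Prometheus metrics
- Add processors
- Configure the googlemanagedprometheus exporter

The OpenTelemetry Collector is analogous to the Managed Service for Prometheus agent binary. The OpenTelemetry community regularly publishes releases, including source code, binaries, and container images.

You can deploy these artifacts on VMs or Kubernetes clusters by using the best-practice defaults, or you can use the collector builder to build your own collector consisting of only the components you need. To build a collector for use with Managed Service for Prometheus, you need the following components:

- The Managed Service for Prometheus exporter, which writes your metrics to Managed Service for Prometheus.
- A receiver to scrape your metrics. This document assumes that you are using the OpenTelemetry Prometheus receiver, but the Managed Service for Prometheus exporter is compatible with any OpenTelemetry metrics receiver.
- Processors to batch and mark up your metrics to include important resource identifiers, depending on your environment.

You enable these components by using a configuration file that is passed to the Collector with the --config flag.

The following sections discuss how to configure each of these components in more detail. This document describes how to run the collector on GKE and elsewhere.

Configure and deploy the Collector

Whether you are running your collection on Google Cloud or in another environment, you can configure the OpenTelemetry Collector to export to Managed Service for Prometheus. The biggest difference is in how you configure the Collector. In non-Google Cloud environments, the metric data might need additional formatting to be compatible with Managed Service for Prometheus. On Google Cloud, however, the Collector can detect much of this formatting automatically.

Run the OpenTelemetry Collector on GKE

You can copy the following config into a file called config.yaml to set up the OpenTelemetry Collector on GKE: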

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: 'SCRAPE_JOB_NAME'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
          action: keep
          regex: prom-example
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: (.+):(?:\d+);(\d+)
          replacement: $$1:$$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

  transform:
    # "location", "cluster", "namespace", "job", "instance", and "project_id" are reserved, and
    # metrics containing these labels will be rejected.  Prefix them with exported_ to prevent this.
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_location"], attributes["location"])
      - delete_key(attributes, "location")
      - set(attributes["exported_cluster"], attributes["cluster"])
      - delete_key(attributes, "cluster")
      - set(attributes["exported_namespace"], attributes["namespace"])
      - delete_key(attributes, "namespace")
      - set(attributes["exported_job"], attributes["job"])
      - delete_key(attributes, "job")
      - set(attributes["exported_instance"], attributes["instance"])
      - delete_key(attributes, "instance")
      - set(attributes["exported_project_id"], attributes["project_id"])
      - delete_key(attributes, "project_id")

  batch:
    # batch metrics before sending to reduce API usage
    send_batch_max_size: 200
    send_batch_size: 200
    timeout: 5s

  memory_limiter:
    # drop metrics if memory usage gets too high
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20

# Note that the googlemanagedprometheus exporter block is intentionally blank
exporters:
  googlemanagedprometheus:

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch, memory_limiter, resourcedetection, transform]
      exporters: [googlemanagedprometheus]

The preceding config uses the Prometheus receiver and the Managed Service for Prometheus exporter to scrape the metrics endpoints on Kubernetes Pods and export those metrics to Managed Service for Prometheus. The pipeline processors format and batch the data.

For more details about what each part of this config does, along with configurations for different platforms, see the detailed sections on scraping metrics and adding processors.

When using an existing Prometheus configuration with the OpenTelemetry Collector's prometheus receiver, replace any $ characters with $$ to avoid triggering environment variable substitution. For more information, see Scrape Prometheus metrics.

You can modify this config based on your environment, provider, and the metrics you want to scrape, but the example config is a recommended starting point for running on GKE.

Run the OpenTelemetry Collector outside Google Cloud

Running the OpenTelemetry Collector outside Google Cloud, such as on-premises or on other cloud providers, is similar to running the Collector on GKE. However, the metrics you scrape are less likely to automatically include data that best formats it for Managed Service for Prometheus. Therefore, you must take extra care to configure the collector to format the metrics so they are compatible with Managed Service for Prometheus.

You can copy the following config into a file called config.yaml to set up the OpenTelemetry Collector for deployment on a non-GKE Kubernetes cluster:

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: 'SCRAPE_JOB_NAME'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
          action: keep
          regex: prom-example
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: (.+):(?:\d+);(\d+)
          replacement: $$1:$$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)

processors:
  resource:
    attributes:
    - key: "cluster"
      value: "CLUSTER_NAME"
      action: upsert
    - key: "namespace"
      value: "NAMESPACE_NAME"
      action: upsert
    - key: "location"
      value: "REGION"
      action: upsert

  transform:
    # "location", "cluster", "namespace", "job", "instance", and "project_id" are reserved, and
    # metrics containing these labels will be rejected.  Prefix them with exported_ to prevent this.
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_location"], attributes["location"])
      - delete_key(attributes, "location")
      - set(attributes["exported_cluster"], attributes["cluster"])
      - delete_key(attributes, "cluster")
      - set(attributes["exported_namespace"], attributes["namespace"])
      - delete_key(attributes, "namespace")
      - set(attributes["exported_job"], attributes["job"])
      - delete_key(attributes, "job")
      - set(attributes["exported_instance"], attributes["instance"])
      - delete_key(attributes, "instance")
      - set(attributes["exported_project_id"], attributes["project_id"])
      - delete_key(attributes, "project_id")

  batch:
    # batch metrics before sending to reduce API usage
    send_batch_max_size: 200
    send_batch_size: 200
    timeout: 5s

  memory_limiter:
    # drop metrics if memory usage gets too high
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20

exporters:
  googlemanagedprometheus:
    project: "PROJECT_ID"

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch, memory_limiter, resource, transform]
      exporters: [googlemanagedprometheus]

This config does the following:

Sets up a Kubernetes service discovery scrape config for Prometheus. For more information, see scraping Prometheus metrics.
Manually sets cluster, namespace, and location resource attributes. For more information about resource attributes, including resource detection for Amazon EKS and Azure AKS, see Detect resource attributes.
Sets the project option in the googlemanagedprometheus exporter. For more information about the exporter, see Configure the googlemanagedprometheus exporter.

When using an existing Prometheus configuration with the OpenTelemetry Collector's prometheus receiver, replace any $ characters with $$ to avoid triggering environment variable substitution. For more information, see Scrape Prometheus metrics.

For best practices on configuring the Collector on other clouds, see Amazon EKS or Azure AKS.

Deploy the example application

The example application emits the example_requests_total counter metric and the example_random_numbers histogram metric on its metrics port. The manifest for this example defines three replicas.

To deploy the example application, run the following command:

kubectl -n NAMESPACE_NAME apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.13.0/examples/example-app.yaml

Create your collector config as a ConfigMap

After you have created your config and placed it in a file called config.yaml, use that file to create a Kubernetes ConfigMap. When the collector is deployed, it mounts the ConfigMap and loads the file.

To create a ConfigMap named otel-config with your config, use the following command:

kubectl -n NAMESPACE_NAME create configmap otel-config --from-file config.yaml
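
Before creating the ConfigMap, it can help to confirm that config.yaml contains the top-level sections used by the configs in this document. The following is a rough sketch of such a check. The function name check_collector_config is hypothetical, and the check is a plain text match, so it doesn't validate YAML syntax or Collector semantics.

```shell
# Hypothetical pre-flight check: confirm that a Collector config file
# declares the four top-level sections used in this document.
# This is a text match only; it does not parse YAML.
check_collector_config() {
  config_file="$1"
  for section in receivers processors exporters service; do
    if ! grep -q "^${section}:" "$config_file"; then
      echo "missing top-level section: ${section}" >&2
      return 1
    fi
  done
  echo "ok"
}

# Example use (run from the directory that holds config.yaml):
#   check_collector_config config.yaml
```

If the function prints ok, the file has all four sections; otherwise it names the first missing section on standard error and returns a non-zero status.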

Deploy the collector

Create a file called collector-deployment.yaml with the following content:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: NAMESPACE_NAME:prometheus-test
rules:
- apiGroups: [""]
  resources:
  - pods
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: NAMESPACE_NAME:prometheus-test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: NAMESPACE_NAME:prometheus-test
subjects:
- kind: ServiceAccount
  namespace: NAMESPACE_NAME
  name: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
      - name: otel-collector
        image: otel/opentelemetry-collector-contrib:0.106.0
        args:
        - --config
        - /etc/otel/config.yaml
        - --feature-gates=exporter.googlemanagedprometheus.intToDouble
        volumeMounts:
        - mountPath: /etc/otel/
          name: otel-config
      volumes:
      - name: otel-config
        configMap:
          name: otel-config

Run the following command to create the Collector deployment in your Kubernetes cluster:

kubectl -n NAMESPACE_NAME create -f collector-deployment.yaml

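After the Pod starts, it scrapes the sample application and reports metrics to Managed Service for Prometheus.

For information about the ways you can query your data, see Query using Cloud Monitoring or Query using Grafana.

Provide credentials explicitly

When running on GKE, the OpenTelemetry Collector automatically retrieves credentials from the environment based on the node's service account. In non-GKE Kubernetes clusters, you must provide credentials to the OpenTelemetry Collector explicitly, by using flags or the GOOGLE_APPLICATION_CREDENTIALS environment variable.

Set the context to your target project: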
```
gcloud config set project PROJECT_ID
```
Create a service account:
```
gcloud iam service-accounts create gmp-test-sa
```
This step creates the service account that you might have already created in the Workload Identity Federation for GKE instructions.

Grant the required permissions to the service account:

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/monitoring.metricWriter

Create and download a key for the service account:

gcloud iam service-accounts keys create gmp-test-sa-key.json \
  --iam-account=gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com

Add the key file as a secret to your non-GKE cluster:

kubectl -n NAMESPACE_NAME create secret generic gmp-test-sa \
  --from-file=key.json=gmp-test-sa-key.json

Open the OpenTelemetry Deployment resource for editing:
```
kubectl -n NAMESPACE_NAME edit deployment otel-collector
```

Add the env, volumeMounts, and volumes entries shown in the following excerpt to the resource:

apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: NAMESPACE_NAME
  name: otel-collector
spec:
  template:
    spec:
      containers:
      - name: otel-collector
        env:
        - name: "GOOGLE_APPLICATION_CREDENTIALS"
          value: "/gmp/key.json"
...
        volumeMounts:
        - name: gmp-sa
          mountPath: /gmp
          readOnly: true
...
      volumes:
      - name: gmp-sa
        secret:
          secretName: gmp-test-sa
...

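Save the file and close the editor. After the change is applied, the Pods are re-created and start authenticating to the metric backend with the given service account.

Scrape Prometheus metrics

This section and the subsequent section provide additional customization information for using the OpenTelemetry Collector. This information might be helpful in certain situations, but none of it is necessary to run the example described in Set up the OpenTelemetry Collector.

If your applications are already exposing Prometheus endpoints, then the OpenTelemetry Collector can scrape those endpoints using the same scrape config format you would use with any standard Prometheus config. To do this, enable the Prometheus receiver in your collector config.

A simple Prometheus receiver config for Kubernetes Pods might look like the following: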

receivers:
  prometheus:
    config:
      scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: (.+):(?:\d+);(\d+)
          replacement: $$1:$$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)

service:
  pipelines:
    metrics:
      receivers: [prometheus]

This is a simple service discovery-based scrape config that you can modify as needed to scrape your applications.
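
To see what the __address__ relabel rule in this config does, the following sketch reproduces the rewrite with sed instead of Prometheus. It assumes that Prometheus joins the source_labels values with its default ";" separator before matching. Because sed doesn't support (?:...) or \d, the pattern is translated to the POSIX equivalents. The function name rewrite_address and the address and port values are made up for illustration.

```shell
# Sketch of the __address__ relabel rule using sed (not Prometheus itself).
# Prometheus joins the source_labels values with ";" before matching, so
# the input is "<address>;<port annotation>".
rewrite_address() {
  printf '%s;%s\n' "$1" "$2" |
    sed -E 's/^(.+):[0-9]+;([0-9]+)$/\1:\2/'
}

# The Pod address keeps its host part but takes the annotated port.
rewrite_address "10.0.0.5:8080" "9090"   # prints 10.0.0.5:9090
```

This sketch covers only the matching case. When the port annotation is absent, the pattern doesn't match; as far as I understand it, Prometheus then leaves __address__ unchanged, which this sed version does not reproduce.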

When using an existing Prometheus configuration with the OpenTelemetry Collector's prometheus receiver, replace any $ characters with $$ to avoid triggering environment variable substitution. This is especially important for the replacement value within your relabel_configs section. For example, if you have the following relabel_config section:

- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  regex: (.+):(?:\d+);(\d+)
  replacement: $1:$2
  target_label: __address__

Then rewrite it to be:

- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  regex: (.+):(?:\d+);(\d+)
  replacement: $$1:$$2
  target_label: __address__

For more information, see the OpenTelemetry documentation.
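
If an existing Prometheus config contains many $ characters, the replacement can be scripted. The following is a minimal sketch using sed. The function name escape_dollars is hypothetical, and the sketch assumes the input has not been escaped already and contains no $ that you want the Collector to expand.

```shell
# Doubles every "$" read from standard input so that the Collector
# does not treat it as an environment variable reference.
# Caution: running this twice on the same text doubles the signs again.
escape_dollars() {
  sed 's/\$/$$/g'
}

printf '%s\n' '  replacement: $1:$2' | escape_dollars
# prints:   replacement: $$1:$$2
```

To convert a whole file, you could redirect it through the function, for example escape_dollars < prometheus.yaml > escaped.yaml, where both filenames are placeholders.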


Next, we strongly recommend that you use processors to format your metrics. In
many cases, processors must be used to properly format your metrics.

Add processors

OpenTelemetry
processors 
modify telemetry data before it is exported. You can use the processors below
to ensure that your metrics are written in a format compatible with
Managed Service for Prometheus.

Detect resource attributes

The Managed Service for Prometheus exporter for OpenTelemetry uses the
prometheus_target monitored
resource
to uniquely identify time series data points. The exporter parses the required
monitored-resource fields from resource attributes on the metric data points.
The fields and the attributes from which the values are scraped are:


- project_id: auto-detected by Application Default Credentials, gcp.project.id, or project in exporter config (see configuring the exporter)
- location: location, cloud.availability_zone, cloud.region
- cluster: cluster, k8s.cluster_name
- namespace: namespace, k8s.namespace_name
- job: service.name + service.namespace
- instance: service.instance.id


Failure to set these labels to unique values can result in "duplicate
timeseries" errors when exporting to Managed Service for Prometheus.
Note: The terms labels and attributes, when referring to metric data points,
  represent essentially the same concept in Prometheus and OpenTelemetry,
  respectively. In this context, a Prometheus metric with the label foo will
  be converted into an OpenTelemetry data point with an attribute foo. The
  specific labels/attributes listed above are converted into resource
  attributes, which are another OpenTelemetry concept for identifying data
  points specific to the source of the data. These resource attributes are then
  mapped to the monitored resource fields listed.
The Prometheus receiver automatically sets the service.name attribute
based on the job_name in the scrape config, and service.instance.id
attribute based on the scrape target's instance. The receiver also sets
k8s.namespace.name when using role: pod in the scrape config.

We recommend populating the other attributes automatically using the resource
detection processor.
However, depending on your environment, some attributes might not be automatically
detectable. In this case, you can use other processors to either manually
insert these values or parse them from metric labels. The following sections
illustrate configurations for doing this processing on various platforms.

GKE

When running OpenTelemetry on GKE, you only need to enable the
resource-detection processor to fill out the resource labels. Be sure that your
metrics don't already contain any of the reserved resource labels. If this is
unavoidable, see Avoid resource attribute collisions by renaming
attributes.

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s


This section can be copied directly into your config file, replacing the
processors section if it already exists.

Amazon EKS

The EKS resource detector does not automatically fill in the cluster or
namespace attributes. You can provide these values manually by using
the resource
processor,
as shown in the following example:

processors:
  resourcedetection:
    detectors: [eks]
    timeout: 10s

  resource:
    attributes:
    - key: "cluster"
      value: "my-eks-cluster"
      action: upsert
    - key: "namespace"
      value: "my-app"
      action: upsert


You can also convert these values from metric labels using the groupbyattrs
processor (see move metric labels to resource labels below).

Azure AKS

The AKS resource detector does not automatically fill in the cluster or
namespace attributes. You can provide these values manually by using the
resource
processor,
as shown in the following example:

processors:
  resourcedetection:
    detectors: [aks]
    timeout: 10s

  resource:
    attributes:
    - key: "cluster"
      value: "my-aks-cluster"
      action: upsert
    - key: "namespace"
      value: "my-app"
      action: upsert


You can also convert these values from metric labels by using the groupbyattrs
processor; see Move metric labels to resource labels.

On-premises and non-cloud environments

With on-premises or non-cloud environments, you probably can't
detect any of the necessary resource attributes automatically. In this case, you
can emit these labels in your metrics and move them to resource attributes (see
Move metric labels to resource labels), or manually set all
of the resource attributes as shown in the following example:

processors:
  resource:
    attributes:
    - key: "cluster"
      value: "my-on-prem-cluster"
      action: upsert
    - key: "namespace"
      value: "my-app"
      action: upsert
    - key: "location"
      value: "us-east-1"
      action: upsert


Create your collector config as a ConfigMap describes how
to use the config. That section assumes you have put your config in a file
called config.yaml.

The project_id resource attribute can still be automatically set when running
the Collector with Application Default
Credentials.
If your Collector does not have access to Application Default Credentials, see
Setting project_id.

Alternatively, you can manually set the resource attributes you need in an
environment variable, OTEL_RESOURCE_ATTRIBUTES, with a comma-separated list of
key/value pairs, for example:

export OTEL_RESOURCE_ATTRIBUTES="cluster=my-cluster,namespace=my-app,location=us-east-1"


Then use the env resource detector
processor 
to set the resource attributes:

processors:
  resourcedetection:
    detectors: [env]
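
The comma-separated value can also be assembled from separate variables, which keeps each setting on its own line. The following is a small sketch. The function name join_attributes is hypothetical, and the cluster, namespace, and location values are placeholders.

```shell
# Hypothetical helper: joins key=value arguments with commas, the format
# used for OTEL_RESOURCE_ATTRIBUTES in this section.
join_attributes() {
  result=""
  for pair in "$@"; do
    if [ -z "$result" ]; then
      result="$pair"
    else
      result="$result,$pair"
    fi
  done
  printf '%s\n' "$result"
}

# Placeholder values; substitute your own.
CLUSTER="my-cluster"
NAMESPACE="my-app"
LOCATION="us-east-1"

OTEL_RESOURCE_ATTRIBUTES="$(join_attributes \
  "cluster=$CLUSTER" "namespace=$NAMESPACE" "location=$LOCATION")"
export OTEL_RESOURCE_ATTRIBUTES
echo "$OTEL_RESOURCE_ATTRIBUTES"
```

The variable must be set in the environment of the Collector process itself, for example through the container's env settings, for the env detector to read it.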


Avoid resource attribute collisions by renaming attributes

If your metrics already contain labels that collide with the required
resource attributes (such as location, cluster, or namespace), rename them
to avoid the collision. The Prometheus convention is to add the prefix exported_
to the label name. To add this prefix, use the transform
processor.

The following processors config renames any potential collisions and
resolves any conflicting keys from the metric:

processors:
  transform:
    # "location", "cluster", "namespace", "job", "instance", and "project_id" are reserved, and
    # metrics containing these labels will be rejected.  Prefix them with exported_ to prevent this.
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_location"], attributes["location"])
      - delete_key(attributes, "location")
      - set(attributes["exported_cluster"], attributes["cluster"])
      - delete_key(attributes, "cluster")
      - set(attributes["exported_namespace"], attributes["namespace"])
      - delete_key(attributes, "namespace")
      - set(attributes["exported_job"], attributes["job"])
      - delete_key(attributes, "job")
      - set(attributes["exported_instance"], attributes["instance"])
      - delete_key(attributes, "instance")
      - set(attributes["exported_project_id"], attributes["project_id"])
      - delete_key(attributes, "project_id")
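
The statements follow one pattern per label, so they can be generated rather than typed. The following sketch prints the same pair of statements for each label name passed to it. The function name print_rename_statements is hypothetical, and the indentation matches the statements list in the preceding config.

```shell
# Hypothetical generator: prints the pair of transform statements that
# copies each label to its exported_ counterpart and then deletes the
# original label.
print_rename_statements() {
  for label in "$@"; do
    printf '      - set(attributes["exported_%s"], attributes["%s"])\n' "$label" "$label"
    printf '      - delete_key(attributes, "%s")\n' "$label"
  done
}

print_rename_statements location cluster namespace job instance project_id
```

You could paste the output under the statements key of the transform processor, after checking it against the config shown earlier in this section.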


Move metric labels to resource labels

In some cases, your metrics might be intentionally reporting labels such as
namespace because your exporter is monitoring multiple namespaces. For
example, when running the
kube-state-metrics 
exporter.

In this scenario, these labels can be moved to resource attributes using the
groupbyattrs
processor:

processors:
  groupbyattrs:
    keys:
    - namespace
    - cluster
    - location


In the above example, given a metric with the labels namespace, cluster,
and/or location, those labels will be converted to the matching resource
attributes.

Limit API requests and memory usage

Two other processors, the batch
processor 
and memory limiter
processor 
allow you to limit the resource consumption of your collector.

Batch processing

Batching requests lets you define how many data points to send in a single
request. Note that Cloud Monitoring has a
limit of 200 time series per
request. Enable the batch processor by using the following settings:

processors:
  batch:
    # batch metrics before sending to reduce API usage
    send_batch_max_size: 200
    send_batch_size: 200
    timeout: 5s
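
As a rough illustration of what the batch size means for API usage, the following sketch estimates how many write requests a given number of time series needs. It assumes that every request is filled to the batch size and that each data point maps to one time series, which might not hold in practice. The function name requests_needed is hypothetical.

```shell
# Estimates the number of write requests for a number of time series,
# assuming each request carries at most batch_size series (default 200).
requests_needed() {
  series="$1"
  batch_size="${2:-200}"
  # Integer ceiling division.
  echo $(( (series + batch_size - 1) / batch_size ))
}

requests_needed 1000   # prints 5
requests_needed 1001   # prints 6
```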


Memory limiting

We recommend enabling the memory-limiter processor to prevent your collector
from crashing at times of high throughput. Enable the processing by using
the following settings:

processors:
  memory_limiter:
    # drop metrics if memory usage gets too high
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20
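
To make the percentages concrete, the following sketch converts them into MiB for a given amount of total memory. It rests on my understanding of the memory limiter processor documentation, which is that the hard limit is limit_percentage of total memory and the soft limit is the hard limit minus the spike limit. Treat both the function name memory_limits_mib and that interpretation as assumptions to verify against the processor's own documentation.

```shell
# Converts percentage settings into MiB for a given total memory.
# Assumption: soft limit = hard limit - spike limit.
# Arguments: total memory in MiB, limit_percentage, spike_limit_percentage.
memory_limits_mib() {
  total_mib="$1"
  limit_pct="$2"
  spike_pct="$3"
  hard=$(( total_mib * limit_pct / 100 ))
  spike=$(( total_mib * spike_pct / 100 ))
  echo "hard=${hard} spike=${spike} soft=$(( hard - spike ))"
}

# With 2000 MiB available and the settings shown above:
memory_limits_mib 2000 65 20   # prints hard=1300 spike=400 soft=900
```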


Configure the googlemanagedprometheus exporter

By default, using the googlemanagedprometheus exporter on GKE
requires no additional configuration. For many use cases you only need to enable
it with an empty block in the exporters section:

exporters:
  googlemanagedprometheus:


However, the exporter does provide some optional configuration settings. The
following sections describe the other configuration settings.

Setting project_id

To associate your time series with a Google Cloud project, the
prometheus_target monitored resource must have project_id set.

When running OpenTelemetry on Google Cloud, the
Managed Service for Prometheus exporter defaults to setting this value
based on the Application Default
Credentials
it finds. If no credentials are available, or you want to override the default
project, you have two options:


- Set project in the exporter config.
- Add a gcp.project.id resource attribute to your metrics.

We strongly recommend using the default (unset) value for project_id rather
than explicitly setting it, when possible.

Note: When changing the project_id, the Collector's Service Account must have
  the roles/monitoring.metricWriter IAM role for the destination
  project.

Set project in the exporter config

The following config excerpt sends metrics to
Managed Service for Prometheus in the Google Cloud project MY_PROJECT:

receivers:
  prometheus:
    config:
    ...

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

exporters:
  googlemanagedprometheus:
    project: MY_PROJECT

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection]
      exporters: [googlemanagedprometheus]


The only change from previous examples is the new line project: MY_PROJECT.
This setting is useful if you know that every metric coming through this
Collector should be sent to MY_PROJECT.

Set gcp.project.id resource attribute

You can set project association on a per-metric basis by adding a
gcp.project.id resource attribute to your metrics. Set the value of the
attribute to the name of the project the metric should be associated with.

For example, if your metric already has a label project, this label can be
moved to a resource attribute and renamed to gcp.project.id by using
processors in the Collector config, as shown in the following example:

receivers:
  prometheus:
    config:
    ...

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

  groupbyattrs:
    keys:
    - project

  resource:
    attributes:
    - key: "gcp.project.id"
      from_attribute: "project"
      action: upsert

exporters:
  googlemanagedprometheus:

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection, groupbyattrs, resource]
      exporters: [googlemanagedprometheus]


Setting client options

The googlemanagedprometheus exporter uses gRPC clients for
Managed Service for Prometheus. Therefore, optional settings
are available for configuring the gRPC client:


- compression: Enables gzip compression for gRPC requests, which is useful for
  minimizing data transfer fees when sending data from other clouds to
  Managed Service for Prometheus (valid values: gzip).
- user_agent: Overrides the user-agent string sent on requests to
  Cloud Monitoring; only applies to metrics.
  Defaults to the build and version number of your OpenTelemetry Collector,
  for example, opentelemetry-collector-contrib 0.106.0.
- endpoint: Sets the endpoint to which metric data is going to be sent.
- use_insecure: If true, uses gRPC as the communication transport. Has an
  effect only when the endpoint value is not "".
- grpc_pool_size: Sets the size of the connection pool in the gRPC client.
- prefix: Configures the prefix of metrics sent to
  Managed Service for Prometheus. Defaults to
  prometheus.googleapis.com.
  Don't change this prefix; doing so causes metrics to not be
  queryable with PromQL in the Cloud Monitoring UI.


In most cases, you don't need to change these values from their
defaults. However, you can change them to accommodate special
circumstances.

All of these settings are set under a metric block in the
googlemanagedprometheus exporter section, as shown in the following example:

receivers:
  prometheus:
    config:
    ...

processors:
  resourcedetection:
    detectors: [gcp]
    timeout: 10s

exporters:
  googlemanagedprometheus:
    metric:
      compression: gzip
      user_agent: opentelemetry-collector-contrib 0.106.0
      endpoint: ""
      use_insecure: false
      grpc_pool_size: 1
      prefix: prometheus.googleapis.com

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection]
      exporters: [googlemanagedprometheus]
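
In practice you might override only one of these options. As a minimal sketch, assuming a Collector that runs outside Google Cloud and that every other client option should keep its default, the exporter section could enable only gzip compression:

```yaml
exporters:
  googlemanagedprometheus:
    metric:
      # Compress gRPC requests; gzip is the valid value listed above.
      compression: gzip
```

The rest of the config (receivers, processors, and the service pipeline) stays as in the preceding example.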


What's next

Use PromQL in Cloud Monitoring to query Prometheus metrics.
Use Grafana to query Prometheus metrics.
Set up the OpenTelemetry Collector as a sidecar agent in Cloud Run.

The Cloud Monitoring Metrics Management page provides information
that can help you control the amount you spend on billable metrics
without affecting observability. The Metrics Management page reports the
following information:

  Ingestion volumes for both byte- and sample-based billing, across metric
    domains and for individual metrics.
  Data about labels and cardinality of metrics.
  Number of reads for each metric.
  Use of metrics in alerting policies and custom dashboards.
  Rate of metric-write errors.

You can also use the Metrics Management page to exclude unneeded metrics,
eliminating the cost of ingesting them.

For more information about the Metrics Management page, see
View and manage metric usage.
