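For target discovery, the Managed Service for Prometheus Operator requires a PodMonitoring resource that corresponds to the etcd exporter and is in the same namespace.
You can use the following PodMonitoring configuration: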
```yaml
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: etcd
  labels:
    app.kubernetes.io/name: etcd
    app.kubernetes.io/part-of: google-cloud-managed-prometheus
spec:
  endpoints:
  - port: 2379
    scheme: http
    interval: 30s
    path: /metrics
  selector:
    matchLabels:
      app.kubernetes.io/name: etcd
```
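Ensure that the values of the `port` and `matchLabels` fields match those of the etcd pods you want to monitor. An etcd deployment installed from this [helm chart](https://artifacthub.io/packages/helm/bitnami/etcd) carries the `app.kubernetes.io/name: etcd` label and exposes a port named `client`.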
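If your pods expose the metrics endpoint through a named port, the configuration can reference that name rather than the number. The following is a minimal sketch, assuming that the `port` field accepts a port name as well as a number and that your pods name the port `client` as the chart mentioned above does; check both against your own deployment before you rely on it.

```yaml
# Sketch only: same PodMonitoring as above, but scraping by port name.
# Assumption: the etcd container declares a port named "client".
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: etcd
  labels:
    app.kubernetes.io/name: etcd
    app.kubernetes.io/part-of: google-cloud-managed-prometheus
spec:
  endpoints:
  - port: client        # port name instead of 2379
    scheme: http
    interval: 30s
    path: /metrics
  selector:
    matchLabels:
      app.kubernetes.io/name: etcd   # must match the labels on your etcd pods
```

Referring to the port by name keeps the resource valid if the port number changes, at the cost of depending on the naming used by the pod spec.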
You can use the following `Rules` configuration to define alerts on your etcd metrics:

```yaml
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: monitoring.googleapis.com/v1
kind: Rules
metadata:
  name: etcd-rules
  labels:
    app.kubernetes.io/component: rules
    app.kubernetes.io/name: etcd-rules
    app.kubernetes.io/part-of: google-cloud-managed-prometheus
spec:
  groups:
  - name: etcd
    interval: 30s
    rules:
    - alert: EtcdLongFsyncDuration
      annotations:
        description: |-
          Etcd long fsync duration
            VALUE = {{ $value }}
            LABELS: {{ $labels }}
        summary: Etcd long fsync duration (instance {{ $labels.instance }})
      expr: histogram_quantile(0.9, rate(etcd_disk_wal_fsync_duration_seconds_bucket[10m])) > 0.1
      for: 5m
      labels:
        severity: critical
    - alert: EtcdRapidLeaderChanges
      annotations:
        description: |-
          Etcd rapid leader changes
            VALUE = {{ $value }}
            LABELS: {{ $labels }}
        summary: Etcd rapid leader changes (instance {{ $labels.instance }})
      expr: etcd_server_leader_changes_seen_total >= 0.05
      for: 5m
      labels:
        severity: critical
```

To apply configuration changes from a local file, run `kubectl apply -n NAMESPACE_NAME -f FILE_NAME`. You can adjust the alert thresholds to suit your application.
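The `EtcdRapidLeaderChanges` rule compares the raw value of `etcd_server_leader_changes_seen_total`, which is a cumulative counter, so it keeps firing once any leader change has been recorded. If you would rather alert only on recent changes, one possible variation evaluates the increase over a time window. The sketch below is an illustration, not part of the provided configuration: the resource name, alert name, 10-minute window, threshold of 2, and `warning` severity are all assumptions to tune for your cluster.

```yaml
# Sketch only: a windowed variant of the leader-change alert.
# All names and thresholds here are hypothetical.
apiVersion: monitoring.googleapis.com/v1
kind: Rules
metadata:
  name: etcd-rules-leader-changes
  labels:
    app.kubernetes.io/component: rules
    app.kubernetes.io/part-of: google-cloud-managed-prometheus
spec:
  groups:
  - name: etcd-leader-changes
    interval: 30s
    rules:
    - alert: EtcdRapidLeaderChangesWindowed
      annotations:
        summary: Etcd rapid leader changes (instance {{ $labels.instance }})
      # Counts leader changes seen during the last 10 minutes only.
      expr: increase(etcd_server_leader_changes_seen_total[10m]) > 2
      for: 5m
      labels:
        severity: warning
```

Because this is a separate `Rules` resource, it can be applied alongside the configuration above and removed without affecting it.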
Last updated (UTC): 2025-07-31.