This document describes how to configure your Google Kubernetes Engine deployment so that you can use Google Cloud Managed Service for Prometheus to collect metrics from vLLM. This document shows you how to do the following:

- Set up vLLM to report metrics.
- Configure a PodMonitoring resource for Managed Service for Prometheus to collect the exported metrics.
- Access a dashboard in Cloud Monitoring to view the metrics.
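
These instructions apply only if you are using [managed collection](/stackdriver/docs/managed-prometheus/setup-managed) with Managed Service for Prometheus.
If you are using self-deployed collection, then see the [vLLM documentation](https://docs.vllm.ai/en/stable/serving/metrics.html) for installation information.

These instructions are provided as an example and are expected to work in most Kubernetes environments.
If you are having trouble installing an application or exporter due to restrictive security or organizational policies, then we recommend that you consult the open-source documentation for support.

For information about vLLM, see [vLLM](https://docs.vllm.ai/en/latest/).
For information about setting up vLLM on Google Kubernetes Engine, see the GKE [guide for vLLM](/kubernetes-engine/docs/tutorials/serve-gemma-gpu-vllm).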
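
Prerequisites
-------------

To collect metrics from vLLM by using Managed Service for Prometheus and managed collection, your deployment must meet the following requirements:

- Your cluster must be running Google Kubernetes Engine version 1.21.4-gke.300 or later.
- You must be running Managed Service for Prometheus with managed collection enabled. For more information, see [Get started with managed collection](/stackdriver/docs/managed-prometheus/setup-managed).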
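
vLLM exposes Prometheus-format metrics automatically, so you do not have to install a separate exporter. To verify that vLLM is emitting metrics on the expected endpoint, do the following:

1. Set up port forwarding by running `kubectl -n NAMESPACE_NAME port-forward POD_NAME 8000`.
2. Access the endpoint `localhost:8000/metrics` by using the browser or the `curl` utility in another terminal session.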
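
As a rough illustration of the check above, the following Python sketch lists the vLLM metric names found on the metrics page. It assumes that `localhost:8000` reaches the vLLM pod (for example, through `kubectl port-forward`) and that vLLM metric names start with the `vllm:` prefix. The helper names `metric_names` and `fetch_metrics` are hypothetical and are not part of vLLM or Managed Service for Prometheus.

```python
import urllib.request


def metric_names(exposition_text, prefix="vllm:"):
    """Return the sorted, de-duplicated metric names that start with prefix."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="value"} 1.0
        # The metric name ends at the first "{" or at the first whitespace.
        head = line.split("{", 1)[0].split(None, 1)
        if head and head[0].startswith(prefix):
            names.add(head[0])
    return sorted(names)


def fetch_metrics(url="http://localhost:8000/metrics", timeout=5):
    """Fetch the raw metrics page. The default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the port forward running, `print(metric_names(fetch_metrics()))` prints the names it finds. An empty list would suggest that the endpoint is reachable but is not serving vLLM metrics under the assumed prefix.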
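
Define a PodMonitoring resource
-------------------------------

For target discovery, the Managed Service for Prometheus Operator requires a PodMonitoring resource that corresponds to vLLM in the same namespace.

You can use the following PodMonitoring configuration: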

```
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: vllm
  labels:
    app.kubernetes.io/name: vllm
    app.kubernetes.io/part-of: google-cloud-managed-prometheus
spec:
  endpoints:
  - port: 8000
    scheme: http
    interval: 30s
    path: /metrics
  selector:
    matchLabels:
      app: vllm-gemma-server
```

Ensure that the values of the `port` and `matchLabels` fields match those of the vLLM pods you want to monitor.
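
The label check described above amounts to a subset test: a pod is selected when every key-value pair under `matchLabels` also appears in the pod's labels. The Python sketch below illustrates that rule only. The function names and sample labels are hypothetical, and the code does not call the Kubernetes API.

```python
def selector_matches(match_labels, pod_labels):
    """Return True when every matchLabels pair is present in the pod's labels."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


def unmatched_labels(match_labels, pod_labels):
    """Return the matchLabels pairs that the pod lacks or sets to another value."""
    return {
        key: value
        for key, value in match_labels.items()
        if pod_labels.get(key) != value
    }
```

For example, comparing the selector `{"app": "vllm-gemma-server"}` against the labels of a running pod with `unmatched_labels` shows which pairs would keep the PodMonitoring resource from selecting that pod.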

To apply configuration changes from a local file, run the following command:

```
kubectl apply -n NAMESPACE_NAME -f FILE_NAME
```

You can also [use Terraform](/stackdriver/docs/managed-prometheus/setup-managed#terraform-scrape) to manage your configurations.

Verify the configuration
------------------------

You can use Metrics Explorer to verify that you correctly configured vLLM. It might take one or two minutes for Cloud Monitoring to ingest your metrics.

To verify the metrics are ingested, do the following:

1. In the Google Cloud console, go to the **Metrics explorer** page:

   [Go to **Metrics explorer**](https://console.cloud.google.com/monitoring/metrics-explorer)

   If you use the search bar to find this page, then select the result whose subheading is **Monitoring**.
2. In the toolbar of the query-builder pane, select the button whose name is either **MQL** or **PromQL**.
3. Verify that **PromQL** is selected in the **Language** toggle. The language toggle is in the same toolbar that lets you format your query.
4. Enter and run the following query:

   ```
   up{job="vllm", cluster="CLUSTER_NAME", namespace="NAMESPACE_NAME"}
   ```

View dashboards
---------------

The Cloud Monitoring integration includes the **vLLM Prometheus Overview** dashboard.

Dashboards are automatically installed when you configure the integration. You can also view static previews of dashboards without installing the integration.

To view an installed dashboard, do the following:

1. In the Google Cloud console, go to the **Dashboards** page:

   [Go to **Dashboards**](https://console.cloud.google.com/monitoring/dashboards)

   If you use the search bar to find this page, then select the result whose subheading is **Monitoring**.
2. Select the **Dashboard List** tab.
3. Choose the **Integrations** category.
4. Click the name of the dashboard, for example, **vLLM Prometheus Overview**.

To view a static preview of the dashboard, do the following:

1. In the Google Cloud console, go to the **Integrations** page:

   [Go to **Integrations**](https://console.cloud.google.com/monitoring/integrations)

   If you use the search bar to find this page, then select the result whose subheading is **Monitoring**.
2. Click the **Kubernetes Engine** deployment-platform filter.
3. Locate the vLLM integration and click **View Details**.
4. Select the **Dashboards** tab.

Troubleshooting
---------------

For information about troubleshooting metric ingestion problems, see [Problems with collection from exporters](/stackdriver/docs/managed-prometheus/troubleshooting#exporter-problems) in [Troubleshooting ingestion-side problems](/stackdriver/docs/managed-prometheus/troubleshooting#ingest-problems).

Last updated: 2025-07-15 (UTC)