# Evaluation of rules and alerts with self-deployed collection
This document describes a configuration for rule and alert evaluation
in a Managed Service for Prometheus deployment that uses
self-deployed collection.
The following diagram illustrates a deployment that uses multiple clusters
in two Google Cloud projects and uses both rule and alert evaluation:
To set up and use a deployment like the one in the diagram, note the following:
- Rules are installed within each Managed Service for Prometheus collection
  server, just as they are when using standard Prometheus. Rule evaluation
  executes against the data stored locally on each server. Servers are
  configured to retain data long enough to cover the lookback period of all
  rules, which is typically no more than 1 hour. Rule results are written to
  Monarch after evaluation. (A sample rule file follows this list.)
- A Prometheus AlertManager instance is manually deployed in every
  cluster. Prometheus servers are configured by editing the
  `alertmanager_config` field of the configuration file to send
  fired alerting rules to their local AlertManager instance (see the
  configuration sketch after this list). Silences, acknowledgements,
  and incident-management workflows are typically
  handled in a third-party tool such as PagerDuty.
  You can centralize alert management across multiple clusters into a
  single AlertManager by using a Kubernetes Endpoints resource, as sketched
  after this list.
- One single cluster running inside Google Cloud is designated as the
  global rule evaluation cluster for a metrics scope. The standalone rule
  evaluator is deployed in that cluster, and rules are installed using the
  standard Prometheus rule-file format.
  The standalone rule evaluator is configured to use scoping_project_A,
  which contains Projects 1 and 2. Rules executed against scoping_project_A
  automatically fan out to Projects 1 and 2. The underlying service account
  must be given the Monitoring Viewer permissions for scoping_project_A.

  The rule evaluator is configured to send alerts to the local Prometheus
  AlertManager by using the `alertmanager_config` field of the configuration
  file.
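As an illustration of the first point, here is a minimal rule file in the
standard Prometheus rule-file format. This is a sketch only: the group name,
the metric `http_requests_total`, and the thresholds are placeholders, not
values taken from this deployment.

```yaml
# rules.yaml: a standard Prometheus rule file, loaded by each collection
# server through the rule_files field of its configuration.
groups:
  - name: example-rules        # hypothetical group name
    interval: 30s              # evaluation interval
    rules:
      # Recording rule: evaluated against locally stored data; the result
      # is written to Monarch after evaluation.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
      # Alerting rule: fired alerts are forwarded to the local AlertManager.
      - alert: HighErrorRate
        expr: sum by (job) (rate(http_requests_total{code=~"5.."}[5m])) > 1
        for: 10m
        labels:
          severity: page
```

Because evaluation runs against local data, every range selector in these
expressions (here, `[5m]`) must fit inside the server's retention window.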
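For the per-cluster AlertManager, the relevant fragment of a standard
Prometheus server configuration looks like the following sketch. The Service
name `alertmanager` and the namespace `monitoring` are assumptions for
illustration; substitute the DNS name of your own deployment.

```yaml
# Fragment of the Prometheus server configuration (prometheus.yml):
# forward fired alerts to the in-cluster AlertManager.
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Hypothetical in-cluster AlertManager Service.
            - alertmanager.monitoring.svc:9093
```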
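To centralize alerting, one common pattern is a selector-less Service backed
by a manually managed Endpoints object, so that Prometheus servers in each
cluster can target a single central AlertManager under a stable name. The
names and IP address below are hypothetical:

```yaml
# A selector-less Service plus a manually managed Endpoints object.
apiVersion: v1
kind: Service
metadata:
  name: central-alertmanager   # hypothetical name
  namespace: monitoring
spec:
  ports:
    - port: 9093
      targetPort: 9093
---
apiVersion: v1
kind: Endpoints
metadata:
  name: central-alertmanager   # must match the Service name
  namespace: monitoring
subsets:
  - addresses:
      - ip: 10.0.0.10          # hypothetical address of the central AlertManager
    ports:
      - port: 9093
```

With this in place, each server's `alerting` block targets
`central-alertmanager.monitoring.svc:9093` instead of a per-cluster instance.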
Using a self-deployed global rule evaluator can have unexpected effects,
depending on whether you preserve or aggregate the `project_id`, `location`,
`cluster`, and `namespace` labels in your rules:
- If your rules preserve the `project_id` label (by using a
  `by(project_id)` clause), then rule results are written back to
  Monarch using the original `project_id` value of the underlying
  time series.
  In this scenario, you must ensure that the underlying service account has
  the Monitoring Metric Writer permissions for each monitored project in
  scoping_project_A. If you add a new monitored project to scoping_project_A,
  then you must also manually add a new permission to the service account.
- If your rules do not preserve the `project_id` label (by not using a
  `by(project_id)` clause), then rule results are written back to
  Monarch using the `project_id` value of the cluster where the global
  rule evaluator is running.

  In this scenario, you do not need to further modify the underlying
  service account.
- If your rules preserve the `location` label (by using a `by(location)`
  clause), then rule results are written back to Monarch using each original
  Google Cloud region from which the underlying time series originated.
  If your rules do not preserve the `location` label, then data is written
  back to the location of the cluster where the global rule evaluator
  is running.
We strongly recommend preserving the `cluster` and `namespace` labels in rule
evaluation results whenever possible; otherwise, query performance might
decline and you might encounter cardinality limits. Removing both labels is
strongly discouraged. A sketch of a rule that preserves these labels follows.
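As a sketch of this advice, the recording rule below aggregates across time
series while keeping all four labels; the metric and record names are
placeholders, not names from this deployment.

```yaml
groups:
  - name: fanout-rules                 # hypothetical group name
    rules:
      # Keeping project_id and location means results are written back to
      # each time series' original project and region; keeping cluster and
      # namespace avoids the query-performance and cardinality problems
      # described above.
      - record: namespace:http_requests:rate5m
        expr: >
          sum by (project_id, location, cluster, namespace)
          (rate(http_requests_total[5m]))
```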
[[["Mudah dipahami","easyToUnderstand","thumb-up"],["Memecahkan masalah saya","solvedMyProblem","thumb-up"],["Lainnya","otherUp","thumb-up"]],[["Sulit dipahami","hardToUnderstand","thumb-down"],["Informasi atau kode contoh salah","incorrectInformationOrSampleCode","thumb-down"],["Informasi/contoh yang saya butuhkan tidak ada","missingTheInformationSamplesINeed","thumb-down"],["Masalah terjemahan","translationIssue","thumb-down"],["Lainnya","otherDown","thumb-down"]],["Terakhir diperbarui pada 2025-09-04 UTC."],[],[],null,["# Evaluation of rules and alerts with self-deployed collection\n\nThis document describes a configuration for rule and alert evaluation\nin a Managed Service for Prometheus deployment that uses\n[self-deployed collection](/stackdriver/docs/managed-prometheus/setup-unmanaged).\n\nThe following diagram illustrates a deployment that uses multiple clusters\nin two Google Cloud projects and uses both rule and alert evaluation:\n\nTo set up and use a deployment like the one in the diagram, note the\nfollowing:\n\n- Rules are installed within each Managed Service for Prometheus collection\n server, just as they are when using standard Prometheus. Rule evaluation\n executes against the data stored locally on each server. Servers are\n configured to retain data long enough to cover the lookback period of all\n rules, which is typically no more than 1 hour. Rule results are written\n to Monarch after evaluation.\n\n- A Prometheus AlertManager instance is manually deployed in every single\n cluster. Prometheus servers are configured by [editing the\n `alertmanager_config` field of the configuration file](/stackdriver/docs/managed-prometheus/rules-unmanaged#eval-rules-unmanaged) to send\n fired alerting rules to their local AlertManager instance. Silences,\n acknowledgements, and incident management workflows are typically\n handled in a third-party tool such as PagerDuty.\n\n You can centralize alert management across multiple clusters into a\n single AlertManager by using a Kubernetes [Endpoints resource](/stackdriver/docs/managed-prometheus/rules-unmanaged#eval-rules-unmanaged).\n- One single cluster running inside Google Cloud is designated as the\n global rule evaluation cluster for a metrics scope. The [standalone\n rule evaluator](/stackdriver/docs/managed-prometheus/rules-unmanaged#eval-rules-unmanaged) is deployed in that cluster and rules are\n installed using the standard Prometheus rule-file format.\n\n The standalone rule evaluator is configured to use scoping_project_A,\n which contains Projects 1 and 2. Rules executed against scoping_project_A\n automatically fan out to Projects 1 and 2. 
The underlying service account\n must be given the [Monitoring Viewer](/monitoring/access-control#mon_roles_desc) permissions\n for scoping_project_A.\n\n The rule evaluator is configured to send alerts to the local Prometheus\n Alertmanager by using the [`alertmanager_config` field of the configuration\n file](/stackdriver/docs/managed-prometheus/rules-unmanaged#eval-rules-unmanaged).\n\nUsing a self-deployed global rule evaluator may have unexpected\neffects, depending on whether you preserve or aggregate the `project_id`,\n`location`, `cluster`, and `namespace` labels in your rules:\n\n- If your rules preserve the `project_id` label (by using\n a `by(project_id)` clause), then rule results are written back to\n Monarch using the original `project_id` value of the underlying\n time series.\n\n In this scenario, you need to ensure the underlying service account\n has the [Monitoring Metric Writer](/monitoring/access-control#mon_roles_desc) permissions for each\n monitored project in scoping_project_A. If you add a new\n monitored project to scoping_project_A, then you must also manually\n add a new permission to the service account.\n- If your rules do not preserve the `project_id` label (by not using\n a `by(project_id)` clause), then rule results are written back to\n Monarch using the `project_id` value of the cluster where the\n global rule evaluator is running.\n\n In this scenario, you do not need to further modify the underlying\n service account.\n- If your rules preserve the `location` label (by using a `by(location)`\n clause), then rule results are written back to Monarch\n using each original Google Cloud region from which the underlying\n time series originated.\n\n If your rules do not preserve the `location` label, then data is written\n back to the location of the cluster where the global rule evaluator\n is running.\n\nWe strongly recommend preserving the `cluster` and `namespace` labels\nin rule evaluation results whenever possible. Otherwise, query performance\nmight decline and you might encounter cardinality limits. Removing both\nlabels is strongly discouraged."]]