Create alert rules from a project

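In the GDC console, create a group that contains metric-based or log-based alert rules for your project. Metric rules send alerts based on metrics data, and logging rules send alerts based on logs data. You must enter a query language expression that determines whether the alert must move to the pending state. You can also add optional values such as labels and annotations.

Labels let you differentiate the characteristics of an alert as a map of key-value pairs. Use labels to add or overwrite information such as the severity level (error, critical, warning, or info), the alert code, and a short name that identifies the resource.

Annotations, on the other hand, let you add non-identifying metadata to alerts. For example, you can add the message and expression values that are displayed in the fields of the user interface (UI), or a runbook URL to help resolve the issue.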
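
As a minimal sketch of that split, the fragment below shows how the labels and annotations of a single alert rule might look when written as key-value pairs. The label keys (severity, code, resource) and annotation keys (message, expression, runbookurl) are the ones this page names. Every value, including the alert code, the metric in the expression, and the runbook URL, is a hypothetical placeholder.

```yaml
# Fragment of one alert rule. All values are illustrative assumptions.
labels:
  severity: warning          # one of: error, critical, warning, info
  code: "EX-001"             # hypothetical alert code
  resource: DHCP             # short name that identifies the related resource
annotations:
  message: "DHCP lease pool is almost exhausted"       # hypothetical message shown in the UI
  expression: "dhcp_leases_free < 10"                  # hypothetical expression shown in the UI
  runbookurl: "https://example.com/runbooks/EX-001"    # hypothetical runbook URL
```

Labels carry the identifying information used to classify the alert, while annotations only add context for whoever responds to it.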

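Alternatively, you can create alert rules with the Observability API, interacting directly with custom resources and applying changes in your project namespace.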

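Before you begin

Before you continue, make sure that you have the following required permissions:

Create alert rules based on logs

To get the permissions that you need to create or view log-based alert rules, ask your Project IAM Admin to grant you one of the following roles in your project namespace:

Logging Rule Creator: creates LoggingRule custom resources. Request the Logging Rule Creator (loggingrule-creator) role.
Logging Rule Editor: edits or modifies LoggingRule custom resources. Request the Logging Rule Editor (loggingrule-editor) role.
Logging Rule Viewer: views LoggingRule custom resources. Request the Logging Rule Viewer (loggingrule-viewer) role.

Create alert rules based on metrics

To get the permissions that you need to create or view metric-based alert rules, ask your Project IAM Admin to grant you one of the following roles in your project namespace:

Monitoring Rule Editor: edits or modifies MonitoringRule custom resources. Request the Monitoring Rule Editor (monitoringrule-editor) role.
Monitoring Rule Viewer: views MonitoringRule custom resources. Request the Monitoring Rule Viewer (monitoringrule-viewer) role.

For more information about role assignments, see the predefined role descriptions.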

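Create rules

You can create alert rules by using the GDC console (the recommended method) or by deploying custom resources in your project namespace with the Observability API.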

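Console

Follow these steps to create alert rules from the GDC console:

In the GDC console, select a project.
In the navigation menu, click "Operations" > "Alerting".
Click the "Alerting Policy" tab.
Click "Create Rule Group".
Select whether to create a group of "Metrics" or "Logging" rules. Metric rules send alerts based on system monitoring data, and logging rules send alerts based on system logging data.
In the "Alert Rule Group Name" field, enter a name for the group.
In the "Rule Evaluation Interval" field, enter the number of seconds for each interval.
In the "Limit" field, enter the maximum number of alerts. Enter 0 for an unlimited number of alerts.
In the "Alert Rules" section, click "Create Alert Rule".
Enter a name for the alert rule.
Enter an expression for the alert rule:
- For system logging rules, enter a LogQL (Log Query Language) expression.
- For system monitoring rules, enter a PromQL (Prometheus Query Language) expression.
The expression must evaluate to a true or false statement that determines whether the alert must move to the pending state.

Note: An alert moves to the pending state the first time the alert rule condition occurs. If the condition remains true for the duration that you define in the next step, the alert moves to the open state. At that point, the observability system sends the alert.
In the "Duration" field, enter the number of seconds that an active alert must remain in the pending state before it moves to the open state. Note: If you set the duration to 0, the observability system sends the alert as soon as the condition is met.
In the "Severity" field, choose a severity level, such as "Error" or "Warning".
Enter a short name that identifies the related resource, such as AIS or DHCP.
Enter an alert code that identifies the alert.
Enter a runbook URL or information that helps resolve the issue.
Enter an alert message or description.
Optional: Click "Add Label" to add labels as key-value pairs.
Optional: Click "Add Annotation" to add annotations as key-value pairs.
Click "Save" to create the rule.
Click "Create" to create the rule group. The rule group appears in the "Alert Rule Group" list.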

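API

You can create system monitoring and logging rules with the Observability API by deploying custom resources in GDC. A MonitoringRule or LoggingRule custom resource contains one or more queries and expressions that form a condition, the evaluation frequency, and, optionally, the duration for which the condition must hold.

Follow these steps to create alert rules by deploying a custom resource in your project namespace:

Create a YAML file for the custom resource, using the MonitoringRule or LoggingRule template later on this page:
- To create system monitoring rules that send alerts based on metrics data, use the MonitoringRule custom resource.
- To create system logging rules that send alerts based on logs data, use the LoggingRule custom resource.
In the namespace field of the custom resource, enter your project namespace.
In the name field, enter a name for the alert rule configuration.
Optional: If you are configuring a LoggingRule custom resource for logging rules, you can choose the log source for alerts in the source field. For example, enter a value such as operational or audit.
In the interval field, enter the rule evaluation interval in seconds (for example, 60s, as in the templates).
Optional: In the limit field, enter the maximum number of alerts. Enter 0 for an unlimited number of alerts.
Optional: To precompute metrics with recording rules, enter the following information in the recordRules field:
- In the record field, enter the record name. This value defines the time series that the recording rule writes to, and it must be a valid metric name.
- In the expr field, enter an expression for the recording rule:
  - For system logging rules, enter a LogQL (Log Query Language) expression.
  - For system monitoring rules, enter a PromQL (Prometheus Query Language) expression.
  The expression must resolve to a numeric value so that it can be recorded as a new metric.
- Optional: In the labels field, define the labels to add or overwrite as key-value pairs.
In the alertRules field, enter the following information to configure the alert rules:
- In the alert field, enter the alert name.
- In the expr field, enter an expression for the alert rule:
  - For system logging rules, enter a LogQL expression.
  - For system monitoring rules, enter a PromQL expression.
  The expression must evaluate to a true or false statement that determines whether the alert must move to the pending state.
- Optional: In the for field, enter the length of time, in seconds, for which the condition must be met before the alert moves from the pending state to the open state. If you don't specify a value, the default duration is 0 seconds.

  Note: An alert moves to the pending state the first time the alert rule condition occurs. If the condition remains true for the duration in the for field, the alert moves to the open state. At that point, the observability system sends the alert. If you set the duration to 0, the observability system sends the alert as soon as the condition is met.
- In the labels field, define the labels to add or overwrite as key-value pairs. The following labels are required:
  - severity: choose a severity level, such as error, critical, warning, or info.
  - code: enter an alert code that identifies the alert.
  - resource: enter a short name that identifies the related resource, such as AIS or DHCP.
- Optional: In the annotations field, add annotations as key-value pairs.
Save the YAML file of the custom resource.
Deploy the custom resource to the project namespace of the admin cluster to create the alert rules.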
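
The steps above can be sketched as a filled-in MonitoringRule that sets only the required fields plus the for duration. This is an illustration under stated assumptions, not a recommended configuration: the namespace my-project, the resource name, the metric my_app_request_errors_total, the threshold, and the alert code are all hypothetical, and the expression assumes that a counter with that name exists in your project.

```yaml
apiVersion: monitoring.gdc.goog/v1
kind: MonitoringRule
metadata:
  namespace: my-project            # hypothetical project namespace
  name: my-app-error-alerts        # hypothetical name for this rule configuration
spec:
  # Evaluate the rules every 60 seconds
  interval: 60s
  # 0 means no limit on the number of alerts
  limit: 0
  alertRules:
  - alert: MyAppHighErrorRate
    # The comparison makes the expression a true/false condition.
    # my_app_request_errors_total is a hypothetical counter metric.
    expr: 'rate(my_app_request_errors_total[5m]) > 0.5'
    # The condition must hold for 300 seconds before the alert
    # moves from pending to open
    for: 300s
    labels:
      severity: error
      code: "MYAPP-001"            # hypothetical alert code
      resource: my-app             # hypothetical short resource name
    annotations:
      message: "my-app error rate stayed above 0.5 errors per second"
```

With for set to 300s, a brief spike that clears within five minutes leaves the alert in the pending state and no alert is sent. If you deploy with kubectl, one possible way is to run kubectl apply -f on the saved file, assuming that you have a kubeconfig for the admin cluster.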

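Configure system logging and monitoring rules from custom resources

This section contains the YAML templates that you must use to deploy custom resources that create alert rules. If you create alerts from the GDC console, you can skip this section.

`MonitoringRule` custom resource

To create system monitoring rules, you must create a MonitoringRule custom resource. A MonitoringRule contains recording rules and alert rules that describe the conditions for sending an alert.

The following YAML file shows a template of the MonitoringRule custom resource: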

# Configures either an alert or a target record for precomputation
apiVersion: monitoring.gdc.goog/v1
kind: MonitoringRule
metadata:
  # Choose namespace that matches the project namespace
  # Note: The alert or record will be produced in the same namespace
  namespace: PROJECT_NAMESPACE
  name: alerting-config
spec:
  # Rule evaluation interval
  interval: 60s

  # Configure limit for number of alerts (0: no limit)
  # Optional. Default: 0 (no limit)
  limit: 0

  # Configure recording rules to generate new metrics based on pre-existing metrics.
  # Recording rules precompute expressions that are frequently needed or computationally expensive.
  # These rules save their result as a new set of time series.
  recordRules:
    # Define which timeseries to write to. The value must be a valid metric name.
  - record: MyMetricsName

    # Define PromQL expression to evaluate for this rule
    expr: rate({service_name="bob-service"} [1m])

    # Define labels to add or overwrite
    # Optional. Map of key-value pairs
    labels:
      <label_key>: <label_value>

  # Configure alert rules
  alertRules:
    # Define alert name 
  - alert: <string>

    # Define PromQL expression to evaluate for this rule
    # https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
    expr: rate({service_name="bob-service"} [1m])

    # Define when an active alert moves from pending to open
    # Optional. Default: 0s
    for: 0s

    # Define labels to add or overwrite
    # Required, Map of key-value pairs
    # Required labels:
    #     severity: [error, critical, warning, info]
    #     code: code that identifies the alert
    #     resource: component/service/hardware related to the alert
    #     additional labels are optional
    labels:
      severity: error
      code: 202
      resource: AIS
      <label_key>: <label_value>

    # Define annotations to add
    # Optional. Map of key-value pairs
    # Recommended annotations:
    #     message: value of Message field in UI
    #     expression: value of Rule field in UI
    #     runbookurl: URL for link in Actions to take field in UI
    annotations:
      <annotation_key>: <annotation_value>

Replace PROJECT_NAMESPACE with the namespace of your project.
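
To make the relationship between recordRules and alertRules more concrete, here is a hedged sketch in which a recording rule precomputes an aggregate and an alert rule compares the recorded series against a threshold. The metric names, namespace, label values, and threshold are hypothetical. The sketch also assumes that an alert rule can query a series recorded by the same resource, as Prometheus rule groups allow; this page does not confirm that behavior for GDC.

```yaml
apiVersion: monitoring.gdc.goog/v1
kind: MonitoringRule
metadata:
  namespace: my-project                  # hypothetical project namespace
  name: precomputed-error-rate           # hypothetical name
spec:
  interval: 60s
  recordRules:
    # Name of the new time series. Colons follow the Prometheus
    # convention for recorded metrics and are valid in metric names.
  - record: my_app:request_errors:rate5m
    # Resolves to a numeric value that is stored as the new metric.
    # my_app_request_errors_total is a hypothetical counter.
    expr: 'sum(rate(my_app_request_errors_total[5m]))'
    labels:
      team: my-team                      # hypothetical optional label
  alertRules:
  - alert: MyAppErrorRateHigh
    # Assumption: the recorded series is queryable from alert rules
    expr: 'my_app:request_errors:rate5m > 1'
    for: 120s
    labels:
      severity: critical
      code: "MYAPP-002"                  # hypothetical alert code
      resource: my-app
```

Keeping the expensive aggregation in the recording rule means that the alert expression stays a cheap comparison on each evaluation.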

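`LoggingRule` custom resource

To create system logging rules, you must create a LoggingRule custom resource. A LoggingRule contains recording rules and alert rules that describe the conditions for sending an alert.

The following YAML file shows a template of the LoggingRule custom resource. As with the MonitoringRule template, replace PROJECT_NAMESPACE with the namespace of your project: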

# Configures either an alert or a target record for precomputation
apiVersion: logging.gdc.goog/v1
kind: LoggingRule
metadata:
  # Choose namespace that matches the project namespace
  # Note: The alert or record will be produced in the same namespace
  namespace: PROJECT_NAMESPACE
  name: alerting-config
spec:
  # Choose which log source to base alerts on (Operational/Audit Logs)
  # Optional. Default: Operational
  source: operational

  # Rule evaluation interval
  interval: 60s

  # Configure limit for number of alerts (0: no limit)
  # Optional. Default: 0 (no limit)
  limit: 0

  # Configure recording rules to generate new metrics based on pre-existing logs.
  # Recording rules generate metrics based on logs.
  # Use recording rules for complex alerts, which query the same expression repeatedly every time they are evaluated.
  recordRules:
    # Define which timeseries to write to. The value must be a valid metric name.
  - record: MyMetricsName

    # Define LogQL expression to evaluate for this rule
    # https://grafana.com/docs/loki/latest/rules/
    expr: rate({service_name="bob-service"} [1m])

    # Define labels to add or overwrite
    # Optional. Map of key-value pairs
    labels:
      <label_key>: <label_value>

  # Configure alert rules
  alertRules:
    # Define alert name
  - alert: <string>

    # Define LogQL expression to evaluate for this rule
    expr: rate({service_name="bob-service"} [1m])

    # Define when an active alert moves from pending to open
    # Optional. Default: 0s
    for: 0s

    # Define labels to add or overwrite
    # Required, Map of key-value pairs
    # Required labels:
    #     severity: [error, critical, warning, info]
    #     code: code that identifies the alert
    #     resource: component/service/hardware related to alert
    #     additional labels are optional
    labels:
      severity: warning
      code: 202
      resource: AIS
      <label_key>: <label_value>

    # Define annotations to add
    # Optional. Map of key-value pairs
    # Recommended annotations:
    #     message: value of Message field in UI
    #     expression: value of Rule field in UI
    #     runbookurl: URL for link in Actions to take field in UI
    annotations:
      <annotation_key>: <annotation_value>
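
For comparison with the template, the following sketch fills in a LoggingRule whose alert expression is a true or false condition over operational logs. The service_name label and the bob-service value come from the template. The namespace, resource name, the "error" line filter, the threshold, and the alert code are hypothetical choices made for illustration.

```yaml
apiVersion: logging.gdc.goog/v1
kind: LoggingRule
metadata:
  namespace: my-project               # hypothetical project namespace
  name: bob-service-log-alerts        # hypothetical name
spec:
  # Base the alerts on operational logs (the default source)
  source: operational
  interval: 60s
  limit: 0
  alertRules:
  - alert: BobServiceErrorBurst
    # LogQL: count log lines containing "error" over 5 minutes,
    # then compare the total against a hypothetical threshold of 10
    expr: 'sum(count_over_time({service_name="bob-service"} |= "error" [5m])) > 10'
    # The condition must hold for 120 seconds before the alert opens
    for: 120s
    labels:
      severity: warning
      code: "BOB-001"                 # hypothetical alert code
      resource: bob-service
    annotations:
      message: "bob-service logged more than 10 error lines in 5 minutes"
```

The expression is wrapped in single quotes so that the double quotes inside the LogQL selector do not need escaping in YAML.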