Create alerting policies with a PromQL-based condition (API)

This page describes how to create an alerting policy with a PromQL-based condition by using the Cloud Monitoring API. You can use PromQL queries in your alerting policies to create complex conditions with features such as ratios, dynamic thresholds, and metric evaluation.

For general information, see Alerting policies with PromQL.

If you work in a Prometheus environment outside Cloud Monitoring and have Prometheus alerting rules, then you can use the Google Cloud CLI to migrate them to Monitoring alerting policies with a PromQL query. For more information, see Migrate alerting rules and receivers from Prometheus.

Create alerting policies with PromQL queries

You use the alertPolicies.create method to programmatically create alerting policies.

The only difference between creating alerting policies with PromQL-based conditions and other alerting policies is that your Condition type must be PrometheusQueryLanguageCondition. This condition type allows alerting policies to be defined with PromQL.

The following shows a PromQL query for an alerting policy condition that uses a metric from the kube-state exporter to find the number of times that a container has been restarted in the last 30 minutes:

rate(kube_pod_container_status_restarts[30m]) * 1800 > 1
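
The multiplication by 1800 converts the per-second rate into a count over the 30-minute window (30 × 60 = 1800 seconds). As a sketch, and assuming the increase function is available in your PromQL environment, the same condition can be written more directly. In open source Prometheus, increase is defined as rate multiplied by the number of seconds in the range window:

```promql
# Assumed equivalent of: rate(kube_pod_container_status_restarts[30m]) * 1800 > 1
increase(kube_pod_container_status_restarts[30m]) > 1
```

Both forms extrapolate from the samples in the window, so the result can be a non-integer value.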

Constructing the alerting policy

To build an alerting policy with a PromQL-based condition, use the AlertPolicy condition type PrometheusQueryLanguageCondition. The PrometheusQueryLanguageCondition has the following structure:

{
  "query": string,
  "duration": string,
  "evaluationInterval": string,
  "labels": {string: string},
  "ruleGroup": string,
  "alertRule": string
}

The PrometheusQueryLanguageCondition fields have the following definitions:

  • query: The PromQL expression to evaluate. Equivalent to the expr field from a standard Prometheus alerting rule.
  • duration: Specifies the length of time during which each evaluation of the query must generate a true value before the alerting policy is triggered. The value must be a number of minutes, expressed in seconds; for example, 600s for a 10-minute duration. For more information, see Behavior of metric-based alerting policies.
  • evaluationInterval: The interval of time, in seconds, between PromQL evaluations of the query. The default value is 30 seconds. If the PrometheusQueryLanguageCondition was created by migrating a Prometheus alerting rule, then this value comes from the Prometheus rule group that contained the Prometheus alerting rule.
  • labels: An optional way to add or overwrite labels in the PromQL expression result.
  • ruleGroup: If the alerting policy was migrated from a Prometheus configuration file, then this field contains the value of the name field from the rule group in the Prometheus configuration file. This field isn't required when you create a PromQL-based alerting policy by using the Cloud Monitoring API.
  • alertRule: If the alerting policy was migrated from a Prometheus configuration file, then this field contains the value of the alert field from the alerting rule in the Prometheus configuration file. This field isn't required when you create a PromQL-based alerting policy by using the Cloud Monitoring API.

For example, the following condition uses a PromQL query to find the number of times that a container has been restarted in the last 30 minutes:

"conditionPrometheusQueryLanguage": {
  "query": "rate(kube_pod_container_status_restarts[30m]) * 1800 > 1",
  "duration": "600s",
  "alertRule": "ContainerRestartCount",
  "labels": {
    "action_required": "true",
    "severity": "critical/warning/info"
  }
}

Use this structure as the value of a conditionPrometheusQueryLanguage field in a condition, which is in turn embedded in an alerting-policy structure. For more information about these structures, see AlertPolicy.

The following shows a complete policy with a PrometheusQueryLanguageCondition condition in JSON:

{
  "displayName": "Container Restarts",
  "documentation": {
    "content": "Pod ${resource.label.namespace_name}/${resource.label.pod_name} has restarted more than once during the last 30 minutes.",
    "mimeType": "text/markdown",
    "subject": "Container ${resource.label.container_name} in Pod ${resource.label.namespace_name}/${resource.label.pod_name} has restarted more than once during the last 30 minutes."
  },
  "userLabels": {},
  "conditions": [
    {
      "displayName": "Container has restarted",
      "conditionPrometheusQueryLanguage": {
        "query": "rate(kubernetes_io:container_restart_count[30m]) * 1800 > 1",
        "duration": "600s",
        "alertRule": "ContainerRestart",
        "labels": {
          "action_required": "true",
          "severity": "critical/warning/info"
        }
      }
    }
  ],
  "alertStrategy": {
    "autoClose": "1800s"
  },
  "combiner": "OR",
  "enabled": true
}

Create an alerting policy

To create the alerting policy, put the alerting policy JSON into a file called POLICY_NAME.json, and then run the following command:

curl -d @POLICY_NAME.json -H "Authorization: Bearer ${TOKEN}" \
  -H 'Content-Type: application/json' \
  -X POST "https://monitoring.googleapis.com/v3/projects/${PROJECT_ID}/alertPolicies"
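
When you script the request, it can help to check the HTTP status code instead of reading the response by eye. The following is a minimal sketch, not an official client: the function names is_success and create_policy and the output file response.json are hypothetical, and the curl options used (--silent, --show-error, --output, --write-out) are standard curl flags.

```shell
# Hypothetical helper: succeeds only for 2xx HTTP status codes.
is_success() {
  case "$1" in
    2??) return 0 ;;
    *)   return 1 ;;
  esac
}

# Hypothetical wrapper around the POST request shown above.
# Arguments: project ID, access token, path to the policy JSON file.
create_policy() {
  local project_id="$1" token="$2" policy_file="$3" status
  # Save the response body to response.json (assumed name) and capture
  # only the HTTP status code on standard output.
  status=$(curl --silent --show-error \
    --output response.json --write-out '%{http_code}' \
    -X POST \
    -H "Authorization: Bearer ${token}" \
    -H 'Content-Type: application/json' \
    -d "@${policy_file}" \
    "https://monitoring.googleapis.com/v3/projects/${project_id}/alertPolicies")
  if is_success "${status}"; then
    echo "Created policy (HTTP ${status}); response saved to response.json"
  else
    echo "Request failed (HTTP ${status}); see response.json" >&2
    return 1
  fi
}
```

After you define these functions in your shell, you might call create_policy "${PROJECT_ID}" "${TOKEN}" POLICY_NAME.json.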

For more information about the Monitoring API for alerting policies, see Managing alerting policies by API.

For more information about using curl, see Invoking curl.

Using Terraform

For instructions on configuring PromQL-based alerting policies by using Terraform, see the condition_prometheus_query_language section of the google_monitoring_alert_policy resource in the Terraform registry.

For general information about using Google Cloud with Terraform, see Terraform with Google Cloud.

Invoking curl

Each curl invocation includes a set of arguments, followed by the URL of an API resource. The common arguments include a Google Cloud project ID and an authentication token. These values are represented here by the PROJECT_ID and TOKEN environment variables.

You might also have to specify other arguments, such as the type of the HTTP request (for example, -X DELETE). The default request is GET, so the examples don't specify it.

Each curl invocation has this general structure:

curl --http1.1 --header "Authorization: Bearer ${TOKEN}" <other_args> https://monitoring.googleapis.com/v3/projects/${PROJECT_ID}/<request>
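
As an illustration of this structure, the following sketch builds the request URL from its parts and issues a GET request that lists the alerting policies in a project. The function names monitoring_url and list_alert_policies are hypothetical. The sketch assumes that the PROJECT_ID and TOKEN environment variables are already set.

```shell
# Hypothetical helper: builds a Monitoring API URL from a project ID
# and a request path such as "alertPolicies".
monitoring_url() {
  printf 'https://monitoring.googleapis.com/v3/projects/%s/%s' "$1" "$2"
}

# Hypothetical helper: lists alerting policies. GET is the default
# request type, so no -X argument is needed.
list_alert_policies() {
  curl --http1.1 --header "Authorization: Bearer ${TOKEN}" \
    "$(monitoring_url "${PROJECT_ID}" alertPolicies)"
}
```

Defining the functions doesn't send a request; run list_alert_policies to issue the call.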

To use curl, you must specify your project ID and an access token. To reduce typing and errors, you can put these values into environment variables and pass them to curl that way.

To set these variables, do the following:

  1. Create an environment variable to hold the ID of the scoping project of your metrics scope. These steps call the variable PROJECT_ID:

    PROJECT_ID=a-sample-project
    
  2. Authenticate to the Google Cloud CLI:

    gcloud auth login
    
  3. Optional. To avoid having to specify your project ID with each gcloud command, set your project ID as the default by using the gcloud CLI:

    gcloud config set project ${PROJECT_ID}
    
  4. Create an authorization token and capture it in an environment variable. These steps call the variable TOKEN:

    TOKEN=$(gcloud auth print-access-token)
    

    You have to periodically refresh the access token. If commands that worked suddenly report that you are unauthenticated, reissue this command.

  5. To verify that you got an access token, echo the TOKEN variable:

    echo ${TOKEN}
    ya29.GluiBj8o....
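
Because an unset variable produces a malformed URL or an empty Authorization header, a script can check both variables before it calls curl. The following is a minimal sketch; the function name require_var is hypothetical.

```shell
# Hypothetical helper: fails with a message if the named variable
# is unset or empty. Argument: the variable name, without "$".
require_var() {
  # Indirect lookup that works in POSIX shells; ":-" keeps the lookup
  # safe even when the shell runs with "set -u".
  eval "value=\${$1:-}"
  if [ -z "${value}" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}
```

For example, running require_var PROJECT_ID && require_var TOKEN before a curl command stops the script early with a clear message instead of an authentication error from the API.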