Monitor your SQL query results with an alerting policy

This document explains how to create an alerting policy to monitor the results of a SQL query. The alerting policy notifies you when the query result satisfies conditions that you specify. For example, you could configure an alerting policy so that you're notified when at least 25% of the log entries in a certain period have a severity of ERROR.

There are three different approaches that you can use to get notified when content or patterns appear in your log data:

To scan individual log entries for a specific phrase, create a log-based alerting policy. Use these alerting policies when you want to be notified about things like security-related events.
To monitor events in your log entry data, you can create a log-based metric and then create an alerting policy to monitor the metric. These types of alerting policies are effective when you want to monitor trends in log entry data over time. However, they aren't as effective if you expect only a few events.
To perform an aggregate analysis of your log entry data and to then monitor the results, write a SQL query in Log Analytics to analyze your log entry data. Then, create an alerting policy to monitor the table of query results. This type of alerting policy is called a SQL-based alerting policy.

SQL-based alerting policies are most effective for evaluating exact values over multiple log entries. If you want to evaluate individual log entries and don't need to monitor exact values, then create log-based alerting policies that use the Logging query language.

The remainder of this document describes how to use SQL-based alerting policies.

For general information about Log Analytics, see Query and view logs overview.

How alerting policies work

An alerting policy describes the circumstances under which you want to be alerted and how you want to be notified about an incident. An alerting policy can monitor the results of a SQL query. When the query result meets the condition of the alerting policy, Cloud Monitoring creates an incident and then sends notifications about the incident through notification channels.

An incident is a record of the data that caused the condition to be met along with other relevant information. This information can help you troubleshoot the issues that caused the incident. You can view the incident by using the Google Cloud console. For more information, see Incidents for SQL-based alerting policies.

Alerting policy components

A SQL-based alerting policy contains a condition and a schedule:

The condition contains the query, which is a SQL query that queries a log view. The condition also defines the circumstances under which the query result causes Monitoring to create an incident.
The schedule defines how frequently the alerting policy runs its query. The schedule also defines the size of the lookback window, which is a filter that selects only those log entries that have been received since the previous time the query was evaluated. For example, if you set the schedule to 60 minutes, then the query is run every 60 minutes using a lookback window that selects the most-recent 60 minutes of log entries.

Evaluation types for SQL alerting policies

Conditions that monitor a SQL result support two types of evaluation:

Row count threshold: The condition is met when the number of rows in the query result is greater than, equal to, or less than a threshold value.

For example, suppose you want to get notified when more than 50 log entries in the lookback window have a severity greater than 200. You create a query that reports log entries whose severity is greater than 200. You then configure a condition, select the Row count threshold, and set the threshold to 50.
Boolean: The condition is met when a specific boolean column in the query result table contains any row with a value of true.

For example, suppose you want to get notified when more than 25% of the log entries in the lookback window have a severity of ERROR. You create a query that computes the percentage of log entries whose severity level is ERROR. The query results writes true to the notify column when that percentage exceeds 25%. Next, you create a condition, set the type to Boolean, and configure the condition to monitor the notify column.

Alerting policies that monitor a SQL query result must have only one condition.

Alerting policies and BigQuery

When an alerting policy runs a SQL query, that query is run using reserved BigQuery slots in the Google Cloud project where the alerting policy is defined. For more information, see Work with slot reservations.

For an alerting policy to use reserved BigQuery slots to query a log view, the log bucket that hosts the log view must be configured to have a linked BigQuery dataset. Linked datasets let BigQuery read the data in the log bucket, and they let you perform BigQuery functions on the data returned by your SQL query.

Before you begin

To get the permissions that you need to use Log Analytics, ask your administrator to grant you the following IAM roles on your log buckets or log views:
- To query the _Required and _Default log buckets: Logs Viewer (roles/logging.viewer).
- To query all log views in a project: Logs View Accessor (roles/logging.viewAccessor).
- To query logs in a specific log view: Create an IAM policy for the log view, or restrict the Logs View Accessor (roles/logging.viewAccessor) role to a certain log view. For more information, see Control access to a log view.
For information about additional roles, see Logging roles.
For the log views that you want to query, go to the Logs Storage page and verify that the log buckets that store those log views are upgraded to use Log Analytics. If necessary, upgrade the log bucket.

In the Google Cloud console, go to the Logs Storage page:

Go to Logs Storage

If you use the search bar to find this page, then select the result whose subheading is Logging.

To enable running queries on reserved BigQuery slots, do the following:
1. If the log bucket you plan to query doesn't have a linked dataset, then create a linked dataset for it.
2. Configure reserved BigQuery slots and assign them to your Google Cloud project.

To get the permissions that you need to create and manage SQL-based alerting policies, ask your administrator to grant you the following IAM roles:
- Monitoring Editor (roles/monitoring.editor) on your project
- Logging SqlAlertWriter (roles/logging.sqlAlertWriter) on your project
- BigQuery Job User (roles/bigquery.jobUser) on your project
- BigQuery Data Viewer (roles/bigquery.dataViewer) on your linked dataset
For information about granting access to a dataset, see Grant access to a dataset.
Ensure that the Monitoring Service Account exists and that it has the following roles:
1. Monitoring Service Agent (monitoring.notificationServiceAgent) role on your project.
2. BigQuery Data Viewer (roles/bigquery.dataViewer) role on your linked dataset.
If the Monitoring Service Account doesn't exist, then see Troubleshoot: No Monitoring Service Account.
Configure the notification channels that you want to use to receive any notifications for incidents. For redundancy purposes, we recommend that you create multiple types of notification channels. For more information, see Create and manage notification channels.

Create a SQL-based alerting policy

To create a SQL-based alerting policy, do the following:

Google Cloud console

In the Google Cloud console, go to the Log Analytics page:
Go to Log Analytics

If you use the search bar to find this page, then select the result whose subheading is Logging.
On the Log Analytics page, in the query editor, enter a SQL query for a log view.

Note: When you configure a query for an alerting policy, don't use the WHERE clause to filter log entries by timestamp. When Log Analytics evaluates the query, it automatically applies the correct timestamp filter to each evaluated log entry.

For more information about writing SQL queries for log views, see Query a log view.
On the toolbar, click Run on BigQuery.

Log Analytics runs your query on the BigQuery engine and displays the results in the Results table.

If Run on BigQuery isn't shown, then click Select query engine and then click BigQuery. The Run query button changes to Run on BigQuery.
On the Results table of the Log Analytics page, click Create alert.

The Log Analytics page shows the Create sql alert policy window, which shows your query under the SQL query section.
In the Alert condition section, configure the condition and schedule of your alerting policy.
Configure the alert details of your alerting policy.
1. Optional: Add alerting policy labels and documentation.
2. Add notification channels, and then click Next.
Review your alerting policy and then create it by clicking Save.

Cloud Monitoring API

Use the alertPolicies.create method to programmatically create alerting policies. The Condition type of your alerting policy must be conditionSql, which is an instance of SqlCondition. This condition type allows the conditions of your alerting policy to be defined with SQL.

To define the schedule, set a periodicity value for one of the minutes, hours, or days fields. For example, if you want the query to run every 12 hours, then set the periodicity of the hours field to 12.

To define the condition, use the following fields:

boolean_test: Configures the alerting policy so that its condition is met when a row of a boolean column in the query result table contains a true value.
row_count_test: Configures the alerting policy so that its condition is met when the number of rows in the query result table meets a certain threshold.

For a complete list of fields and definitions, see SqlCondition in the Cloud Monitoring API documentation.

For more information about the Monitoring API for alerting policies, see Managing alerting policies by API.

Limitations

When an alerting policy runs a scheduled SQL query, the alerting policy introduces a 15-minute delay to allow time for log entries to propagate to the log bucket. If log entries take more than 15 minutes to arrive, then the alerting policy doesn't evaluate them.

This introduction of a 15-minute delay impacts the time between when a log entry is generated and when Monitoring detects when a condition is met. For example, suppose you have a SQL query that is scheduled to run every 30 minutes. When the alerting policy periodically evaluates the condition, it queries log entries with timestamps between 15 and 45 minutes ago.

For limits associated with alerting policies, see Monitoring limits.

Pricing

For information about pricing, see the following documents:

BigQuery pricing
Cloud Logging in the Google Cloud Observability pricing documentation
Cloud Monitoring in the Google Cloud Observability pricing documentation

What's next

For information about creating charts from your Log Analytics data, see Chart query results with Log Analytics.