Managing log-based alerts

You can use log-based alerts to notify you whenever a specific message appears in your included logs. For example, if you want to know when an audit log records a particular data-access message, you can create a log-based alert that matches the message and notifies you when it appears. This document describes how to do the following, by using the Google Cloud console and the Cloud Monitoring API:

  • Create and test a log-based alert.
  • Edit a log-based alert.
  • Delete a log-based alert.

Before you begin

Review Alerting comparison to determine if log-based alerts are a good fit for the data in your logs. For example:

  • Log-based alerts don't operate on excluded logs.

  • You can't use log-based alerts to derive counts from your logs. To derive counts, you need to use log-based metrics instead.

To create and manage log-based alerts, you must have the authorization described in Log-based alerts permissions.

Create a log-based alert (Logs Explorer)

You can create log-based alerts from the Logs Explorer page in the Google Cloud console or by using the Monitoring API. This section describes how to create log-based alerts by using Logs Explorer. To create log-based alerts by using the Monitoring API, see Create a log-based alert (Monitoring API).

The Logs Explorer interface for creating and editing log-based alerts guides you through the following steps:

  • Provide a name and description for the alert.
  • Choose the logs for which you want to receive a notification.
  • Set the time between notifications.
  • Set the time for automatic closure of incidents.
  • Specify whom to notify.

For example, assume that you have an application that writes a syslog log entry with NOTICE severity when the application changes a network address. The log entries for network-address changes include a JSON payload that looks like the following:

"jsonPayload": {
  "type": "Configuration change",
  "action": "Set network address",
  "result": "IP_ADDRESS",
}

You want to create a log-based alert that notifies you when an invalid IPv4 address appears in the jsonPayload.result field of log entries in syslog with NOTICE severity.

To create the log-based alert, do the following:

  1. In the Google Cloud console, select Logging, and then select Logs Explorer:

    Go to the Logs Explorer

  2. Use the Query pane to build a query that matches the message you want to use in your log-based alert.

    For example, to find log entries with a severity level of NOTICE in the syslog log that have invalid IP addresses in the JSON payload, you can use the following query:

    log_id("syslog")
    severity = "NOTICE"
    jsonPayload.result !~ "^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}$"
    

    Use Run query in the Query results pane to validate the query.

  3. In the header of the Query results pane, click Create alert. When your window is narrow, the Create alert option might appear on the Actions menu instead.

  4. In the Alert details pane, give the alert a name and description:

    1. Enter a name for your alert in the Alert Name field. For example: "Network address: invalid IPv4 value".

    2. Enter a description of this alert. You can also include information that might help the recipient of a notification diagnose the problem. The following string summarizes the reason for the alert:

      Log-based alert in project ${project} detected an invalid IPv4 value.
      

      For information about how you can format and tailor the content of this field, see Using Markdown and variables in documentation templates.

  5. To advance to the next step, click Next.

  6. In the Choose logs to include in the alert pane, check the query and results by clicking Preview logs.

    We recommend building the query in the Logs Explorer Query pane. The query you built in the Query pane is also displayed on this pane, for example:

    log_id("syslog")
    severity = "NOTICE"
    jsonPayload.result !~ "^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}$"
    

    You can edit the query in this pane, if necessary. If you edit the query, then check the results by clicking Preview logs.

  7. Click Next.

  8. Select the minimum time between notifications. This value lets you control the number of notifications you get from this alert if it is triggered multiple times. For this example, select 5 min from the options.

    You can leave the incident auto-close duration at the default value, or you can set it to a different value by selecting a different option on the menu.

  9. Click Next.

  10. Select one or more notification channels for your alert. For this example, select an email notification channel.

    If you already have an email notification channel configured, then you can select it from the list. If not, click Manage notification channels and add an email channel. For information about creating notification channels, see Managing notification channels.

  11. Click Save.

Your log-based alert is now ready to test.
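Before you rely on the alert, it can help to check offline which values the regular expression in the query treats as valid. The following Python sketch is an illustration only. It assumes that Python's re module behaves like the regular-expression engine of the Logging query language for this particular pattern, which uses only basic syntax. The helper name is_invalid_ipv4 is hypothetical.

```python
import re

# Same pattern as in the example query. In the query, `!~` means
# "does not match", so the alert selects values this pattern rejects.
VALID_IPV4 = re.compile(r"^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}$")


def is_invalid_ipv4(value: str) -> bool:
    """Return True when the value would be selected by the example query."""
    return VALID_IPV4.search(value) is None


if __name__ == "__main__":
    for candidate in ["192.168.0.1", "10.0.0.255", "999.027.405.1", "256.1.1.1"]:
        verdict = "alert" if is_invalid_ipv4(candidate) else "ok"
        print(f"{candidate}: {verdict}")
```

A check like this covers only the pattern. Use Preview logs in the console to confirm what the full query matches in your project.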

Test the example log-based alert

To test the alert you created, you can manually write a log entry that matches the query. To write the log entry, do the following:

  1. Configure the following log entry by changing the PROJECT_ID variable to your project ID:

    {
      "entries": [
        {
          "logName": "projects/PROJECT_ID/logs/syslog",
          "jsonPayload": {
            "type": "Configuration change",
            "action": "Set network address",
            "result": "999.027.405.1"
          },
          "severity": "NOTICE",
          "resource": {
            "type": "generic_task",
            "labels": {
              "project_id": "PROJECT_ID",
              "location": "us-east1",
              "namespace": "fake-task-2",
              "job": "write-log-entry",
              "task_id": "11"
            }
          }
        }
      ]
    }
    
  2. Go to the logEntries.write reference page, or click the following button:

    Go to logEntries.write

  3. Copy the log entry you configured previously.

  4. In the Try this API pane, do the following:

    1. Replace the content of the Request body field in APIs Explorer with the log entry you copied in the previous step.

    2. Click Execute. If prompted, follow the authentication flow.

      If the logEntries.write call is successful, then you get an HTTP 200 response code and an empty response body, {}. For more information about APIs Explorer, see Using the APIs Explorer in the Monitoring documentation; the APIs Explorer works the same way with the Logging API.

The log entry matches the filter specified for the alert in the following ways:

  • The logName value specifies the syslog log in your Cloud project.
  • The severity value for this log entry is NOTICE.
  • The jsonPayload.result value is not a valid IPv4 address.
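If you prefer to generate the request body rather than edit it by hand, the following Python sketch builds the same entry and serializes it with the standard json module, so the output is always syntactically valid JSON. The function name build_write_request and the project ID my-project are placeholders chosen for this illustration; they are not part of the Logging API.

```python
import json


def build_write_request(project_id: str) -> dict:
    """Build a logEntries.write request body for the example log entry."""
    return {
        "entries": [
            {
                "logName": f"projects/{project_id}/logs/syslog",
                "jsonPayload": {
                    "type": "Configuration change",
                    "action": "Set network address",
                    # Deliberately invalid IPv4 value, so the alert query matches.
                    "result": "999.027.405.1",
                },
                "severity": "NOTICE",
                "resource": {
                    "type": "generic_task",
                    "labels": {
                        "project_id": project_id,
                        "location": "us-east1",
                        "namespace": "fake-task-2",
                        "job": "write-log-entry",
                        "task_id": "11",
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    # "my-project" is a placeholder; substitute your own project ID.
    print(json.dumps(build_write_request("my-project"), indent=2))
```

You can paste the printed JSON into the Request body field in APIs Explorer.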

After you write the log entry, the following sequence occurs:

  • The new log entry appears in the Logs Explorer and triggers the alert.
  • An incident is opened in Cloud Monitoring.
  • You receive a notification for the incident. If you configured an email notification channel, then the notification looks like the following screenshot:

    The example log-based alert results in an email notification.

You can click View incident in the email to see the incident in Cloud Monitoring. For more information about incidents, see Managing incidents for log-based alerts.

Other scenarios: Alerting on audit logs

The example in Create a log-based alert (Logs Explorer) is artificial; you don't typically create an alert and then manually write log entries to trigger the alert. Log entries are usually written by applications or other services. But the source of the log entries doesn't matter; for log-based alerts, what matters is the query that you use to select the log entries.

The following sections describe realistic scenarios for log-based alerts based on the content of audit logs. Each scenario illustrates how to create a query that isolates the desired audit-log entries. Otherwise, the procedure for creating the log-based alerts is the same as shown in Create a log-based alert (Logs Explorer).

Alerts on human access of secrets

Suppose that your project stores secrets in Secret Manager, and some of these secrets are intended only for service accounts to use. Except in unusual circumstances, human users never access these secrets.

If you have enabled audit logging for Secret Manager, then each successful attempt to access a secret creates an audit log entry. Each entry includes the name of the secret and the caller's identity.

You can create a log-based alert that notifies you when a human user accesses a secret.

The following shows an excerpt of an audit log entry written by Secret Manager. The excerpt shows the fields that are useful for creating the query for a log-based alert:

{
  "protoPayload": {
    "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
    "serviceName": "secretmanager.googleapis.com",
    "methodName": "google.cloud.secretmanager.v1.SecretManagerService.AccessSecretVersion",
    "authenticationInfo": {
      "principalEmail": "my-svc-account@PROJECT_ID.iam.gserviceaccount.com",
      "serviceAccountDelegationInfo": [],
      "principalSubject": "serviceAccount:my-svc-account@PROJECT_ID.iam.gserviceaccount.com"
    },
    ...
  },
  ...
}

The following protoPayload subfields are of particular interest:

  • @type: indicates that this log entry is an audit log entry.
  • serviceName: records the service that wrote the audit log entry. Use this field to identify entries written by Secret Manager.
  • methodName: identifies the method for which this audit log entry was written. Use this field to identify the action that caused this entry to be created. In this example, it's the AccessSecretVersion method.
  • authenticationInfo.principalEmail: records the account that invoked the method in the methodName field. The expected value for this field is a service account, which ends with gserviceaccount.com.

To find log entries for secret access by a human user, look for audit log entries written by Secret Manager. You want to find the log entries in which the AccessSecretVersion method was invoked by a principal that doesn't end with gserviceaccount.com. The following query isolates these log entries:

protoPayload.@type = "type.googleapis.com/google.cloud.audit.AuditLog"
protoPayload.serviceName = "secretmanager.googleapis.com"
protoPayload.methodName =~ "AccessSecretVersion$"
protoPayload.authenticationInfo.principalEmail !~ "gserviceaccount.com$"

To create a log-based alert for human access of secrets, use this query in the Choose logs to include in the alert pane.
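To reason about which entries the query selects, it can help to restate the four conditions as ordinary code. The following Python sketch applies the same checks to an audit log entry represented as a dictionary. It is a local approximation for illustration, not how Cloud Logging evaluates queries, and the function name is_human_secret_access is hypothetical.

```python
AUDIT_LOG_TYPE = "type.googleapis.com/google.cloud.audit.AuditLog"
SECRET_MANAGER = "secretmanager.googleapis.com"


def is_human_secret_access(entry: dict) -> bool:
    """Mirror the four comparisons in the example query."""
    payload = entry.get("protoPayload", {})
    principal = payload.get("authenticationInfo", {}).get("principalEmail", "")
    return (
        payload.get("@type") == AUDIT_LOG_TYPE
        and payload.get("serviceName") == SECRET_MANAGER
        # methodName =~ "AccessSecretVersion$"
        and payload.get("methodName", "").endswith("AccessSecretVersion")
        # principalEmail !~ "gserviceaccount.com$"
        and not principal.endswith("gserviceaccount.com")
    )


def sample_entry(principal_email: str) -> dict:
    """Build a minimal audit log entry for the given caller (test data only)."""
    return {
        "protoPayload": {
            "@type": AUDIT_LOG_TYPE,
            "serviceName": SECRET_MANAGER,
            "methodName": "google.cloud.secretmanager.v1.SecretManagerService.AccessSecretVersion",
            "authenticationInfo": {"principalEmail": principal_email},
        }
    }


if __name__ == "__main__":
    print(is_human_secret_access(sample_entry("user@example.com")))
    print(is_human_secret_access(sample_entry("my-svc-account@my-project.iam.gserviceaccount.com")))
```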

Alerts on decryption events

The analysis in the previous example can be adapted to other services. For example, if you use Cloud Key Management Service to encrypt and decrypt sensitive data, then you can use audit logs generated by Cloud KMS to detect when a human user decrypts a value.

To find log entries for decryption done by a human user, look for audit log entries written by Cloud KMS. You want to find the log entries in which the Decrypt method was invoked by a principal whose email address doesn't end with gserviceaccount.com; that suffix indicates a service account. The following query isolates these log entries:

protoPayload.@type = "type.googleapis.com/google.cloud.audit.AuditLog"
protoPayload.serviceName = "cloudkms.googleapis.com"
protoPayload.methodName = "Decrypt"
protoPayload.authenticationInfo.principalEmail !~ "gserviceaccount.com$"

To create a log-based alert for decryption done by a human user, use this query in the Choose logs to include in the alert pane.
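The Secret Manager and Cloud KMS queries differ only in the service name and the method comparison. If you maintain several alerts of this kind, a small helper can generate the query text. The following Python sketch is one possible approach; the function name human_caller_query is hypothetical, and the output is plain query text that you still paste into the Choose logs to include in the alert pane.

```python
def human_caller_query(service_name: str, method_comparison: str) -> str:
    """Return an audit-log query for calls made by non-service-account principals.

    method_comparison is the operator and value for methodName, for example
    '= "Decrypt"' or '=~ "AccessSecretVersion$"'.
    """
    lines = [
        'protoPayload.@type = "type.googleapis.com/google.cloud.audit.AuditLog"',
        f'protoPayload.serviceName = "{service_name}"',
        f"protoPayload.methodName {method_comparison}",
        'protoPayload.authenticationInfo.principalEmail !~ "gserviceaccount.com$"',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(human_caller_query("cloudkms.googleapis.com", '= "Decrypt"'))
```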

Manage log-based alerts in Monitoring

You can view, edit, and delete log-based alerts by using the Google Cloud console for Monitoring or by using the Monitoring API. This document describes how to manage alerting policies by using the Google Cloud console. For information about using the Monitoring API to manage alerting policies, see Managing alerting policies by API.

To see a list of all the alerting policies in your Google Cloud project, do one of the following:

  • To navigate from Logging:

    1. In the Google Cloud console, select Logging, and then select Logs Explorer:

      Go to the Logs Explorer

    2. In the header of the Query results pane, click the Actions menu and select Manage alerts.

  • To navigate from Monitoring:

    1. In the Google Cloud console, select Monitoring:

      Go to Monitoring

    2. Select Alerting.

    3. A partial list of policies is shown in the Policies pane. To see all policies and to enable filtering, click See all policies.

Both of these actions take you to the Monitoring Policies page, which lists all the alerting policies in your Cloud project.

To restrict the alerting policies that are listed, add filters. Each filter is composed of a name and a value. For example, you can set the value to be an exact match for a policy name, or a partial match. Matches are not case-sensitive. If you specify multiple filters, then the filters are implicitly joined by a logical AND unless you insert an OR filter. The following screenshot lists the currently enabled alerting policies created after January 1, 2021:

List of enabled alerting policies created after January 1, 2021.

From the Policies page you can edit, delete, copy, enable, or disable an alerting policy:

  • To edit or copy a policy, click More options, and select the desired option. Editing and copying a policy is similar to the procedure described in Create a log-based alert (Logs Explorer). You can change and, in some cases, delete the values in the fields. When done, click Save.

    You can also edit a log-based alerting policy by clicking its name in the list of policies.

  • To delete a policy, click More options and select Delete. In the confirmation dialog, select Delete.

  • To enable or disable the alerting policy, click the toggle located under the heading Enabled.

Create a log-based alert (Monitoring API)

You can create log-based alerts by using the Monitoring API. You provide the same information to the Monitoring API that you provide when you use the Logs Explorer in the Google Cloud console:

  • A name and description for the alert.
  • The logs for which you want to receive a notification.
  • The time between notifications.
  • The time for automatic closure of incidents.
  • Whom to notify.

To create alerting policies by using the Monitoring API, you create an AlertPolicy object and submit it to the alertPolicies.create method.

Before you can use the Monitoring API, you must enable the API and have authorization to use it.

Structure of alerting policies

The Monitoring API represents an alerting policy by using the AlertPolicy structure. The AlertPolicy structure has several embedded structures, including a description of the condition that triggers the alert. Log-based alerting policies differ from metric-based alerting policies in the following ways:

  • You describe the condition by using the LogMatch condition type. Metric-based alerting policies use different condition types.
  • A log-based alerting policy can have only one condition.
  • You specify the time between notifications and the automatic incident-closure period by including an AlertStrategy structure. Metric-based alerting policies do not include a time between notifications.

This section describes how to create a log-based alerting policy. When you use the Monitoring API to manage alerting policies, there are no differences in how you list, edit, or delete metric-based and log-based policies. Managing alerting policies by API describes how to create, list, edit, and delete alerting policies by using the Monitoring API.

Design the alerting policy

In Create a log-based alert (Logs Explorer), you create a log-based alert that notifies you when an invalid IPv4 address appears in the jsonPayload.result field of log entries in syslog with NOTICE severity.

To create the same log-based alert by using the Monitoring API, you create an AlertPolicy object that looks like the following JSON structure:

{
  "displayName": "Network address: invalid IPv4 value (API)",
  "documentation": {
    "content": "Log-based alert in project ${project} detected an invalid IPv4 value.",
    "mimeType": "text/markdown"
  },

  "conditions": [
    {
      "displayName": "Log match condition: invalid IP addr (API)",
      "conditionMatchedLog": {
        "filter": "log_id(\"syslog\") severity = \"NOTICE\" jsonPayload.result !~ \"^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\\.|$)){4}$\""
      }
    }
  ],
  "combiner": "OR",

  "alertStrategy": {
    "notificationRateLimit": {
      "period": "300s"
    },
    "autoClose": "604800s"
  },

  "notificationChannels": [
    "projects/PROJECT_ID/notificationChannels/CHANNEL_ID"
  ]
}

This JSON code specifies the same information that you specify when creating a log-based alert by using Logs Explorer. The following sections map the contents of this AlertPolicy structure to the steps you follow when using Logs Explorer to create a log-based alert. The value of the conditionMatchedLog field is a LogMatch structure.
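If you generate policies programmatically, you might assemble the same structure in code and let a JSON library handle the serialization. The following Python sketch builds the example policy as a dictionary. The function name build_log_alert_policy, the project ID my-project, and the channel ID 1234567890 are placeholders for illustration. The field names are taken from the JSON shown above.

```python
import json


def build_log_alert_policy(project_id: str, channel_id: str, log_filter: str) -> dict:
    """Assemble an AlertPolicy body with a single LogMatch condition."""
    return {
        "displayName": "Network address: invalid IPv4 value (API)",
        "documentation": {
            "content": "Log-based alert in project ${project} detected an invalid IPv4 value.",
            "mimeType": "text/markdown",
        },
        # A log-based alerting policy has exactly one condition.
        "conditions": [
            {
                "displayName": "Log match condition: invalid IP addr (API)",
                "conditionMatchedLog": {"filter": log_filter},
            }
        ],
        "combiner": "OR",
        "alertStrategy": {
            "notificationRateLimit": {"period": "300s"},
            "autoClose": "604800s",
        },
        "notificationChannels": [
            f"projects/{project_id}/notificationChannels/{channel_id}"
        ],
    }


if __name__ == "__main__":
    # Shortened filter for readability; use your full query in practice.
    example_filter = 'log_id("syslog") severity = "NOTICE"'
    policy = build_log_alert_policy("my-project", "1234567890", example_filter)
    print(json.dumps(policy, indent=2))
```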

Provide a name and description

An alerting policy has a display name and associated documentation that is provided with notifications to assist responders. In the Logs Explorer, these fields are called Alert Name and Alert Description. You represent these values in an AlertPolicy structure as follows:

{
  "displayName": "Network address: invalid IPv4 value (API)",
  "documentation": {
    "content": "Log-based alert in project ${project} detected an invalid IPv4 value.",
    "mimeType": "text/markdown"
  },
  ...
}

In this example, the value for displayName includes "(API)" so that you can distinguish between the two example policies when viewing the list of policies in the Google Cloud console. The Monitoring Policies page lists policies by display name and indicates whether the policy is based on metrics or logs. For more information, see Manage log-based alerts in Monitoring.

The documentation field includes, in the content subfield, the description you might supply when using Logs Explorer. The second subfield, mimeType, is required when you specify a value for the documentation field. The only valid value is "text/markdown".

Choose the logs for which you want to receive a notification

A log-based alerting policy has a single condition. In the Logs Explorer, you specify the condition when you supply a query in the Define log entries to alert on field. You represent these values in an AlertPolicy structure as follows:

{ ...
  "conditions": [
    {
      "displayName": "Log match condition: invalid IP addr (API)",
      "conditionMatchedLog": {
        "filter": "log_id(\"syslog\") severity = \"NOTICE\" jsonPayload.result !~ \"^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\\.|$)){4}$\""
      }
    }
  ],
  "combiner": "OR",
  ...
}

The conditions field takes a list of Condition structures, although a log-based alerting policy must have only one condition. Each Condition has a display name and a description of the condition.

  • The value of the displayName field is a brief description of the condition. When you use the Logs Explorer to create log-based alerts, the display name is always "Log match condition". When you use the Monitoring API, you can provide a more precise display name. A value is required.

  • The value of the conditionMatchedLog field is a LogMatch structure, and the value of the filter field is the query you specify in the Logs Explorer. Because this query is provided as the value of a JSON field, the entire query appears in quotes, and any quotes in the query itself must be escaped with the \ (backslash) character.
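Escaping the query by hand is error-prone, so one option is to let a JSON library do it. The following Python sketch joins the lines of the Logs Explorer query with spaces and serializes the result. The function name encode_filter is hypothetical. The sketch assumes, as the example policy does, that the comparisons can be placed on one line separated by spaces.

```python
import json


def encode_filter(query_lines: list) -> str:
    """Join query lines and return the JSON text for a LogMatch structure."""
    log_filter = " ".join(query_lines)
    # json.dumps escapes embedded double quotes and backslashes.
    return json.dumps({"filter": log_filter})


if __name__ == "__main__":
    query = [
        'log_id("syslog")',
        'severity = "NOTICE"',
        r'jsonPayload.result !~ "^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}$"',
    ]
    print(encode_filter(query))
```

In the printed output, each quotation mark inside the query appears as \" and the backslash in the regular expression appears as \\, which matches the filter value in the example policy.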

The value for the combiner field specifies how to combine the results of multiple conditions in metric-based alerting policies. You can only use one condition in log-based alerts, and you must specify the combiner field with the value "OR". You can't create log-based alerts with multiple conditions.

Set the notification and auto-close values

An alerting policy for a log-based alert specifies the minimum time between notifications. In the Logs Explorer, you select a value from the Time between notifications menu. You represent this value in an AlertPolicy structure by specifying a value, in seconds, for the period field of a NotificationRateLimit structure embedded in an AlertStrategy structure.

Similarly, the alerting policy includes the period for automatically closing incidents. The default value is 7 days. In the Logs Explorer, you can select a different value from the Incident autoclose duration menu. The option corresponds to the autoClose field in the AlertStrategy API structure. When you use this field, specify the value in seconds. The minimum value is 1,800 seconds, or 30 minutes.

{ ...
  "alertStrategy": {
    "notificationRateLimit": {
      "period": "300s"
    },
    "autoClose": "604800s"
  },
  ...
}

The value for the period field in this example, 300s, is equivalent to 5 minutes. The autoClose value, 604800s, is equivalent to 7 days.
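Converting menu-style durations to second strings by hand invites arithmetic mistakes. The following Python sketch derives both values from timedelta objects and rejects an auto-close period below the 30-minute minimum stated above. The helper names to_duration_string and build_alert_strategy are hypothetical.

```python
from datetime import timedelta

# Minimum auto-close period stated in this document: 1,800 seconds.
MIN_AUTO_CLOSE = timedelta(minutes=30)


def to_duration_string(delta: timedelta) -> str:
    """Format a timedelta as a whole number of seconds followed by 's'."""
    return f"{int(delta.total_seconds())}s"


def build_alert_strategy(
    time_between_notifications: timedelta,
    auto_close: timedelta = timedelta(days=7),
) -> dict:
    """Build the alertStrategy portion of a log-based alerting policy."""
    if auto_close < MIN_AUTO_CLOSE:
        raise ValueError("autoClose must be at least 30 minutes")
    return {
        "notificationRateLimit": {
            "period": to_duration_string(time_between_notifications)
        },
        "autoClose": to_duration_string(auto_close),
    }


if __name__ == "__main__":
    print(build_alert_strategy(timedelta(minutes=5)))
```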

Specify whom to notify

An alerting policy can include a list of notification channels. In the Logs Explorer, you select channels from a menu. You represent these values in an AlertPolicy structure by providing a list of one or more resource names for configured NotificationChannel objects:

{ ...
  "notificationChannels": [
    "projects/PROJECT_ID/notificationChannels/CHANNEL_ID"
  ]
}

When you create a notification channel, it is assigned a resource name. For information about retrieving the list of available notification channels, which includes their resource names, see Retrieving channels in the Monitoring documentation. You can't get the channel IDs by using the Google Cloud console.

Send your alerting policy to the Monitoring API

To create an alerting policy by using the Monitoring API, you create an AlertPolicy object and submit it to the alertPolicies.create method. You can invoke alertPolicies.create by using the Google Cloud CLI or by calling the Monitoring API directly.

You can also create log-based alerts by using the client libraries for C#, Go, Java, Python, and Ruby. You might also be able to use other client libraries; the library for your language must include the LogMatch condition type.

To create an alerting policy by using the gcloud CLI, do the following:

  1. Put the JSON representation of your alerting policy into a text file, for example, into a file called alert-invalid-ip.json.

  2. Pass this JSON file to the gcloud CLI using the following command:

    gcloud alpha monitoring policies create --policy-from-file="alert-invalid-ip.json"
    
  3. If successful, this command returns the resource name of the new policy, for example:

    Created alert policy [projects/PROJECT_ID/alertPolicies/POLICY_ID].
    

To create an alerting policy by calling alertPolicies.create directly, you can use the APIs Explorer tool as follows:

  1. Go to the alertPolicies.create reference page.

  2. In the Try this API pane, do the following:

    1. In the name field, enter the following value:

      projects/PROJECT_ID
      
    2. Copy the JSON representation of your alerting policy and replace the contents of the Request body field in APIs Explorer with the copied alerting policy.

    3. Click Execute.

      If the alertPolicies.create call is successful, then you get an HTTP 200 response code and a response body that contains the newly created AlertPolicy, including its resource name. For more information about APIs Explorer, see Using the APIs Explorer in the Monitoring documentation.
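Outside of APIs Explorer, the same call is an HTTP POST to the Monitoring API. The following Python sketch prepares that request with the standard library but does not send it. It assumes the v3 REST endpoint monitoring.googleapis.com and an OAuth 2.0 access token that you obtain separately, for example from the gcloud auth print-access-token command. The function name build_create_request and the sample values are hypothetical.

```python
import json
import urllib.request


def build_create_request(
    project_id: str, policy: dict, access_token: str
) -> urllib.request.Request:
    """Prepare (but don't send) an alertPolicies.create request."""
    url = f"https://monitoring.googleapis.com/v3/projects/{project_id}/alertPolicies"
    return urllib.request.Request(
        url,
        data=json.dumps(policy).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; a real policy needs the fields described earlier.
    request = build_create_request("my-project", {"combiner": "OR"}, "TOKEN")
    print(request.get_method(), request.full_url)
```

To send the request, pass the returned object to urllib.request.urlopen and read the JSON response.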

For more information about creating alerting policies by using the Monitoring API, see Creating policies. The examples in that document use condition types for metric-based alerting policies, but the principles are the same.

Test the alerting policy

To test your new alerting policy, you can use the same procedure described in Test the example log-based alert.