This page explains why you might not receive notifications as expected, and offers possible remedies for those situations.
Notifications aren't received
You configure notification channels and expect to be notified when incidents occur. You don't receive any notifications.
To gather information about the cause of the failure, do the following:
- In the Google Cloud console, go to the Logs Explorer page:
If you use the search bar to find this page, then select the result whose subheading is Logging.
- Select the appropriate Google Cloud project.
Query the logs for notification channel events:
- Expand the Log name menu, and select notification_channel_events.
- Expand the Severity menu and select Error.
- Optional: To select a custom time range, use the time-range selector.
- Click Run query.
The previous steps create the following query:
resource.type:"stackdriver_notification_channel"
logName="projects/PROJECT_ID/logs/monitoring.googleapis.com%2Fnotification_channel_events"
severity=ERROR
The summary line and the jsonPayload field typically contain failure information. For example, when a gateway error occurs, the summary line includes "failed with 502 Bad Gateway".
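The query above can also be assembled programmatically, for example to reuse it across projects. The following is a minimal sketch: the function name `notification_error_filter` and the project ID `my-project` are illustrative assumptions, and the clauses are joined with an explicit AND so that the result is a single-line filter.

```python
def notification_error_filter(project_id: str) -> str:
    """Build the log query for failed notification deliveries in one project."""
    log_name = (
        f"projects/{project_id}/logs/"
        "monitoring.googleapis.com%2Fnotification_channel_events"
    )
    clauses = [
        'resource.type:"stackdriver_notification_channel"',
        f'logName="{log_name}"',
        "severity=ERROR",
    ]
    # Join with an explicit AND so the filter fits on a single line.
    return " AND ".join(clauses)


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(notification_error_filter("my-project"))
```

You can paste the printed filter into the Logs Explorer query editor, or pass it as the filter argument of `gcloud logging read`.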
Webhook notifications aren't received
You configure a webhook notification channel and expect to be notified when incidents occur. You don't receive any notifications.
Private endpoint
You can't use webhooks for notifications unless the endpoint is public.
To resolve this situation, use Pub/Sub notifications combined with a pull subscription to that notification topic.
When you configure a Pub/Sub notification channel, incident notifications are sent to a Pub/Sub queue that has Identity and Access Management controls. Any service that can query for, or listen to, a Pub/Sub topic can consume these notifications. For example, applications running on App Engine, Cloud Run, or Compute Engine virtual machines can consume these notifications.
If you use a pull subscription, then your subscriber sends a request to Google and waits for a message to arrive. Pull subscriptions require outbound access to Google, but they don't require firewall rules or inbound access.
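A consumer that uses a pull subscription might look like the following sketch. It assumes that the `google-cloud-pubsub` client library is installed, that a pull subscription already exists on the notification topic, and that each message body is UTF-8 encoded JSON. The function names `decode_notification` and `pull_once` are hypothetical.

```python
import json


def decode_notification(data: bytes) -> dict:
    """Decode a message body, assumed here to be UTF-8 encoded JSON."""
    return json.loads(data.decode("utf-8"))


def pull_once(project_id: str, subscription_id: str, max_messages: int = 10) -> list:
    """Pull up to max_messages notifications, acknowledge them, and return them."""
    # Imported lazily so decode_notification works without the client library.
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    path = subscriber.subscription_path(project_id, subscription_id)

    # The request is outbound: no inbound firewall rule is needed.
    response = subscriber.pull(
        request={"subscription": path, "max_messages": max_messages}
    )

    notifications = [
        decode_notification(received.message.data)
        for received in response.received_messages
    ]
    ack_ids = [received.ack_id for received in response.received_messages]
    if ack_ids:
        # Acknowledge so that Pub/Sub doesn't redeliver these messages.
        subscriber.acknowledge(request={"subscription": path, "ack_ids": ack_ids})
    return notifications
```

This sketch acknowledges messages immediately after decoding them. A production consumer would typically acknowledge a message only after it has been processed successfully.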
Public endpoint
To identify why the delivery failed, examine your Cloud Logging log entries for failure information.
For example, you can search for log entries for the notification channel resource by using the Logs Explorer, with a filter like the following:
resource.type="stackdriver_notification_channel"
Pub/Sub notifications aren't received
You configure a Pub/Sub notification channel but you don't receive any notifications.
To resolve this condition, try the following:
Ensure that the notifications service account exists. Notifications aren't sent when the service account has been deleted.
To verify that the service account exists, do the following:
- In the Google Cloud console, go to the IAM page:
If you use the search bar to find this page, then select the result whose subheading is IAM & Admin.
Search for a service account that has the following naming convention:
service-PROJECT_NUMBER@gcp-sa-monitoring-notification.iam.gserviceaccount.com
If this service account isn't listed, then select Include Google-provided role grants.
If the notifications service account doesn't exist, then start creating a Pub/Sub notification channel. Starting this process causes Monitoring to create the service account:
- In the Google Cloud console, go to the Alerting page:
If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- Click Edit notification channels.
In the Pub/Sub section, click Add new.
Monitoring creates the notifications service account when one doesn't exist. The Create Pub/Sub Channel dialog shows the name of the notifications service account.
If you don't want to add a notification channel, click Cancel. Otherwise, finish creating the notification channel and click Add channel.
Grant the service account permission to publish to your Pub/Sub topics:
- In a new browser tab, open the Create a notification channel document.
- Select the Pub/Sub tab, and then follow the steps in the Authorize service account section of the page.
- Ensure that the notifications service account has been authorized to send notifications for the Pub/Sub topics of interest.
To view the permissions for a service account, you can use the Google Cloud console or the Google Cloud CLI:
- The IAM page in the Google Cloud console lists the roles for each service account.
- The Pub/Sub Topics page in the Google Cloud console lists each topic. When you select a topic, the Permissions tab lists the roles granted to service accounts.
To list all service accounts and their roles, run the following Google Cloud CLI command:
gcloud projects get-iam-policy PROJECT_ID
The following is a partial response for this command:
serviceAccount:service-PROJECT_NUMBER@gcp-sa-monitoring-notification.iam.gserviceaccount.com
  role: roles/monitoring.notificationServiceAgent
- members:
  [...]
  role: roles/owner
- members:
  - serviceAccount:service-PROJECT_NUMBER@gcp-sa-monitoring-notification.iam.gserviceaccount.com
  role: roles/pubsub.publisher
The command response includes only project-level roles. It doesn't include per-topic authorization.
To list the IAM bindings for a specific topic, run the following command:
gcloud pubsub topics get-iam-policy TOPIC
The following is a sample response for this command:
bindings:
- members:
  - serviceAccount:service-PROJECT_NUMBER@gcp-sa-monitoring-notification.iam.gserviceaccount.com
  role: roles/pubsub.publisher
etag: BwXPRb5WDPI=
version: 1
For information about how to authorize the notifications service account, see Authorize service account.
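The check on the command responses above can be automated. The following sketch assumes that you saved a policy as JSON, for example by adding `--format=json` to either `get-iam-policy` command. The helper names `notification_service_account` and `can_publish` and the project number `123456` are hypothetical. The sketch checks only for the `roles/pubsub.publisher` role, so it reports `False` even when a different role happens to grant publish permission.

```python
PUBLISHER_ROLE = "roles/pubsub.publisher"


def notification_service_account(project_number: str) -> str:
    """Return the IAM member string for the notifications service account."""
    return (
        f"serviceAccount:service-{project_number}"
        "@gcp-sa-monitoring-notification.iam.gserviceaccount.com"
    )


def can_publish(policy: dict, project_number: str) -> bool:
    """Report whether the policy binds the publisher role to the service account."""
    member = notification_service_account(project_number)
    for binding in policy.get("bindings", []):
        if binding.get("role") == PUBLISHER_ROLE and member in binding.get("members", []):
            return True
    return False


if __name__ == "__main__":
    # Mirrors the sample topic policy, with a placeholder project number.
    sample_policy = {
        "bindings": [
            {
                "members": [notification_service_account("123456")],
                "role": PUBLISHER_ROLE,
            }
        ],
        "version": 1,
    }
    print(can_publish(sample_policy, "123456"))
```

Run the check against the topic-level policy as well as the project-level policy, because a grant at either level authorizes the service account.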
Notifications for uptime-check alerting policies aren't received
You want to be notified if a virtual machine (VM) reboots or shuts down, so you create an alerting policy that monitors the metric compute.googleapis.com/instance/uptime. You create and configure the condition to generate an incident when there is no metric data. You don't define the condition by using Monitoring Query Language (MQL)¹.
You aren't notified when the VM reboots or shuts down.
This alerting policy only monitors time series for Compute Engine VM instances that are in the RUNNING state. Time series for VMs that are in any other state, such as STOPPED or DELETED, are filtered out before the condition is evaluated. Because of this behavior, you can't use an alerting policy with a metric-absence alerting condition to determine if a VM instance is running. For information on VM instance states, see VM instance life cycle.
To resolve this problem, create an alerting policy to monitor an uptime check. For private endpoints, use private uptime checks.
A possible alternative to alerting on uptime checks is to use alerting policies that monitor the absence of data. We strongly recommend alerting on uptime checks instead of absence of data: metric-absence alerting policies can generate false positives if there are transient issues with the availability of Monitoring data.
However, if you can't use uptime checks, then you can create an alerting policy with MQL that notifies you when a VM has been shut down. MQL-defined conditions don't pre-filter time-series data based on the state of the VM instance, so you can use them to detect the absence of data from VMs that have been shut down.
Consider the following MQL condition, which monitors the compute.googleapis.com/instance/cpu/utilization metric:
fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| absent_for 3m
If a VM monitored by this condition is shut down, then three minutes later, an incident is generated and notifications are sent. The absent_for value must be at least three minutes.
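If you generate MQL conditions like the one above from a script, you can enforce the three-minute minimum before submitting the query. The following is a minimal sketch: the function name `absence_query` and its parameters are hypothetical, and the function only formats text without contacting Cloud Monitoring.

```python
MIN_ABSENCE_MINUTES = 3


def absence_query(metric: str, minutes: int = 3, resource: str = "gce_instance") -> str:
    """Format an MQL metric-absence condition for the given metric."""
    if minutes < MIN_ABSENCE_MINUTES:
        # Mirrors the stated minimum for the absent_for value.
        raise ValueError(
            f"absent_for must be at least {MIN_ABSENCE_MINUTES} minutes, got {minutes}"
        )
    return f"fetch {resource}::{metric}\n| absent_for {minutes}m"


if __name__ == "__main__":
    print(absence_query("compute.googleapis.com/instance/cpu/utilization"))
```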
For more information about MQL, see Alerting policies with MQL.
1: MQL is an expressive text-based language that can be used with Cloud Monitoring API calls and in the Google Cloud console. To configure a condition with MQL when you use the Google Cloud console, you must use the code editor.
Notifications for request-count alerting policies aren't received
You want to monitor the number of completed requests. You created an alerting policy that monitors the metric serviceruntime.googleapis.com/api/request_count, but you aren't notified when the number of requests exceeds the threshold you configured.
The maximum alignment period for the request count metric is 7 hours 30 minutes.
To resolve this issue, check the value of the alignment period in your alerting policy. If the value is longer than the maximum for this metric, reduce the alignment period so that it is no more than 7 hours 30 minutes.
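The limit of 7 hours 30 minutes corresponds to 27,000 seconds. The following sketch checks an alignment period against that limit. It assumes the period is written as a duration string in seconds, such as "36000s". The helper names `parse_duration` and `clamp_alignment_period` are hypothetical.

```python
from datetime import timedelta

# Maximum alignment period for the request count metric: 7 hours 30 minutes.
MAX_ALIGNMENT = timedelta(hours=7, minutes=30)


def parse_duration(value: str) -> timedelta:
    """Parse a duration string expressed in seconds, such as "300s"."""
    if not value.endswith("s"):
        raise ValueError(f"expected a value like '300s', got {value!r}")
    return timedelta(seconds=float(value[:-1]))


def clamp_alignment_period(value: str) -> str:
    """Return the period unchanged, or the maximum if the period exceeds it."""
    period = min(parse_duration(value), MAX_ALIGNMENT)
    # Fractional seconds are truncated in this sketch.
    return f"{int(period.total_seconds())}s"


if __name__ == "__main__":
    print(clamp_alignment_period("36000s"))  # A 10-hour period exceeds the limit.
```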