Personalized Service Health overview

This document provides an overview of Personalized Service Health, which lets you identify Google Cloud service disruptions relevant to your projects so you can manage and respond to them efficiently. These disruptions are called service health events. They are available in the Google Cloud console and through several integration points.

How Personalized Service Health works

The following diagram shows how Personalized Service Health makes service health events available.

[Diagram: Personalized Service Health]

You can access service health events with the following:

  • Service Health dashboard: Track emerging and active Google Cloud incidents relevant to your projects.
  • Service Health API: Pull service health event information per project or organization.
  • Alerts: Get notified of events relevant to your projects. Alerts are based on logs in Cloud Logging.
  • Logs: Export logs related to Google Cloud events.

View active and past Google Cloud incidents in the Service Health dashboard

The Service Health dashboard in the Google Cloud console shows incidents that are relevant to your project, their state, and the impacted Google Cloud products and locations.

See the quickstart to learn how to access the Service Health dashboard.

View Google Cloud incidents and receive alerts on a mobile device

The Service Health dashboard is also available on a mobile device.

To receive alerts on your mobile device, configure an alerting policy that sends notifications to it.

Request service health events using the Service Health API

The Service Health API lets you list service health events that might be impacting, or have impacted, your project. You can also get event details such as updates, start and end times, impacted Google Cloud products and locations, and state.

See the Service Health API reference for more information.
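
As a rough illustration of pulling events per project or per organization, the following sketch builds the REST URLs you might request. The host name, the resource paths, the global location, and the pageSize parameter are assumptions about the v1 REST layout and are not stated in this document; check them against the Service Health API reference. The helper names are hypothetical, and no request is sent.

```python
from urllib.parse import urlencode

# Assumed base URL for the Service Health API (v1). Verify before use.
BASE_URL = "https://servicehealth.googleapis.com/v1"


def project_events_url(project_id, page_size=None):
    """Build an (assumed) URL for listing events relevant to one project."""
    url = f"{BASE_URL}/projects/{project_id}/locations/global/events"
    if page_size is not None:
        # pageSize is assumed to follow the usual list-method convention.
        url += "?" + urlencode({"pageSize": page_size})
    return url


def organization_events_url(organization_id):
    """Build an (assumed) URL for listing events across an organization."""
    return (
        f"{BASE_URL}/organizations/{organization_id}"
        "/locations/global/organizationEvents"
    )
```

An authenticated GET against one of these URLs would then be expected to return the events; authentication and response parsing are left out of this sketch.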

Configure alerts or export logs through Cloud Logging

Personalized Service Health logs service health events in Cloud Logging, and lets you set up alerts based on these logs. You can set up alerts for conditions such as when new incidents are reported, when existing incidents are updated, or when incidents for specific Google Cloud products or locations are created or updated.

See the quickstart for setting up an alert in the Service Health dashboard.
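
Because alerts are based on logs, an alert condition comes down to a log filter. The following sketch composes such a filter from field and value pairs. The helper names are hypothetical, and the field names and values in the example (resource.type, jsonPayload.category, jsonPayload.relevance) are assumptions about the log schema rather than facts from this document; confirm them against your own log entries before relying on them.

```python
def any_of(field, values):
    """Return a clause that matches `field` against any of `values`."""
    quoted = " OR ".join(f'"{v}"' for v in values)
    if len(values) > 1:
        # Group alternatives in parentheses: field = ("A" OR "B")
        return f"{field} = ({quoted})"
    return f"{field} = {quoted}"


def build_filter(conditions):
    """Join one clause per field with AND, skipping fields with no values."""
    clauses = [any_of(f, v) for f, v in conditions.items() if v]
    return " AND ".join(clauses)


# Example with assumed field names and values (verify against real logs).
example_filter = build_filter({
    "resource.type": ["servicehealth.googleapis.com/Event"],
    "jsonPayload.category": ["INCIDENT"],
    "jsonPayload.relevance": ["IMPACTED", "RELATED"],
})
```

Keeping the field names out of the helpers means only the example dictionary needs to change if the actual log schema differs.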

Concepts

Personalized Service Health uses the following concepts to describe events affecting your project and how these events relate to it.

Service health event

A service health event (v1, v1beta) is any disruptive event impacting a Google Cloud product that is relevant to your projects or resources. Examples include network outages, configuration errors, and performance issues.

Each event contains details about the overall impact of the event, updates from Google, and information specific to your Google Cloud project.

Incident

An incident is an emerging or active Google Cloud service outage or degradation relevant to your projects. Incidents are a category of service health event.

An incident includes the following:

  • Incident impact: Details of the scope of the event, such as impacted Google Cloud products and locations.
  • Updates from Google Cloud: Periodic updates from Google Cloud support.
  • Personalized relevance: Incident's relevance to your Google Cloud project.
  • Symptoms, workarounds, and ETAs: Information to help assess impact, apply a workaround, or learn more about the root cause.

Event states and detailed states

An event has two fields that indicate its state. The values of these fields change as the event evolves.

  • Event state: Indicates the overall state of the event. It can be one of the following:

    • Active: Event is actively affecting Google Cloud and will continue to receive updates.
    • Closed: Event is no longer affecting any Google Cloud product, or has been merged with another event.
  • Detailed state: Provides more information on the state of the event. It applies to incidents only, and can be one of the following values depending on the event state:

    • Emerging: Google engineers are actively investigating the incident to determine the impact. An emerging incident will become either a confirmed or resolved incident once the impact assessment is complete. An active incident can be an emerging incident.

      Support for emerging incidents is available for Google Cloud networking products only.

    • Confirmed: The incident is confirmed by Google engineers and impacting at least one Google Cloud product. Ongoing status updates will be provided until it is resolved.

      An active incident can be a confirmed incident.

    • Merged: The incident was merged into a parent incident. All further updates will be published to the parent only.

    • Resolved: The incident is no longer affecting any Google Cloud product after action was taken. There will be no further updates.

      A closed incident is usually a resolved incident.

    • False positive: Upon investigation, Google engineers concluded that the incident is not affecting a Google Cloud product. This state can change if the incident is reviewed again.

    • Auto-closed: The incident was automatically closed for one of the following reasons:

      • The impact of the incident could not be confirmed.
      • The incident was intermittent or resolved itself.

      The incident does not have a resolution because no action or investigation happened. If it is intermittent, the incident may reopen.
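
The relationship between the two state fields can be sketched as a small lookup table. This is an illustrative model, not the API's own types: the enum and function names are hypothetical, and pairing False positive and Auto-closed with the Closed state is an assumption, since the text above only ties Emerging and Confirmed to Active, and Resolved and Merged to Closed.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "Active"
    CLOSED = "Closed"


class DetailedState(Enum):
    EMERGING = "Emerging"
    CONFIRMED = "Confirmed"
    MERGED = "Merged"
    RESOLVED = "Resolved"
    FALSE_POSITIVE = "False positive"
    AUTO_CLOSED = "Auto-closed"


# Detailed states expected under each event state.
# FALSE_POSITIVE and AUTO_CLOSED under CLOSED is an assumption.
ALLOWED = {
    State.ACTIVE: {DetailedState.EMERGING, DetailedState.CONFIRMED},
    State.CLOSED: {
        DetailedState.MERGED,
        DetailedState.RESOLVED,
        DetailedState.FALSE_POSITIVE,
        DetailedState.AUTO_CLOSED,
    },
}


def is_consistent(state, detailed_state):
    """Return True if the detailed state is expected under the event state."""
    return detailed_state in ALLOWED[state]


def expects_updates(detailed_state):
    """Return True for detailed states that belong to an active incident."""
    return detailed_state in ALLOWED[State.ACTIVE]
```

A check like is_consistent(State.ACTIVE, DetailedState.RESOLVED) returns False under this model, which can help flag unexpected combinations when you process events.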

Relevance

Personalized Service Health assesses the impact of all incidents on your project. If an incident's impact on your project is possible or confirmed, the incident becomes available in the Service Health dashboard and API.

Relevance describes how an incident impacts your project. The relevance may change as the incident progresses.

Relevance can have the following values:

  • Impacted: The incident is verified to be impacting your project. Available for some Google Cloud products only.
  • Related: The incident has a direct connection with your project and impacts a Google Cloud product in a location your project uses.
  • Partially Related: The incident is associated with a Google Cloud product your project uses, but the incident may not be impacting your project. For example, the incident may be impacting a Google Cloud product that your project uses, but in a location that your project does not use.
  • Not Impacted: The incident is not impacting your project.
  • Unknown: The impact to your project is not known at this point.
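
One way to act on relevance is to map each value to a response level. The following is a minimal sketch of such a mapping. The enum member names, the triage function, and the response levels ("page", "notify", "watch", "ignore") are all hypothetical choices for illustration; they are not part of Personalized Service Health, and your own policy may differ.

```python
from enum import Enum


class Relevance(Enum):
    IMPACTED = "Impacted"
    RELATED = "Related"
    PARTIALLY_RELATED = "Partially Related"
    NOT_IMPACTED = "Not Impacted"
    UNKNOWN = "Unknown"


# Hypothetical triage policy: stronger relevance gets a stronger response.
TRIAGE = {
    Relevance.IMPACTED: "page",
    Relevance.RELATED: "notify",
    Relevance.PARTIALLY_RELATED: "watch",
    Relevance.UNKNOWN: "watch",
    Relevance.NOT_IMPACTED: "ignore",
}


def triage(relevance):
    """Return the illustrative response level for a relevance value."""
    return TRIAGE[relevance]
```

Because relevance may change as an incident progresses, a policy like this would need to be re-evaluated on every update rather than only when the incident first appears.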