Personalized Service Health concepts

This document describes the key concepts and terms used within Personalized Service Health (PSH). Understanding these concepts helps you effectively interpret events and configure alerts.

Service health event

A service health event (v1, v1beta) is any disruptive event impacting a Google Cloud product that is relevant to your projects or resources. Examples include network outages, configuration errors, and performance issues.

Each event contains details about the overall impact of the event, updates from Google, and information specific to your Google Cloud project.

Incident

An incident is an emerging or active Google Cloud service outage or degradation that is relevant to your projects. It is one category of service health event.

An incident includes the following:

  • Incident impact: Details of the scope of the event, such as impacted Google Cloud products and locations.
  • Updates from Google Cloud: Periodic updates from Google Cloud support.
  • Personalized relevance: Incident's relevance to your Google Cloud project.
  • Symptoms, workarounds, and ETAs: Information to help assess impact, apply a workaround, or learn more about the root cause.

An incident may have an incident report, which describes the factors that contributed to the incident and the steps Google Cloud plans to take to prevent similar incidents from recurring. Incident reports are available for incidents that meet the following conditions:

  • The incident has global impact or is affecting a significant percentage of customer projects across one or more regions.
  • One or more products are unavailable or severely degraded.

Event states and detailed states

An event has two fields that indicate its state. The values of these fields change as the event evolves.

  • Event state: Indicates the overall state of the event. It can be one of the following:

    • Active: Event is actively affecting Google Cloud and will continue to receive updates.
    • Closed: Event is no longer affecting any Google Cloud product, or has been merged with another event.
  • Detailed state: Provides more information on the state of the event. It applies to incidents only, and can be one of the following values depending on the event state:

    • Emerging: Google engineers are actively investigating the incident to determine the impact. An emerging incident will become either a confirmed or resolved incident once the impact assessment is complete. An active incident can be an emerging incident.

      Support for emerging incidents is available for Google Cloud networking products only.

    • Confirmed: The incident is confirmed by Google engineers and impacting at least one Google Cloud product. Ongoing status updates will be provided until it is resolved.

      An active incident can be a confirmed incident.

    • Merged: The incident was merged into a parent incident. All further updates will be published to the parent only.

    • Resolved: The incident is no longer affecting any Google Cloud product after action was taken. There will be no further updates.

      A closed incident is usually a resolved incident.

    • False positive: Upon investigation, Google engineers concluded that the incident is not affecting a Google Cloud product. This state can change if the incident is reviewed again.

    • Auto-closed: The incident was automatically closed for one of the following reasons:

      • The impact of the incident could not be confirmed.
      • The incident was intermittent or resolved itself.

      The incident does not have a resolution because no action or investigation happened. If it is intermittent, the incident may reopen.
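
The relationship between the two state fields can be sketched in Python as follows. This is a minimal illustration, not the Service Health API: the enum classes, the ALLOWED_DETAILED_STATES mapping, and the helper functions are hypothetical names introduced here. Placing Emerging and Confirmed under Active, and Merged and Resolved under Closed, follows the descriptions above. Placing False positive and Auto-closed under Closed is an assumption.

```python
from enum import Enum


class EventState(Enum):
    """Overall state of an event (hypothetical model of the list above)."""
    ACTIVE = "Active"
    CLOSED = "Closed"


class DetailedState(Enum):
    """Detailed state, which applies to incidents only."""
    EMERGING = "Emerging"
    CONFIRMED = "Confirmed"
    MERGED = "Merged"
    RESOLVED = "Resolved"
    FALSE_POSITIVE = "False positive"
    AUTO_CLOSED = "Auto-closed"


# Assumed pairing of detailed states with event states.
# FALSE_POSITIVE and AUTO_CLOSED under CLOSED is an assumption,
# not something the text above states explicitly.
ALLOWED_DETAILED_STATES = {
    EventState.ACTIVE: {DetailedState.EMERGING, DetailedState.CONFIRMED},
    EventState.CLOSED: {
        DetailedState.MERGED,
        DetailedState.RESOLVED,
        DetailedState.FALSE_POSITIVE,
        DetailedState.AUTO_CLOSED,
    },
}


def is_consistent(state: EventState, detailed: DetailedState) -> bool:
    """Return True if the detailed state fits the event state in this model."""
    return detailed in ALLOWED_DETAILED_STATES[state]


def expects_more_updates(state: EventState) -> bool:
    """Active events continue to receive updates; closed events do not."""
    return state is EventState.ACTIVE
```

For example, is_consistent(EventState.ACTIVE, DetailedState.CONFIRMED) holds in this model, while pairing EventState.CLOSED with DetailedState.EMERGING does not.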

Relevance

Personalized Service Health assesses the impact of every incident on your project. If an incident's impact on your project is possible or confirmed, the incident becomes available in the Service Health dashboard and API.

Relevance describes how an incident impacts your project. The relevance may change as the incident progresses.

Relevance can have the following values:

  • Impacted: The incident is verified to be impacting your project. Available for some Google Cloud products only.
  • Related: The incident has a direct connection with your project and impacts a Google Cloud product in a location your project uses.
  • Partially Related: The incident is associated with a Google Cloud product your project uses, but the incident may not be impacting your project. For example, the incident may be impacting a Google Cloud product that your project uses, but in a location that your project does not use.
  • Not Impacted: The incident is not impacting your project.
  • Unknown: The impact to your project is not known at this point.
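
One way to use these values when configuring alerts is to decide which relevance levels should notify someone. The sketch below is a hypothetical policy, not part of Personalized Service Health: the Relevance enum, the ALERT_ON set, the should_alert helper, and the sample incident names are all assumptions made for illustration. It alerts only on Impacted and Related incidents.

```python
from enum import Enum


class Relevance(Enum):
    """Relevance values as listed above (hypothetical Python model)."""
    IMPACTED = "Impacted"
    RELATED = "Related"
    PARTIALLY_RELATED = "Partially Related"
    NOT_IMPACTED = "Not Impacted"
    UNKNOWN = "Unknown"


# Assumed alerting policy: notify only when the incident is verified to
# impact the project or affects a product in a location the project uses.
ALERT_ON = {Relevance.IMPACTED, Relevance.RELATED}


def should_alert(relevance: Relevance) -> bool:
    """Return True if this relevance level should trigger an alert."""
    return relevance in ALERT_ON


# Made-up sample data: (incident name, relevance to the project).
incidents = [
    ("incident-a", Relevance.IMPACTED),
    ("incident-b", Relevance.PARTIALLY_RELATED),
    ("incident-c", Relevance.RELATED),
    ("incident-d", Relevance.UNKNOWN),
]

# Keep only the incidents that match the alerting policy.
to_alert = [name for name, relevance in incidents if should_alert(relevance)]
```

Because relevance may change as an incident progresses, a policy like this would need to be re-evaluated whenever an incident is updated.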