Get timely networking health updates with Personalized Service Health emerging incidents
Sudhanshu Jain
Product Manager, Google Cloud Networking
Percy Wadia
Group Product Manager, Google Cloud Networking
Organizations often use a variety of third-party tools to monitor the health and performance of their applications. But in the event of a service degradation, the source of the issue isn’t always clear — is it a disruption with your cloud provider, or a problem in your application environment? Recently, we announced the general availability of Personalized Service Health, which includes a new capability, emerging incidents, that provides speedy notification of Cloud Networking incidents to customers.
Emerging incidents are machine-driven alerts that are communicated simultaneously to you and internal Google SRE teams, significantly reducing the time to the first meaningful post about an incident. You are notified as Google Cloud incidents occur, even while our teams are still investigating the issues and assessing their impact. You can start receiving emerging incident notifications by enabling Personalized Service Health on supported Cloud Networking products, and you can also set up alerts on them.
Emerging incident communications are sent in real time and personalized to your project, helping you address disruptions to operations and implement measures to mitigate the impact to your business. Of course, Personalized Service Health also sends timely updates on active incidents, making it a go-to resource for all incident information. The sooner you take action, the more you can reduce your mean time to resolution (MTTR) and improve application reliability. These real-time and personalized communications are shown below.
The health of Google Cloud networking products is continuously monitored using various probes. If the system detects a degradation or service interruption, it automatically generates internal alerts that are communicated to customers based on assessed impact, and sent out through all Personalized Service Health channels, including the dashboard, logs, alerts, and APIs.
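As a rough illustration of consuming these events from the logs or API channel, the sketch below filters a list of event records down to active emerging incidents. The field names (`name`, `detailedCategory`, `state`) and their values are assumptions made for this example, and the sample records are invented; check the Personalized Service Health API reference for the actual event schema.

```python
from typing import Iterable


def active_emerging_incidents(events: Iterable[dict]) -> list[dict]:
    """Return only the events that are emerging incidents and still active.

    The keys used here are assumed for illustration, not a confirmed schema.
    """
    return [
        event
        for event in events
        if event.get("detailedCategory") == "EMERGING_INCIDENT"
        and event.get("state") == "ACTIVE"
    ]


# Invented sample records standing in for events fetched from a channel.
sample_events = [
    {"name": "event-1", "detailedCategory": "EMERGING_INCIDENT", "state": "ACTIVE"},
    {"name": "event-2", "detailedCategory": "CONFIRMED_INCIDENT", "state": "ACTIVE"},
    {"name": "event-3", "detailedCategory": "EMERGING_INCIDENT", "state": "CLOSED"},
]

emerging = active_emerging_incidents(sample_events)
for event in emerging:
    print(f"Active emerging incident: {event['name']}")
```

A filter like this could sit in front of a paging or ticketing hook so that only early, still-open alerts trigger a first response.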
Subsequently, if an event is confirmed, the emerging incident is closed out and linked to a confirmed incident. Alternatively, if the event was short-lived, e.g., a network re-route mitigated the impact to the customer, it may be closed before a confirmed event is even generated. These early alerts provide customers with clear information on the root of the issue. Now, as long as incidents are active, customers receive timely updates about both emerging and confirmed incidents from Google Cloud’s incident response process.
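The two outcomes described above can be sketched as a small state model. This is a minimal illustration of the lifecycle only, not how Personalized Service Health is implemented; the class, method, and identifier names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    ACTIVE = "active"
    CLOSED = "closed"


@dataclass
class EmergingIncident:
    """Hypothetical model of an emerging incident's lifecycle."""

    incident_id: str
    state: State = State.ACTIVE
    confirmed_incident_id: Optional[str] = None

    def _require_active(self) -> None:
        if self.state is not State.ACTIVE:
            raise ValueError(f"{self.incident_id} is already closed")

    def link_to_confirmed(self, confirmed_incident_id: str) -> None:
        # Outcome 1: the event is confirmed, so the emerging incident is
        # closed out and linked to the confirmed incident.
        self._require_active()
        self.state = State.CLOSED
        self.confirmed_incident_id = confirmed_incident_id

    def close_without_confirmation(self) -> None:
        # Outcome 2: the event was short-lived (for example, a network
        # re-route mitigated it), so it closes with no confirmed incident.
        self._require_active()
        self.state = State.CLOSED


confirmed_case = EmergingIncident("emerging-1")
confirmed_case.link_to_confirmed("confirmed-9")

short_lived_case = EmergingIncident("emerging-2")
short_lived_case.close_without_confirmation()
```

In both cases the emerging incident ends up closed; the difference a consumer would look for is whether a link to a confirmed incident is present.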
Get started today
Emerging incidents for supported products are available by default for any projects that use them, as long as Personalized Service Health is enabled. You can learn more about managing emerging incidents here. Alert policies for emerging incidents can be configured from within the Service Health dashboard.
For more information, follow the Personalized Service Health documentation and getting started guide. To get started, enable Personalized Service Health for a project or across your organization.