Manage host maintenance events for X4 instances running SAP HANA

This document describes how you can manage and monitor planned host maintenance events for your Compute Engine X4 instances that run SAP HANA workloads.

X4 is a specialized series of Compute Engine bare metal machine types that is designed to run multi-terabyte SAP HANA workloads. X4 maintenance is required for regular software and firmware updates. This maintenance ensures optimal, secure, and reliable performance of your X4 instances.

Unlike other Compute Engine machine types, X4 doesn't support live migration during maintenance events. This means that, for planned host maintenance events, Google must stop your X4 instances to update them, which makes planning for these events crucial for your SAP HANA workloads.

Maintenance lifecycle

A planned host maintenance event begins with a 60-day advance notification to you. Within this window, you can trigger the host maintenance event. If you don't trigger the host maintenance event before its planned start date and time, then Google automatically triggers it on the planned start date, at the planned start time or within a few minutes of it.

A planned host maintenance event typically lasts up to 4 hours, during which the running instance on the host is stopped and restarted.

Planned host maintenance events for X4 instances occur at least 90 days apart. After a planned maintenance event completes, the next planned maintenance event occurs no sooner than 90 days later. However, unscheduled maintenance might still occur, depending on the criticality of the underlying issue.
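
The timing rules above reduce to simple date arithmetic. The following Python sketch is illustrative only: the function and constant names are hypothetical, the completion timestamp is made up, and the earliest notification date is derived here by subtracting the 60-day notice from the 90-day minimum gap rather than read from any Google API.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the text above; the names are illustrative, not an API.
NOTICE_PERIOD = timedelta(days=60)  # advance notification window
MIN_GAP = timedelta(days=90)        # minimum gap between planned events


def planned_start(notified_at: datetime) -> datetime:
    """When Google triggers the event if you don't trigger it first."""
    return notified_at + NOTICE_PERIOD


def earliest_next_maintenance(completed_at: datetime) -> datetime:
    """Earliest start of the next planned event after one completes."""
    return completed_at + MIN_GAP


def earliest_next_notification(completed_at: datetime) -> datetime:
    """Earliest notification for the next event: its start minus the notice."""
    return earliest_next_maintenance(completed_at) - NOTICE_PERIOD


completed = datetime(2025, 3, 1, 12, 0, tzinfo=timezone.utc)  # hypothetical
print(earliest_next_maintenance(completed))   # 90 days after completion
print(earliest_next_notification(completed))  # 30 days after completion
```

Under these assumptions, the earliest possible notification for the next event falls 30 days after the previous event completes, which is a useful lower bound when you plan SAP HANA downtime windows.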

The following sections describe the two ways in which a planned host maintenance event is triggered: manually by you or automatically by Google.

Maintenance event manually triggered by you

The following steps show an example sequence of actions that occur in the scenario where you trigger a planned host maintenance event:

  1. On day 0, you deploy an X4 instance.
  2. On day 36, Google notifies you about a planned host maintenance event.

    If you look up your X4 instance's description by using the gcloud compute instances describe command, then you see "maintenanceStatus": "PENDING" in its response.

    Between day 36 and day 96 (the 60-day advance notification period), you have the option to trigger the host maintenance event.

  3. On day 80, you trigger the host maintenance event for your X4 instance. For example, you run the gcloud compute instances perform-maintenance command.

    Google powers down your X4 instance for maintenance, which typically lasts up to 4 hours.

    You can use the gcloud compute instances describe command to see that the instance's maintenanceStatus field is set to ONGOING.

  4. After the maintenance activities are complete, Google restarts your X4 instance.

The earliest that you can receive the notification about the next planned host maintenance event is 30 days after this maintenance event completes. In this example, Google sends you the notification about the next planned host maintenance event on day 120.

The following diagram illustrates the preceding set of steps:

Diagram showing sequence of actions in a customer-triggered planned host maintenance event
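
To check the status values mentioned in the preceding steps from a script, you could request JSON output from the gcloud compute instances describe command (for example, with --format=json) and parse it. The following Python sketch assumes that the maintenanceStatus field is nested under resourceStatus.upcomingMaintenance; confirm that nesting against your own output. The function name and the sample payload are hypothetical.

```python
import json
from typing import Optional


def maintenance_status(describe_json: str) -> Optional[str]:
    """Return "PENDING", "ONGOING", or None when no event is reported.

    Assumes the status sits under resourceStatus.upcomingMaintenance,
    which should be confirmed against your own describe output.
    """
    instance = json.loads(describe_json)
    resource_status = instance.get("resourceStatus") or {}
    upcoming = resource_status.get("upcomingMaintenance")
    if not upcoming:
        return None
    return upcoming.get("maintenanceStatus")


# Hypothetical, heavily trimmed describe output for illustration only.
sample = (
    '{"name": "x4-hana-1", "resourceStatus": '
    '{"upcomingMaintenance": {"maintenanceStatus": "PENDING"}}}'
)
print(maintenance_status(sample))  # PENDING
```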

Maintenance event automatically triggered by Google

The following steps show an example sequence of actions that occur in the scenario where Google triggers a planned host maintenance event. Google triggers the event on its planned start date only if you don't trigger it during the 60-day advance notification period, which runs from the day you're notified about the event to the event's planned start date.

  1. On day 0, you deploy an X4 instance.
  2. On day 45, Google notifies you about a planned host maintenance event.

    If you look up your X4 instance's description by using the gcloud compute instances describe command, then you see "maintenanceStatus": "PENDING" in its response.

  3. On day 105, which is the host maintenance event's planned start date, Google triggers the host maintenance event. Maintenance typically starts at the planned start time or within a few minutes of it.

    If you look up your X4 instance's description by using the gcloud compute instances describe command, then you see "maintenanceStatus": "ONGOING" in its response.

  4. Google powers down your X4 instance for maintenance, which typically lasts up to 4 hours.

  5. After the maintenance activities are successfully completed, Google restarts your X4 instance.

The earliest that you can receive the notification about the next planned host maintenance event is 30 days after this maintenance event completes. In this example, Google sends you the notification about the next planned host maintenance event on day 150.

The following diagram illustrates the preceding set of steps:

Diagram showing sequence of actions in a Google-triggered planned host maintenance event
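
The two scenarios differ only in who starts the event and on which day. As a minimal sketch, the following Python function reproduces the day numbers from both examples. The function name is hypothetical, and day numbers stand in for real timestamps.

```python
from typing import Optional, Tuple

NOTICE_DAYS = 60  # advance notification period described above


def resolve_trigger(notified_day: int,
                    customer_trigger_day: Optional[int] = None) -> Tuple[int, str]:
    """Return (day maintenance starts, who triggered it)."""
    planned_start_day = notified_day + NOTICE_DAYS
    if customer_trigger_day is not None:
        if customer_trigger_day < notified_day:
            raise ValueError("cannot trigger before the notification arrives")
        if customer_trigger_day < planned_start_day:
            return customer_trigger_day, "customer"
    # No customer trigger inside the window: Google starts the event itself.
    return planned_start_day, "google"


print(resolve_trigger(36, 80))  # (80, 'customer')
print(resolve_trigger(45))      # (105, 'google')
```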

View information about a maintenance event

For each planned host maintenance event, Google sends you a 60-day advance notification. All planned host maintenance events for X4 are classified as scheduled maintenance.

To view information about a planned host maintenance event, you can do the following:

  • Query your X4 instance by using the Google Cloud CLI
  • Query your X4 instance by using the REST API
  • Query your X4 instance's metadata server
  • Check the logs in Cloud Logging

For information about how to perform these actions, including the required IAM roles and permissions, see Monitor and plan for a host maintenance event.
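
As an illustration of the metadata server option, the following Python sketch builds and sends a request from inside the instance. The Metadata-Flavor: Google header is required for metadata server requests. The upcoming-maintenance path shown here is an assumption; confirm the exact path and response format in the linked document. The function names are hypothetical.

```python
import urllib.request

# Assumed metadata path; confirm it in the linked monitoring document.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/upcoming-maintenance"
)


def build_metadata_request(url: str = METADATA_URL) -> urllib.request.Request:
    """Build the request; the Metadata-Flavor header is mandatory."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def fetch_upcoming_maintenance(timeout: float = 5.0) -> str:
    """Return the raw response body. Works only from inside the instance."""
    request = build_metadata_request()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Call fetch_upcoming_maintenance() only on the X4 instance itself, because the metadata server isn't reachable from outside the instance.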

Simulate a maintenance event

To observe the end-to-end process of a planned host maintenance event, or to test any integration or automation that you might have implemented, you can simulate a host maintenance event for your X4 instance by using the gcloud CLI or REST API.

For information about how to simulate a planned maintenance event, see Simulate host maintenance for compute instances that terminate.

Trigger a maintenance event

You can trigger a planned host maintenance event at any time before the 60-day advance notification period concludes. You can do this by using the gcloud CLI or REST API.

Don't try to trigger a host maintenance event by using the gcloud CLI commands or REST API methods that stop and start Compute Engine instances.

For information about how to trigger a planned host maintenance event, or how to check its status, see Manually start a host maintenance event.
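
If you automate the trigger, a thin wrapper around the gcloud compute instances perform-maintenance command is one option. The following Python sketch assumes that the gcloud CLI is installed and authenticated; the instance name and zone are hypothetical placeholders, and the helper names are illustrative.

```python
import subprocess
from typing import List, Optional


def perform_maintenance_cmd(instance: str, zone: str,
                            project: Optional[str] = None) -> List[str]:
    """Build the gcloud command that starts the pending maintenance event."""
    cmd = ["gcloud", "compute", "instances", "perform-maintenance",
           instance, f"--zone={zone}"]
    if project:
        cmd.append(f"--project={project}")
    return cmd


def trigger_maintenance(instance: str, zone: str,
                        project: Optional[str] = None) -> None:
    """Run the command; raises CalledProcessError if gcloud exits non-zero."""
    subprocess.run(perform_maintenance_cmd(instance, zone, project), check=True)


# Hypothetical instance name and zone, printed rather than executed.
print(" ".join(perform_maintenance_cmd("x4-hana-1", "us-central1-a")))
```

Keeping command construction separate from execution lets you review or log the exact command before it powers down a production SAP HANA host.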

Verify the completion of a maintenance event

To verify the successful completion of a planned host maintenance event for your Compute Engine X4 instance, you can do the following:

  • Query your instance by using the gcloud CLI or REST API. The response won't include the upcomingMaintenance field.

    For information about how to query your instance, see Check instances for a maintenance event notification.

  • In Cloud Logging, check the logs for your instance. You see a log message similar to the following:

    Maintenance window is completed for this instance. All maintenance notifications on the instance has been removed.

    For information about how to search the logs for your instance, see Check Cloud Logging for a maintenance event notification.
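
The first check can be automated by polling until the upcomingMaintenance field disappears. The following Python sketch is one way to do that under stated assumptions: fetch is a hypothetical callable that you supply and that returns the parsed instance description as a dictionary, and the field is assumed to sit either at the top level or under resourceStatus. The polling interval and limit are arbitrary defaults.

```python
import time
from typing import Callable, Dict


def maintenance_finished(instance: Dict) -> bool:
    """True when the instance no longer reports an upcomingMaintenance field."""
    resource_status = instance.get("resourceStatus") or {}
    return ("upcomingMaintenance" not in resource_status
            and "upcomingMaintenance" not in instance)


def wait_for_completion(fetch: Callable[[], Dict],
                        interval_s: float = 300.0,
                        max_polls: int = 60,
                        sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll fetch() until the field disappears; False if polls run out."""
    for attempt in range(max_polls):
        if maintenance_finished(fetch()):
            return True
        if attempt < max_polls - 1:
            sleep(interval_s)
    return False
```

Because the field is also absent before any notification arrives, this check is meaningful only after you have seen the event in the PENDING or ONGOING state.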

Monitor maintenance events

Setting up monitoring for planned host maintenance events on your Compute Engine X4 instances helps keep your team informed about both ongoing and upcoming events.

Because each maintenance event sends multiple messages to Cloud Logging, you can set up a log-based alerting policy to search for specific maintenance event notifications and send alerts by using a notification channel.

For information about how to configure alerts for planned host maintenance events, see Configure alerts for host maintenance notifications.