Compute Engine does regular maintenance of its infrastructure. This page describes the types and approximate frequencies of these maintenance events, and how you can set instance availability options to configure the behavior of VM instances when these maintenance events occur. This page also describes how to set an instance to live migrate when a maintenance event occurs.
Before you begin
- If you want to use the command-line examples in this guide:
  - Install or update to the latest version of the gcloud command-line tool.
  - Set a default region and zone.
- If you want to use the API examples in this guide, set up API access.
Maintenance events
Compute Engine maintenance events entail hardware and software updates. Some of these events require Google to move your VM away from the host that is undergoing maintenance, and Compute Engine automatically manages the scheduling behavior of these instances. If you configured the instance's availability policy to use live migration, Compute Engine live migrates your VM instances, which prevents your applications from experiencing disruptions during these events. Alternatively, you can choose to stop your instances during these events rather than live migrating them.
The following table groups Compute Engine maintenance events into two categories, gives examples of each, and indicates which category requires live migration of your VM to a different host.
Maintenance event type | Examples | Approximate frequency * | Requires live migration to new host |
---|---|---|---|
Host maintenance | Host kernel upgrade, hardware repair or upgrade | Once every two weeks | Yes |
Lightweight | Hypervisor-level upgrade, networking stack upgrade | 1-2 times per week | No |
* Note that these frequencies are approximations, not guarantees. Compute Engine may occasionally perform maintenance more frequently than mentioned here.
Choosing availability policies
A VM instance's availability policy determines how it behaves when there is a maintenance event where Google must move your VM instance to another host machine. You can configure your VM instances to continue running while Compute Engine live migrates them to another host or you can choose to stop your instances instead. You can update an instance's availability policy at any time to control how you want your VM instances to behave.
You can change an instance's availability policy by configuring the following two settings:
- The VM instance's maintenance behavior, which determines whether the instance is live migrated or stopped when there is a maintenance event.
- The instance's restart behavior, which determines whether the instance automatically restarts if it crashes or gets stopped.
The default maintenance behavior for instances is to live migrate, but you can change the behavior to stop your instance during maintenance events instead.
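The interaction between the event categories in the table above and the maintenance behavior can be summarized in a few lines of Python. This is only an illustrative sketch: the EventType enum, the maintenance_action function, and the returned labels are hypothetical names, while MIGRATE and TERMINATE are the values described under Setting availability policies.

```python
from enum import Enum


class EventType(Enum):
    """The two maintenance event categories from the table above."""
    HOST_MAINTENANCE = "host maintenance"
    LIGHTWEIGHT = "lightweight"


def maintenance_action(event: EventType, on_host_maintenance: str = "MIGRATE") -> str:
    """Return a label for what happens to a VM during a maintenance event.

    The labels are illustrative; they are not values used by Compute Engine.
    """
    if on_host_maintenance not in ("MIGRATE", "TERMINATE"):
        raise ValueError(f"unknown maintenance behavior: {on_host_maintenance!r}")
    # Lightweight events do not require moving the VM to a new host.
    if event is EventType.LIGHTWEIGHT:
        return "stays on host"
    # Host maintenance requires the VM to leave the host, so the
    # configured maintenance behavior decides how that happens.
    return "live migrate" if on_host_maintenance == "MIGRATE" else "stop"


print(maintenance_action(EventType.HOST_MAINTENANCE))
print(maintenance_action(EventType.HOST_MAINTENANCE, "TERMINATE"))
```

The default argument mirrors the default maintenance behavior, so calling the function without a policy models an instance that was never reconfigured.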
Live migrate
By default, standard instances are set to live migrate, where Compute Engine automatically migrates your instance away from an infrastructure maintenance event, and your instance remains running during the migration. Your instance might experience a short period of decreased performance, although generally most instances should not notice any difference. This is ideal for instances that require constant uptime, and can tolerate a short period of decreased performance.
When Compute Engine migrates your instance, it reports a system event that is published to the list of zone operations. You can review this event by viewing the Compute Engine operations for a specific zone. Live migration events have the following operation type:
compute.instances.migrateOnHostMaintenance
Stop and (optionally) restart
If you do not want your instance to live migrate, you can choose to stop and optionally restart your instance. With this option, Compute Engine signals your instance to shut down, waits a short period of time for your instance to shut down cleanly, stops the instance, and restarts it away from the maintenance event. This option is ideal for instances that demand constant, maximum performance, provided that your overall application is built to handle instance failures or reboots.
When Compute Engine stops and reboots your instances, it reports a system event that is published to the list of zone operations. You can review this event by viewing the Compute Engine operations for a specific zone. Stopped events have the following operation type:
compute.instances.terminateOnHostMaintenance
When your instance reboots, it uses the same persistent boot disk and reattaches any secondary persistent disks that you configured. The data on those disks persists through instance migration and restart.
Local SSD data does not persist through instance stopping. When the instance restarts, it creates a new Local SSD that you must format and mount.
Automatic restart
If your instance is set to stop when there is a maintenance event, or if your instance crashes because of an underlying hardware issue, you can set up Compute Engine to automatically restart the instance by setting the automaticRestart field to true. This setting does not apply if the instance is taken offline through a user action, such as calling sudo shutdown, or during a zone outage.
When Compute Engine automatically restarts your instance, it reports a system event that is published to the list of zone operations. You can review this event by viewing the Compute Engine operations for a specific zone. Automatic restart events have the following operation type:
compute.instances.automaticRestart
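The conditions under which automatic restart applies can be written as a small predicate. The following Python sketch is an illustration only: the function name and the cause labels (maintenance_stop, hardware_crash, user_action, zone_outage) are hypothetical and are not identifiers used by Compute Engine.

```python
# Causes for which the automaticRestart setting applies, per the text above.
SYSTEM_CAUSES = {"maintenance_stop", "hardware_crash"}
# Causes for which the setting does not apply.
EXCLUDED_CAUSES = {"user_action", "zone_outage"}


def restarts_automatically(cause: str, automatic_restart: bool = True) -> bool:
    """Return True if the instance would be restarted automatically."""
    if cause not in SYSTEM_CAUSES | EXCLUDED_CAUSES:
        raise ValueError(f"unknown cause: {cause!r}")
    # The setting only matters for system-initiated stops and crashes.
    return automatic_restart and cause in SYSTEM_CAUSES


print(restarts_automatically("hardware_crash"))  # setting applies
print(restarts_automatically("user_action"))     # setting does not apply
```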
Viewing Compute Engine operations
You can view a list of completed operations through the Google Cloud Console, the gcloud command-line tool, or the Compute Engine API.
Console
To view a list of operations for your project, go to the Operations page.
- For more details on an operation, click on the operation summary. For example, to view the migration details for the my-instance instance, click on the Automatically migrate an instance operation.
gcloud
To view a list of operations for your project using gcloud compute, use the operations list sub-command. To view the list of operations in a specified zone, add the --filter flag.
gcloud compute operations list --filter="zone:(ZONE)"
Replace ZONE with the zone where you want to view a list of operations. For example, to view the list of operations in us-central1-c, run the following command:
gcloud compute operations list --filter="zone:(us-central1-c)"
NAME TYPE TARGET HTTP_STATUS STATUS TIMESTAMP
systemevent-1543845145000... compute.instances.migrateOnHostMaintenance us-central1-c/instances/my-instance 200 DONE 2018-12-03T05:52:25.000-08:00
API
API requests for operations must be specified at the global, region, or zone level. Live migration, instance stopping, and automatic restarts are all zone-level operations.
For zone operations, make a GET request to the zoneOperations.list method.
GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/operations
Replace the following:
- PROJECT_ID: the project ID for this request
- ZONE: the zone for this request
Leave the request body empty.
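As a rough illustration, the request URL can be assembled from the two placeholders in Python. The function name zone_operations_url is hypothetical, and the sketch only builds the URL: it does not send the request or handle authentication, which you set up separately as part of API access.

```python
from urllib.parse import quote

BASE_URL = "https://compute.googleapis.com/compute/v1"


def zone_operations_url(project_id: str, zone: str) -> str:
    """Build the zoneOperations.list URL for a project and zone."""
    # quote(..., safe="") percent-encodes any character that is not
    # allowed inside a single URL path segment.
    project = quote(project_id, safe="")
    zone_name = quote(zone, safe="")
    return f"{BASE_URL}/projects/{project}/zones/{zone_name}/operations"


print(zone_operations_url("my-project", "us-central1-c"))
```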
The following is sample output for a zone operation request. This output shows the details of a host migration.
{
  "kind": "compute#operation",
  "id": "3216798767364213712",
  "name": "systemevent-1543845145000-57c1e7574b840-a195b637-5ff74d9b",
  "zone": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-c",
  "operationType": "compute.instances.migrateOnHostMaintenance",
  "targetLink": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-c/instances/my-instance",
  "targetId": "3070988523247098025",
  "status": "DONE",
  "statusMessage": "Instance migrated during Compute Engine maintenance.",
  "user": "system",
  "progress": 100,
  "insertTime": "2018-12-03T05:52:25.000-08:00",
  "startTime": "2018-12-03T05:52:25.000-08:00",
  "endTime": "2018-12-03T05:52:25.000-08:00",
  "selfLink": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-c/operations/systemevent-1543845145000-57c1e7574b840-a195b637-5ff74d9b"
}
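To tell the three system events apart in output like this, you can match on the operationType field. The following Python sketch assumes an operation shaped like the sample above, trimmed to the three fields it reads; the describe_operation helper and the human-readable labels are hypothetical.

```python
import json

# Operation types named in this guide, mapped to illustrative labels.
OPERATION_MEANINGS = {
    "compute.instances.migrateOnHostMaintenance": "live migration",
    "compute.instances.terminateOnHostMaintenance": "stop for maintenance",
    "compute.instances.automaticRestart": "automatic restart",
}


def describe_operation(operation: dict) -> str:
    """Summarize a zone operation as 'instance: meaning (status)'."""
    op_type = operation.get("operationType", "")
    meaning = OPERATION_MEANINGS.get(op_type, "other operation")
    # The instance name is the last path segment of targetLink.
    instance = operation.get("targetLink", "").rsplit("/", 1)[-1]
    status = operation.get("status", "UNKNOWN")
    return f"{instance}: {meaning} ({status})"


sample = json.loads("""
{
  "operationType": "compute.instances.migrateOnHostMaintenance",
  "targetLink": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-c/instances/my-instance",
  "status": "DONE"
}
""")
print(describe_operation(sample))  # my-instance: live migration (DONE)
```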
Setting availability policies
Configure an instance's maintenance behavior and automatic restart setting using the onHostMaintenance and automaticRestart properties.
All instances are configured with default values unless you explicitly
specify otherwise.
- onHostMaintenance: Determines the behavior when a maintenance event occurs that might cause your instance to reboot.
  - [Default] MIGRATE, which causes Compute Engine to live migrate an instance when there is a maintenance event.
  - TERMINATE, which stops an instance instead of migrating it.
- automaticRestart: Determines the behavior when an instance crashes or is stopped by the system.
  - [Default] true, so Compute Engine restarts an instance if the instance crashes or is stopped.
  - false, so Compute Engine does not restart an instance if the instance crashes or is stopped.
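One way to keep these two properties and their defaults together in client code is a small validated value type. The following Python sketch is an illustration, not part of any Google client library: the AvailabilityPolicy class and its to_scheduling method are hypothetical names, while the property names and allowed values come from the list above.

```python
from dataclasses import dataclass

VALID_MAINTENANCE_BEHAVIORS = ("MIGRATE", "TERMINATE")


@dataclass(frozen=True)
class AvailabilityPolicy:
    """Availability settings with the documented defaults."""
    on_host_maintenance: str = "MIGRATE"
    automatic_restart: bool = True

    def __post_init__(self) -> None:
        # Reject values other than the two documented behaviors.
        if self.on_host_maintenance not in VALID_MAINTENANCE_BEHAVIORS:
            raise ValueError(
                f"onHostMaintenance must be one of {VALID_MAINTENANCE_BEHAVIORS}"
            )
        if not isinstance(self.automatic_restart, bool):
            raise TypeError("automaticRestart must be a bool")

    def to_scheduling(self) -> dict:
        """Return the settings keyed by their API property names."""
        return {
            "onHostMaintenance": self.on_host_maintenance,
            "automaticRestart": self.automatic_restart,
        }


print(AvailabilityPolicy().to_scheduling())
print(AvailabilityPolicy("TERMINATE", automatic_restart=False).to_scheduling())
```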
You can change the availability policies of an instance when you first create an instance or after the instance is created, using the setScheduling method.
Setting options during instance creation
Console
- In the Cloud Console, go to the VM instances page.
- Click Create instance.
- On the Create a new instance page, fill in the properties for your instance.
- Expand the Management, security, disks, networking, sole tenancy option.
- Under Availability policy, set the Automatic restart and On host maintenance options.
- Click Create to create the instance.
gcloud
To specify the availability policies of a new instance in gcloud compute, use the --maintenance-policy flag to specify whether the instance is migrated or stopped. By default, instances are automatically set to restart unless you provide the --no-restart-on-failure flag.
gcloud compute instances create INSTANCE_NAME \
--maintenance-policy MAINTENANCE_POLICY \
[--no-restart-on-failure]
Replace the following:
- INSTANCE_NAME: the instance name
- MAINTENANCE_POLICY: the policy for this instance, either TERMINATE or MIGRATE
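If you generate this command from a script, the optional flag can be added conditionally. The following Python sketch only assembles the argument list shown above and prints it; the create_instance_command function is a hypothetical helper, and actually running the command would additionally require the gcloud tool to be installed and configured.

```python
import shlex


def create_instance_command(
    instance_name: str,
    maintenance_policy: str = "MIGRATE",
    restart_on_failure: bool = True,
) -> list[str]:
    """Assemble the gcloud argument list for creating an instance."""
    if maintenance_policy not in ("MIGRATE", "TERMINATE"):
        raise ValueError("maintenance_policy must be MIGRATE or TERMINATE")
    command = [
        "gcloud", "compute", "instances", "create", instance_name,
        "--maintenance-policy", maintenance_policy,
    ]
    # Instances restart by default, so the flag is only needed to opt out.
    if not restart_on_failure:
        command.append("--no-restart-on-failure")
    return command


# shlex.join produces a shell-safe string for display or logging.
print(shlex.join(create_instance_command("example-instance", "TERMINATE", False)))
```

Building a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later passed to a process launcher.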
API
In the API, make a POST request to the following URL, replacing the project and zone with your project ID and the zone of the instance:
https://compute.googleapis.com/compute/v1/projects/myproject/zones/us-central1-f/instances
Include the onHostMaintenance and automaticRestart parameters as part of the request body:
{
"name": "example-instance",
"description": "Front-end for real-time ingest; don't migrate.",
...
// User options for influencing this Instance’s life cycle.
"scheduling": {
"onHostMaintenance": "MIGRATE",
"automaticRestart": true
}
}
For more information, see the Instances reference documentation.
Updating options for an instance
Console
- Go to the VM Instances page in the Google Cloud Console.
- Click the instance for which you want to change settings. The instance details page opens.
- From the instance details page, complete the following steps:
  - Click the Edit button at the top of the page.
  - Under Availability policies, set the On host maintenance and Automatic restart options as needed.
  - Click Save.
gcloud
To update the availability policies of an instance, use the instances set-scheduling command with the same parameters and flags used in the instance creation command above:
gcloud compute instances set-scheduling INSTANCE_NAME \
--maintenance-policy MAINTENANCE_POLICY \
[--no-restart-on-failure | --restart-on-failure]
Replace the following:
- INSTANCE_NAME: the instance name
- MAINTENANCE_POLICY: the policy for this instance, either TERMINATE or MIGRATE
API
In the API, make a POST request to the following URL, replacing the project ID, zone, and instance name with your own project ID, zone, and instance name:
https://compute.googleapis.com/compute/v1/projects/example-project/zones/us-central1-f/instances/example-instance/setScheduling
The body of your request must contain the new value for the availability policies:
{
"onHostMaintenance": "MIGRATE",
"automaticRestart": true
}
For more information, see the instances().setScheduling reference documentation.
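The URL and body above can be combined into a request object with the Python standard library. This is a minimal sketch under a few assumptions: set_scheduling_request is a hypothetical helper, the request uses POST, and the sketch only constructs the request. It does not send it and does not add the authorization header that a real call needs. Note that, unlike the instance creation body, the two values sit at the top level here rather than under a scheduling key.

```python
import json
from urllib.request import Request


def set_scheduling_request(
    project_id: str,
    zone: str,
    instance_name: str,
    on_host_maintenance: str,
    automatic_restart: bool,
) -> Request:
    """Construct (but do not send) a setScheduling request."""
    url = (
        "https://compute.googleapis.com/compute/v1"
        f"/projects/{project_id}/zones/{zone}"
        f"/instances/{instance_name}/setScheduling"
    )
    body = {
        "onHostMaintenance": on_host_maintenance,
        "automaticRestart": automatic_restart,
    }
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = set_scheduling_request(
    "example-project", "us-central1-f", "example-instance", "TERMINATE", True
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```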
Testing your availability policies
After you set your availability policies, you can simulate maintenance events to test the effects of these availability policies on your applications. For example, you might simulate a maintenance event on your instances in one of the following situations:
- You have instances that are configured to live migrate during maintenance events and you need to test the effects of live migration on your applications.
- You have batch jobs running on preemptible VM instances and you need to test how your applications handle preemption and shutdown of one or more instances.
- Your instances are configured to stop and restart during maintenance events rather than live migrate, and you need to test how your applications handle this shutdown and restart process.
Simulated maintenance events are subject to specific API rate limits.
You can simulate a maintenance event on an instance using either the gcloud command-line tool or an API request.
gcloud
Run the instances simulate-maintenance-event command to force an instance to activate its configured maintenance policy action:
gcloud compute instances simulate-maintenance-event INSTANCE_NAME \
--zone ZONE
Replace the following:
- INSTANCE_NAME: the name of the instance where you want to simulate the maintenance event. You can specify multiple instance names separated by single spaces to simulate maintenance events on more than one instance in the same zone. For example, instance-1 instance-2 instance-3.
- ZONE: the zone where the instance is located.
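When the instance names come from a script, each name becomes its own argument. The following Python sketch assembles the command for one or more instances in the same zone; simulate_maintenance_command is a hypothetical helper, and the sketch only builds the argument list without running gcloud.

```python
def simulate_maintenance_command(instance_names: list[str], zone: str) -> list[str]:
    """Assemble the gcloud argument list for simulating maintenance."""
    if not instance_names:
        raise ValueError("at least one instance name is required")
    return [
        "gcloud", "compute", "instances", "simulate-maintenance-event",
        *instance_names,  # one argument per instance, all in the same zone
        "--zone", zone,
    ]


command = simulate_maintenance_command(
    ["instance-1", "instance-2", "instance-3"], "us-central1-c"
)
print(" ".join(command))
```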
API
In the API, make a request to the compute.instances.simulateMaintenanceEvent method in the Compute Engine API:
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/INSTANCE_NAME/simulateMaintenanceEvent
Replace the following:
- PROJECT_ID: the project ID for this request
- INSTANCE_NAME: the name of the instance where you want to simulate the maintenance event
- ZONE: the zone where the instance is located
For more information about this method, see the instances().simulateMaintenanceEvent reference documentation.
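Because the URL above names a single instance, simulating maintenance on several instances through the API means one POST request per instance. The following Python sketch, which assumes that reading of the URL pattern, generates the request URLs; simulate_maintenance_urls is a hypothetical helper, and sending the requests and authenticating them are left out.

```python
def simulate_maintenance_urls(
    project_id: str, zone: str, instance_names: list[str]
) -> list[str]:
    """Return one simulateMaintenanceEvent URL per instance."""
    base = (
        "https://compute.googleapis.com/compute/v1"
        f"/projects/{project_id}/zones/{zone}/instances"
    )
    return [f"{base}/{name}/simulateMaintenanceEvent" for name in instance_names]


for url in simulate_maintenance_urls(
    "my-project", "us-central1-c", ["instance-1", "instance-2"]
):
    print("POST", url)
```

Because simulated maintenance events are subject to API rate limits, a real script would likely need to space out these requests.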
What's next
- Learn more about live migration.
- Learn how to detect a live migration event.