This document describes how to set a virtual machine (VM) instance's host maintenance policy to control how the VM behaves when a host event occurs.
Before you begin
- If you want to use the command-line examples in this guide, do the following:
  - Install or update to the latest version of the Google Cloud CLI.
  - Set a default region and zone.
- If you want to use the API examples in this guide, set up API access.
Limitations
- You can't change the host maintenance policy of a preemptible VM. When there is a maintenance event, the preemptible VM stops and it does not migrate. You must manually restart the preempted VM.
- After you create a VM using an E2 machine type, you can't change the VM's host maintenance settings from MIGRATE to TERMINATE or the other way around.
Available host maintenance properties
You can configure a VM's maintenance behavior, restart behavior, and behavior after a host error occurs with the following properties.
Compute Engine configures each VM with the default values unless you specify otherwise.
During host events, depending on the configured host maintenance policy, VMs that do not support live migration are terminated or automatically restarted.
onHostMaintenance: determines the behavior when a maintenance event occurs that might cause your VM to reboot.
- [Default] MIGRATE: causes Compute Engine to live migrate an instance when there is a maintenance event.
- TERMINATE: stops a VM instead of migrating it.
automaticRestart: determines the behavior when a VM crashes or is stopped by the system.
- [Default] true: Compute Engine restarts an instance if the instance crashes or is stopped.
- false: Compute Engine does not restart a VM if the VM crashes or is stopped.
localSsdRecoveryTimeout: sets the Local SSD recovery timeout. This is the maximum amount of time, in hours, that Compute Engine waits to recover Local SSD data after a host error. This setting only applies to VMs with attached Local SSD disks.
- [Default] unset: Compute Engine waits up to 1 hour to recover the disk.
- Number of hours from 0 to 168 (7 days), in increments of 1 hour. A value of 0 means that Compute Engine will not wait to recover the data.
hostErrorTimeoutSeconds (Preview): sets the maximum amount of time, in seconds, that Compute Engine waits to restart or terminate a VM after detecting that the VM is unresponsive.
- [Default] unset: Compute Engine waits up to 5.5 minutes (330 seconds) before restarting an unresponsive VM.
- Number of seconds from 90 to 330, in increments of 30, which sets how long Compute Engine waits before restarting an unresponsive VM.
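The value ranges above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of any Google Cloud library: the function names are hypothetical, and the only rules it encodes are the ranges and defaults stated in this section.

```python
from typing import Optional

# Ranges and defaults taken from the property descriptions in this guide.
DEFAULT_SSD_RECOVERY_HOURS = 1
DEFAULT_HOST_ERROR_TIMEOUT_SECONDS = 330


def check_local_ssd_recovery_timeout(hours: Optional[int]) -> int:
    """Return the effective Local SSD recovery timeout in hours.

    None means the property is unset, so the documented default applies.
    """
    if hours is None:
        return DEFAULT_SSD_RECOVERY_HOURS
    if isinstance(hours, bool) or not isinstance(hours, int):
        raise ValueError("localSsdRecoveryTimeout must be a whole number of hours")
    if not 0 <= hours <= 168:
        raise ValueError("localSsdRecoveryTimeout must be from 0 to 168 hours")
    return hours


def check_host_error_timeout(seconds: Optional[int]) -> int:
    """Return the effective host error timeout in seconds.

    None means the property is unset, so the documented default applies.
    """
    if seconds is None:
        return DEFAULT_HOST_ERROR_TIMEOUT_SECONDS
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise ValueError("hostErrorTimeoutSeconds must be a whole number of seconds")
    if not 90 <= seconds <= 330 or seconds % 30 != 0:
        raise ValueError("hostErrorTimeoutSeconds must be 90 to 330 in steps of 30")
    return seconds
```

A check like this only catches values outside the documented ranges; Compute Engine remains the authority on what it accepts.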
Set host maintenance policy of a VM
You can change the host maintenance policy of a VM when you first create the VM or after the VM is created.
Set host maintenance policy during VM creation
The information in this section focuses on how to set the host maintenance policy when you create a VM. For more VM creation examples, see Create and start a VM instance.
You can set the host maintenance policy of a VM at creation by using the Google Cloud console, the gcloud CLI, or the Compute Engine API.
Console
In the Google Cloud console, go to the Create an instance page.
Expand the Advanced options section, and do the following:
- Expand the Management section.
- In the Automatic restart list, select the option that you want.
- In the On host maintenance list, select the option that you want.
To create the VM, click Create.
gcloud
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
To set the host maintenance policy of a new VM, use the gcloud compute instances create command. Include one or more of the following parameters:
- --maintenance-policy: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default if you omit this property.
- --no-restart-on-failure or --restart-on-failure: whether the VM restarts automatically after a host error. By default, the VM will always restart when a failure is detected.
- --local-ssd-recovery-timeout: how much time Compute Engine spends recovering any attached Local SSD disks after a host error. The default is 1 hour.
Set the host maintenance policy of a new VM with the following command. If you omit any of the flags, the flag's default is used.
gcloud compute instances create VM_NAME \
--maintenance-policy=MAINTENANCE_POLICY \
--RESTART_ON_FAILURE_BEHAVIOR \
--local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT
Replace the following:
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
- SSD_RECOVERY_TIMEOUT: the number of hours to spend recovering a Local SSD attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
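If you generate this command from a script, it can help to assemble the argument list in one place so that omitted flags fall back to their defaults. The sketch below is illustrative only: `build_create_command` and the VM name `example-vm` are hypothetical, and it uses only the flags listed in this section, passing the recovery timeout as the plain number of hours described above.

```python
import shlex
from typing import List, Optional


def build_create_command(
    vm_name: str,
    maintenance_policy: Optional[str] = None,
    restart_on_failure: Optional[bool] = None,
    ssd_recovery_timeout: Optional[int] = None,
) -> List[str]:
    """Build the argument list for `gcloud compute instances create`.

    Any argument left as None is omitted, so the flag's default is used.
    """
    cmd = ["gcloud", "compute", "instances", "create", vm_name]
    if maintenance_policy is not None:
        if maintenance_policy not in ("MIGRATE", "TERMINATE"):
            raise ValueError("maintenance_policy must be MIGRATE or TERMINATE")
        cmd.append(f"--maintenance-policy={maintenance_policy}")
    if restart_on_failure is not None:
        # The restart behavior is a pair of on/off flags, not a key=value flag.
        cmd.append(
            "--restart-on-failure" if restart_on_failure else "--no-restart-on-failure"
        )
    if ssd_recovery_timeout is not None:
        cmd.append(f"--local-ssd-recovery-timeout={ssd_recovery_timeout}")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_create_command("example-vm", "TERMINATE", False, 2)))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting concerns.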
Set the host error detection timeout
To set the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, you must use the gcloud beta compute instances create command because this feature is available in Preview. Specify the timeout with the --host-error-timeout-seconds flag.
gcloud beta compute instances create VM_NAME \
--maintenance-policy=MAINTENANCE_POLICY \
--RESTART_ON_FAILURE_BEHAVIOR \
--local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT \
--host-error-timeout-seconds=ERROR_DETECTION_TIMEOUT
Replace the following:
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
- SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
- ERROR_DETECTION_TIMEOUT: the number of seconds Compute Engine waits before restarting an unresponsive VM, from 90 to 330, in increments of 30.
API
To set the host maintenance policy of a new VM using the Compute Engine API, use the instances.insert method.
Include one or more of the following properties in the scheduling object of the request body:
- onHostMaintenance: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default.
- automaticRestart: whether the VM restarts automatically after a host error. VMs are restarted automatically by default.
- localSsdRecoveryTimeout: how much time Compute Engine spends recovering any attached Local SSD disks after detecting a host error. The default is 1 hour.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "name": "VM_NAME",
  "scheduling": {
    "onHostMaintenance": "MAINTENANCE_POLICY",
    "automaticRestart": RESTART_POLICY,
    "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT
  }
}
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where you want to create the VM.
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_POLICY: the restart policy for this VM, either true or false.
- SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
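Building the request body from native types and serializing it with a JSON library avoids hand-editing mistakes, such as sending the restart policy as a string instead of a boolean. The following is a rough sketch under stated assumptions: the helper names and `example-vm` are hypothetical, and it assumes the recovery timeout is sent in the same seconds/nanos shape that the scheduling output later in this guide displays. It produces only the fragment shown above, not the other fields a complete instances.insert request needs.

```python
import json
from typing import Any, Dict, Optional


def build_scheduling(
    on_host_maintenance: str = "MIGRATE",
    automatic_restart: bool = True,
    ssd_recovery_hours: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the `scheduling` object for a request body."""
    if on_host_maintenance not in ("MIGRATE", "TERMINATE"):
        raise ValueError("on_host_maintenance must be MIGRATE or TERMINATE")
    scheduling: Dict[str, Any] = {
        "onHostMaintenance": on_host_maintenance,
        # A Python bool serializes to a JSON boolean (true/false), not a string.
        "automaticRestart": bool(automatic_restart),
    }
    if ssd_recovery_hours is not None:
        # ASSUMPTION: the request accepts the seconds/nanos form that the
        # API returns when you view a VM's scheduling settings.
        scheduling["localSsdRecoveryTimeout"] = {
            "seconds": str(ssd_recovery_hours * 3600),
            "nanos": 0,
        }
    return scheduling


def build_insert_body(vm_name: str, **scheduling_options: Any) -> Dict[str, Any]:
    """Build the name and scheduling fragment of an instances.insert body."""
    return {"name": vm_name, "scheduling": build_scheduling(**scheduling_options)}


if __name__ == "__main__":
    body = build_insert_body(
        "example-vm", on_host_maintenance="TERMINATE", automatic_restart=False
    )
    print(json.dumps(body, indent=2))
```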
Set the host error detection timeout
To set the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the beta instances.insert method because this option is available in Preview.
Add the hostErrorTimeoutSeconds property to the scheduling object of the request body.
POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances
{
  "name": "VM_NAME",
  "scheduling": {
    "onHostMaintenance": "MAINTENANCE_POLICY",
    "automaticRestart": RESTART_POLICY,
    "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT,
    "hostErrorTimeoutSeconds": HOST_ERROR_TIMEOUT
  }
}
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where you want to create the VM.
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_POLICY: the restart policy for this VM, either true or false.
- SSD_RECOVERY_TIMEOUT: the number of hours Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168, in increments of 1 hour.
- HOST_ERROR_TIMEOUT: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM. Valid values are from 90 to 330, in increments of 30.
Update the host maintenance policy of an existing VM
Console
In the Google Cloud console, go to the VM instances page.
Click the VM for which you want to change settings. The VM details page displays.
On the VM details page, complete the following steps:
- Click the Edit button at the top of the page.
- Under Availability policies, update the policy as needed. From the Availability policies section, you can set the On host maintenance and Automatic restart options.
- Click Save.
gcloud
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Update the host maintenance policy of an existing VM with the gcloud compute instances set-scheduling command. Use the same parameters described in the VM creation command in the preceding section.
gcloud compute instances set-scheduling VM_NAME \
--maintenance-policy=MAINTENANCE_POLICY \
--RESTART_ON_FAILURE_BEHAVIOR \
--local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT
Replace the following:
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
- SSD_RECOVERY_TIMEOUT: the time, in hours, Compute Engine spends recovering a Local SSD disk attached to an unresponsive VM. Valid values are from 0 to 168.
Update the host error detection timeout
To update the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, use the gcloud beta compute instances set-scheduling command, because this feature is only available in Preview.
Update the timeout with the --host-error-timeout-seconds parameter.
For example:
gcloud beta compute instances set-scheduling VM_NAME \
--maintenance-policy=MAINTENANCE_POLICY \
--RESTART_ON_FAILURE_BEHAVIOR \
--local-ssd-recovery-timeout=SSD_RECOVERY_TIMEOUT \
--host-error-timeout-seconds=NUMBER_OF_SECONDS
Replace the following:
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_ON_FAILURE_BEHAVIOR: the restart behavior for the VM, either --no-restart-on-failure or --restart-on-failure.
- SSD_RECOVERY_TIMEOUT: the time, in hours, Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168.
- NUMBER_OF_SECONDS: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM, from 90 to 330, in increments of 30.
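Because the host error timeout flag is only available through the beta command, a script that updates scheduling needs to pick the command track based on which settings it changes. The sketch below shows one way to do that. The function name and `example-vm` are hypothetical, and only flags named in this guide are used.

```python
from typing import List, Optional

# Allowed values for --host-error-timeout-seconds: 90 to 330 in steps of 30.
ALLOWED_HOST_ERROR_TIMEOUTS = range(90, 331, 30)


def build_set_scheduling_command(
    vm_name: str,
    maintenance_policy: Optional[str] = None,
    host_error_timeout_seconds: Optional[int] = None,
) -> List[str]:
    """Build a `set-scheduling` argument list, using beta only when needed."""
    cmd = ["gcloud"]
    if host_error_timeout_seconds is not None:
        if host_error_timeout_seconds not in ALLOWED_HOST_ERROR_TIMEOUTS:
            raise ValueError("timeout must be 90 to 330 seconds, in steps of 30")
        # The timeout flag is a Preview feature, so switch to the beta track.
        cmd.append("beta")
    cmd += ["compute", "instances", "set-scheduling", vm_name]
    if maintenance_policy is not None:
        if maintenance_policy not in ("MIGRATE", "TERMINATE"):
            raise ValueError("maintenance_policy must be MIGRATE or TERMINATE")
        cmd.append(f"--maintenance-policy={maintenance_policy}")
    if host_error_timeout_seconds is not None:
        cmd.append(f"--host-error-timeout-seconds={host_error_timeout_seconds}")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_set_scheduling_command("example-vm", "MIGRATE", 120)))
```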
API
Update the host maintenance policy of an existing VM with a POST request to the instances.setScheduling method.
Include one or more of the following properties in the request body:
- onHostMaintenance: whether the VM is migrated or stopped during host maintenance. The VM is migrated by default.
- automaticRestart: whether the VM restarts automatically after a host error. VMs are restarted automatically by default.
- localSsdRecoveryTimeout: how much time Compute Engine spends recovering any attached Local SSD disks after detecting a host error. If omitted, the default is 1 hour.
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setScheduling
{
  "onHostMaintenance": "MAINTENANCE_POLICY",
  "automaticRestart": RESTART_POLICY,
  "localSsdRecoveryTimeout": SSD_RECOVERY_TIMEOUT
}
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where the VM is located.
- VM_NAME: the VM name.
- MAINTENANCE_POLICY: the maintenance policy for this VM, either TERMINATE or MIGRATE.
- RESTART_POLICY: the restart policy for this VM, either true or false.
- SSD_RECOVERY_TIMEOUT: the time, in hours, that Compute Engine spends recovering a Local SSD disk that was attached to an unresponsive VM. Valid values are from 0 to 168.
Update the host error detection timeout
To update the maximum amount of time Compute Engine waits to restart or terminate an unresponsive VM, you must use the beta instances.setScheduling method because this feature is available in Preview.
Add the hostErrorTimeoutSeconds parameter to the request body.
POST https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME/setScheduling
{
  "hostErrorTimeoutSeconds": NUMBER_OF_SECONDS
}
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where the VM is located.
- VM_NAME: the VM name.
- NUMBER_OF_SECONDS: the number of seconds Compute Engine waits before restarting or terminating an unresponsive VM, from 90 to 330, in increments of 30.
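The two setScheduling requests differ only in the API version segment of the URL and in the body. As an illustration, the sketch below picks the version from the body's contents and serializes the body with a JSON library. The function name and the project, zone, and VM values are hypothetical, and sending the request (including authentication) is left out.

```python
import json
from typing import Any, Dict, Tuple

BASE_URL = "https://compute.googleapis.com/compute"


def set_scheduling_request(
    project_id: str, zone: str, vm_name: str, scheduling: Dict[str, Any]
) -> Tuple[str, str]:
    """Return the URL and JSON body for a setScheduling POST request."""
    # hostErrorTimeoutSeconds is in Preview, so it needs the beta endpoint.
    version = "beta" if "hostErrorTimeoutSeconds" in scheduling else "v1"
    url = (
        f"{BASE_URL}/{version}/projects/{project_id}"
        f"/zones/{zone}/instances/{vm_name}/setScheduling"
    )
    # json.dumps never emits a trailing comma, which JSON does not allow.
    return url, json.dumps(scheduling)


if __name__ == "__main__":
    url, body = set_scheduling_request(
        "example-project", "us-central1-a", "example-vm",
        {"hostErrorTimeoutSeconds": 120},
    )
    print("POST", url)
    print(body)
```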
View host maintenance policy settings of a VM
Console
Go to the VM instances page.
Click the Name of the VM for which you want to view settings. The VM instance details page opens.
Go to the Management section. The Availability policies subsection shows your current settings for On host maintenance and Automatic restart.
gcloud
View the host maintenance option settings for a VM with the gcloud compute instances describe command:
gcloud compute instances describe VM_NAME --format="yaml(scheduling)"
Replace VM_NAME with the VM name.
The output shows the VM's host maintenance settings, for example:
scheduling:
  automaticRestart: true
  localSsdRecoveryTimeout:
    nanos: 0
    seconds: '10800'
  onHostMaintenance: MIGRATE
  preemptible: false
  provisioningModel: STANDARD
View the host error detection timeout setting
View the current value of the hostErrorTimeoutSeconds property with the gcloud beta compute instances describe command, because this option is only available in Preview.
gcloud beta compute instances describe VM_NAME --format="yaml(scheduling)"
Replace VM_NAME with the VM name.
The output includes the VM's host error detection timeout, for example:
scheduling:
  automaticRestart: true
  hostErrorTimeoutSeconds: 120
  localSsdRecoveryTimeout:
    nanos: 0
    seconds: '10800'
  onHostMaintenance: MIGRATE
  preemptible: false
  provisioningModel: STANDARD
API
To view the host maintenance settings for a VM, use the instances.get method:
GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME
Replace the following:
- PROJECT_ID: the project where the VM is located.
- ZONE: the zone where the VM is located.
- VM_NAME: the VM name.
In the output, the scheduling object contains the VM's host maintenance policy, for example:
"scheduling": {
  "onHostMaintenance": "MIGRATE",
  "automaticRestart": true,
  "preemptible": false,
  "provisioningModel": "STANDARD",
  "localSsdRecoveryTimeout": {
    "seconds": "10800",
    "nanos": 0
  }
}
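The response reports the Local SSD recovery timeout in seconds, as a string, while the setting is described in hours elsewhere in this guide. A short sketch of reading the scheduling object and converting the value follows. The function name is hypothetical, and the sample text is the example output above wrapped in braces to make it a complete JSON document.

```python
import json
from typing import Any, Dict

SAMPLE_RESPONSE = """
{
  "scheduling": {
    "onHostMaintenance": "MIGRATE",
    "automaticRestart": true,
    "preemptible": false,
    "provisioningModel": "STANDARD",
    "localSsdRecoveryTimeout": {
      "seconds": "10800",
      "nanos": 0
    }
  }
}
"""


def summarize_scheduling(instance_json: str) -> Dict[str, Any]:
    """Extract the host maintenance settings from an instances.get response."""
    scheduling = json.loads(instance_json).get("scheduling", {})
    timeout = scheduling.get("localSsdRecoveryTimeout")
    # "seconds" arrives as a string, so convert it before dividing.
    hours = int(timeout["seconds"]) / 3600 if timeout else None
    return {
        "on_host_maintenance": scheduling.get("onHostMaintenance"),
        "automatic_restart": scheduling.get("automaticRestart"),
        "local_ssd_recovery_hours": hours,
    }


if __name__ == "__main__":
    print(summarize_scheduling(SAMPLE_RESPONSE))
```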
View the host error timeout settings
View the current hostErrorTimeoutSeconds setting with a GET request to the beta instances.get method, because this option is only available in Preview.
GET https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME
Replace the following:
- PROJECT_ID: the project for the VM.
- ZONE: the zone where the VM is located.
- VM_NAME: the VM name.
In the output, the scheduling object includes the VM's host error detection timeout, for example:
"scheduling": {
  "hostErrorTimeoutSeconds": 120
}
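When the property is unset, Compute Engine waits up to 330 seconds, as described in the property list earlier in this guide, so a script reading this output might fall back to that value. The following is a small sketch of that fallback. The function name is hypothetical, and it assumes an unset property is simply absent from the scheduling object.

```python
from typing import Any, Dict

# Documented wait time when hostErrorTimeoutSeconds is unset.
DEFAULT_HOST_ERROR_TIMEOUT_SECONDS = 330


def effective_host_error_timeout(scheduling: Dict[str, Any]) -> int:
    """Return the timeout in seconds, using the default when it is unset."""
    # ASSUMPTION: an unset property does not appear in the response.
    value = scheduling.get("hostErrorTimeoutSeconds")
    if value is None:
        return DEFAULT_HOST_ERROR_TIMEOUT_SECONDS
    return int(value)


if __name__ == "__main__":
    print(effective_host_error_timeout({"hostErrorTimeoutSeconds": 120}))
    print(effective_host_error_timeout({}))
```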
What's next
- Learn more about host maintenance.
- Learn more about live migration.
- Learn how to detect a live migration event.