This document is for platform administrators who run virtual machines (VMs) in Google Distributed Cloud. It shows you how to configure the eviction policy that controls how VMs that use VM Runtime on Google Distributed Cloud can automatically migrate to other hosts during maintenance events. These configuration settings let your VMs continue to run on other hosts when you upgrade a cluster or perform maintenance on a host.
Before you begin
To complete this document, you need access to a cluster that runs Google Distributed Cloud version 1.12.0 (anthosBareMetalVersion: 1.12.0) or higher. You can use any cluster type capable of running workloads. If needed, try Google Distributed Cloud on Compute Engine or see the cluster creation overview.
Eviction policy settings
Google Distributed Cloud lets you configure node behavior when you perform cluster upgrades or place nodes into maintenance mode. VM Runtime on Google Distributed Cloud uses an eviction policy that controls how the cluster handles VMs that run during these events.
When enabled, VM Runtime on Google Distributed Cloud uses live migration to move VM workloads to another host by default. This behavior minimizes potential disruption to the VM workloads. With live migration, migratable VMs move from one host to another without disruption to the guest OS.
The following eviction policy settings can be configured:
Property | Default | Valid values |
---|---|---|
evictionStrategy | LiveMigrate | LiveMigrate, Restart |
maxMigrationAttemptsPerVM | 3 | Integers in the range [1, 5] |
migrationTargetInitializationTimeout | 30s | Any valid duration string, such as 1m, 2h, or 1h30m |
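As a rough illustration of the table above, the following Python sketch applies the documented defaults to a partial policy and rejects values outside the documented ranges. It is not part of VM Runtime on Google Distributed Cloud: the function names `parse_duration_seconds` and `resolve_eviction_policy` are hypothetical, and the duration grammar is a simplifying assumption that accepts only whole hours, minutes, and seconds in that order.

```python
import re

# Defaults and valid values copied from the settings table.
DEFAULTS = {
    "evictionStrategy": "LiveMigrate",
    "maxMigrationAttemptsPerVM": 3,
    "migrationTargetInitializationTimeout": "30s",
}
VALID_STRATEGIES = {"LiveMigrate", "Restart"}

# Assumption: simplified duration grammar (whole h, m, s units, in that order).
# The product may accept other forms that this pattern rejects.
_DURATION = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def parse_duration_seconds(text):
    """Convert a string such as '1h30m' to a number of seconds."""
    match = _DURATION.fullmatch(text)
    if not text or match is None:
        raise ValueError(f"invalid duration string: {text!r}")
    hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def resolve_eviction_policy(policy):
    """Return the policy with defaults applied, or raise ValueError."""
    unknown = set(policy) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")
    resolved = {**DEFAULTS, **policy}
    if resolved["evictionStrategy"] not in VALID_STRATEGIES:
        raise ValueError("evictionStrategy must be LiveMigrate or Restart")
    attempts = resolved["maxMigrationAttemptsPerVM"]
    if isinstance(attempts, bool) or not isinstance(attempts, int):
        raise ValueError("maxMigrationAttemptsPerVM must be an integer")
    if not 1 <= attempts <= 5:
        raise ValueError("maxMigrationAttemptsPerVM must be in the range [1, 5]")
    # Raises ValueError if the timeout is not a valid duration string.
    parse_duration_seconds(resolved["migrationTargetInitializationTimeout"])
    return resolved


if __name__ == "__main__":
    # An empty policy resolves to the documented defaults.
    print(resolve_eviction_policy({}))
    print(parse_duration_seconds("1h30m"))
```

A check like this can be useful before you edit the VMRuntime resource, because it catches an out-of-range attempt count or a malformed timeout early.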
evictionStrategy
If this value is set to LiveMigrate, VM Runtime on Google Distributed Cloud attempts to migrate any migratable VM away from a node to evict it. The VM is placed on another node in the cluster and continues to run.
If this value is set to Restart, VM Runtime on Google Distributed Cloud makes no attempt to migrate VMs. Instead, VMs are stopped when the node is put into maintenance mode, and the VMs start again when the node is brought out of maintenance mode. The VM is unavailable during the host maintenance event.
When you choose the LiveMigrate strategy, VM Runtime on Google Distributed Cloud falls back to the Restart strategy for non-migratable VMs.
maxMigrationAttemptsPerVM
This setting controls the maximum number of migration attempts that can occur for a particular VM before the Restart strategy applies.
This setting is ignored if evictionStrategy is set to Restart.
migrationTargetInitializationTimeout
This setting controls the maximum amount of time a migration can take to start before it's considered to have failed. A migration that takes too long to start counts toward the failed migration budget for a VM workload.
This setting is ignored if evictionStrategy is set to Restart.
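The way the three settings interact can be sketched as a small decision model. The following Python is only an illustration of the behavior described above, not the actual VM Runtime implementation. The function name `plan_eviction` and its parameters are hypothetical, the only failure mode modeled is a migration that is too slow to start, and treating a start time exactly equal to the timeout as a success is an assumption.

```python
def plan_eviction(vm_migratable, attempt_start_seconds,
                  strategy="LiveMigrate", max_attempts=3, timeout_seconds=30):
    """Return (outcome, failed_attempts) for one VM on a node being drained.

    attempt_start_seconds is a list of simulated times, one per migration
    attempt, that each attempt would take to start. Only the first
    max_attempts entries are considered.
    """
    # Restart strategy, or a non-migratable VM: no migration is attempted.
    if strategy == "Restart" or not vm_migratable:
        return ("Restart", 0)

    failed_attempts = 0
    for start_seconds in attempt_start_seconds[:max_attempts]:
        # Assumption: an attempt that starts within the timeout succeeds.
        if start_seconds <= timeout_seconds:
            return ("LiveMigrate", failed_attempts)
        # A migration that is too slow to start counts against the budget.
        failed_attempts += 1

    # Migration budget exhausted: fall back to the Restart strategy.
    return ("Restart", failed_attempts)


if __name__ == "__main__":
    # First attempt is too slow to start, second attempt succeeds.
    print(plan_eviction(True, [45, 10]))
    # All three attempts exceed the 30-second timeout.
    print(plan_eviction(True, [45, 60, 31]))
    # Non-migratable VMs always fall back to Restart.
    print(plan_eviction(False, [1]))
```

In this model, raising `max_attempts` or `timeout_seconds` gives a slow migration more chances to succeed before the VM is stopped and restarted.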
Enable and configure the eviction policy
To use the eviction policy in VM Runtime on Google Distributed Cloud, the evictionPolicy setting must be defined in your cluster's VMRuntime object. This setting is enabled by default in Anthos clusters on bare metal version 1.12 and higher. If you upgrade from a previous version, you must enable this setting manually.
An empty object, such as evictionPolicy: {}, enables the eviction policy with default settings. To change the default behavior, add additional configuration options to the evictionPolicy object.
To enable and configure the eviction policy, complete the following steps:
1. Edit the VMRuntime custom resource:

       kubectl edit vmruntime vmruntime

2. Add or update the evictionPolicy section with an empty object to use the default eviction policy settings, or add your desired settings:

       apiVersion: vm.cluster.gke.io/v1
       kind: VMRuntime
       metadata:
         name: vmruntime
       spec:
         disableCDIUploadProxyVIP: false
         enabled: true
         evictionPolicy: {}
         useEmulation: false

   If the evictionPolicy isn't populated, such as evictionPolicy: null, or isn't included in the manifest, no special action is taken. VMs are forcefully evicted.

   The following example manifests show how to enable live migration of VMs, or restart the VMs on the same host after maintenance is complete:

   Live migrate

   The following example eviction policy enables live migration of VMs. The cluster can retry the live migration up to 3 times, and each attempt has 30 seconds to start before it counts as failed:

       apiVersion: vm.cluster.gke.io/v1
       kind: VMRuntime
       metadata:
         name: vmruntime
       spec:
         disableCDIUploadProxyVIP: false
         enabled: true
         evictionPolicy:
           evictionStrategy: LiveMigrate
           maxMigrationAttemptsPerVM: 3
           migrationTargetInitializationTimeout: 30s
         useEmulation: false

   Restart

   The following example eviction policy makes no attempt at migration for any VMs during eviction. VMs are stopped on their current host, and start again when the host finishes its maintenance event:

       apiVersion: vm.cluster.gke.io/v1
       kind: VMRuntime
       metadata:
         name: vmruntime
       spec:
         disableCDIUploadProxyVIP: false
         enabled: true
         evictionPolicy:
           evictionStrategy: Restart
         useEmulation: false

3. Save and close the VMRuntime custom resource in your editor.

VM Runtime on Google Distributed Cloud now applies the eviction policy to VMs in your cluster.
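If you prefer a scripted change over an interactive edit, one possible approach is a JSON merge patch. This document doesn't describe that approach, so treat the following Python sketch as an assumption: it presumes that the VMRuntime resource accepts `kubectl patch --type merge` in the same way as other Kubernetes custom resources. The function name `build_patch_command` is hypothetical, and the script only prints the command rather than running it.

```python
import json
import shlex


def build_patch_command(eviction_policy, resource_name="vmruntime"):
    """Build a kubectl merge-patch command that sets spec.evictionPolicy.

    Returns (patch_json, command). Nothing is executed.
    """
    # The patch touches only spec.evictionPolicy; other spec fields
    # are left unchanged by a merge patch.
    patch_json = json.dumps({"spec": {"evictionPolicy": eviction_policy}})
    command = (
        f"kubectl patch vmruntime {shlex.quote(resource_name)} "
        f"--type merge --patch {shlex.quote(patch_json)}"
    )
    return patch_json, command


if __name__ == "__main__":
    policy = {
        "evictionStrategy": "LiveMigrate",
        "maxMigrationAttemptsPerVM": 3,
        "migrationTargetInitializationTimeout": "30s",
    }
    _, command = build_patch_command(policy)
    # Review the printed command before you run it against a cluster.
    print(command)
```

Passing an empty dictionary produces a patch that sets evictionPolicy to an empty object, which corresponds to the default settings described earlier.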