Manually live migrate sole-tenant VMs


For Google Cloud projects that require dedicated hardware or additional control of VMs, Compute Engine provides sole-tenancy. Sole-tenancy lets you reserve sole-tenant nodes, which are servers that run VMs only from your projects. To schedule a VM on a sole-tenant node, you use the VM's nodeAffinities property.

With sole-tenant VMs, because you have control over the entire host, you can specify the following:

  • The window of time when maintenance occurs. For more information, see Maintenance windows.

  • How Compute Engine manages underlying servers during maintenance events. For more information, see Maintenance policies.

  • When to manually live migrate VMs from one host to another. This document describes manual live migration and shows how to manually live migrate VMs.

Manual live migration of sole-tenant VMs

Compute Engine automatically enables live migration for sole-tenant and multi-tenant VMs so they can continue to run when the underlying host undergoes a maintenance event. With the additional control afforded by using sole-tenancy, you can manually live migrate sole-tenant VMs.

To manually live migrate a sole-tenant VM, update the value of the VM's nodeAffinities property by using the instances.update method. When you update this property, you specify the sole-tenant node group or sole-tenant node to manually live migrate the VM to.

During live migration of a VM within sole-tenancy, the VM consumes capacity from both the source sole-tenant node and the destination sole-tenant node until live migration completes. If there is not enough capacity on the destination host, Compute Engine does not move the VM.

The following table shows the supported VM sources and destinations for manual live migration.

VM source                                    VM destination
Sole-tenant node group                       Sole-tenant node group1 or sole-tenant node
Sole-tenant node                             Sole-tenant node group1 or sole-tenant node
Sole-tenant node group or sole-tenant node   Multi-tenant host2
1 If the destination is a node group, Compute Engine schedules the VM onto the node that has enough space for the VM and that has the least amount of spare capacity. If you want to instead schedule the VM onto a specific node, you can specify the name of the node. For more information, see Node affinity and anti-affinity.
2 You cannot live migrate a sole-tenant VM to a multi-tenant host. For information about how to move a VM from sole-tenancy to multi-tenancy, see Updating VM tenancy.
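The placement behavior described in footnote 1 resembles a best-fit strategy: among the nodes with enough free capacity, choose the one with the least spare capacity. The following Python sketch illustrates the idea; the node names and capacities are hypothetical, not data returned by any Compute Engine API:

```python
# Best-fit node selection sketch. Node names and free-capacity numbers
# are hypothetical, for illustration only.
def pick_node(nodes, vm_vcpus):
    """Return the node that fits the VM and has the least spare capacity."""
    candidates = [n for n in nodes if n["free_vcpus"] >= vm_vcpus]
    if not candidates:
        return None  # no node has room; Compute Engine would not move the VM
    return min(candidates, key=lambda n: n["free_vcpus"])

nodes = [
    {"name": "node-1", "free_vcpus": 8},
    {"name": "node-2", "free_vcpus": 24},
    {"name": "node-3", "free_vcpus": 16},
]

# A 16 vCPU VM fits on node-2 and node-3; node-3 has less spare capacity.
print(pick_node(nodes, 16)["name"])  # node-3
```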

Use cases

Increase utilization
Schedule VMs on separate sole-tenant nodes to increase utilization.
Logically reorganize VMs
Use different sole-tenant node groups or nodes to separate VMs based on the type of their workload.
Improve performance by rebalancing oversubscribed sole-tenant nodes
Manually live migrate inadequately performing VMs that are scheduled on oversubscribed sole-tenant nodes to other nodes. This is particularly useful if you are overcommitting CPUs.
Isolate workloads to meet compliance standards or improve performance
Manually live migrate multi-tenant workloads that require hardware isolation into sole-tenancy to meet compliance standards or to improve performance.
Increase portability of VMs
You can't modify certain node template settings, such as the maintenance policy, the maintenance window, and settings related to local SSD. By using manual live migration, you can migrate VMs to a node group with different settings.
Optimize sole-tenant node costs
By manually live migrating VMs, you might be able to consolidate them onto fewer sole-tenant nodes.

Examples

To understand how manual live migration supports these use cases, review the following examples of how manually live migrating a VM from one host to another can help optimize your workloads.

Manual bin packing

Consider a sole-tenant node group with the following initial state, on which you are trying to schedule an additional VM with 16 vCPUs:

Initial state      Node 1   Node 2   Total
vCPU capacity      80       80       160
VM vCPUs           72       64, 8    144
Unused capacity    8        8        16

There is not enough space on any single node to schedule a VM with 16 vCPUs, even though there is enough aggregate space across the node group.

To make space for the 16 vCPU VM, initiate a live migration of the 8 vCPU VM from Node 2 to Node 1, and then schedule the 16 vCPU VM on Node 2. The following table shows the resulting configuration:

Final state        Node 1   Node 2   Total
vCPU capacity      80       80       160
VM vCPUs           72, 8    64, 16   160
Unused capacity    0        0        0
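The capacity arithmetic in this example can be checked with a short script. This is only a sketch with the numbers from the tables above hard-coded:

```python
# Verify the bin-packing arithmetic from the tables above.
NODE_CAPACITY = 80  # vCPUs per node in this example

def free_vcpus(placements):
    """Return per-node free vCPUs for a mapping of node -> list of VM sizes."""
    return {node: NODE_CAPACITY - sum(vms) for node, vms in placements.items()}

initial = {"node-1": [72], "node-2": [64, 8]}
print(free_vcpus(initial))                # {'node-1': 8, 'node-2': 8}
print(max(free_vcpus(initial).values()))  # 8 -> a 16 vCPU VM fits on no node

# After live migrating the 8 vCPU VM from Node 2 to Node 1:
after_move = {"node-1": [72, 8], "node-2": [64]}
print(free_vcpus(after_move)["node-2"])   # 16 -> the new VM now fits on Node 2
```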

The following figure summarizes this process:

Manual bin packing of VMs to make room for a larger VM.
Figure 1: Manual bin packing of VMs to make room for a larger VM.

Autoscaling after bin packing

Consider a sole-tenant node group with the following initial state. If you move the 8 vCPU VM, the node group autoscaler can remove a node:

Initial state      Node 1   Node 2   Total
vCPU capacity      80       80       160
VM vCPUs           8        72       80
Unused capacity    72       8        80

To empty a node so that the node group autoscaler can remove it, initiate a live migration of the 8 vCPU VM from Node 1 to Node 2. The following table shows the new VM configuration:

Final state        Node 1   Node 2   Total
vCPU capacity      80       80       160
VM vCPUs           0        72, 8    80
Unused capacity    80       0        80

Now that Node 1 is empty, the autoscaler can remove it from the node group. The following table shows the resulting node group configuration:

Final state        Node 1   Node 2   Total
vCPU capacity      -        80       80
VM vCPUs           -        72, 8    80
Unused capacity    -        0        80
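The consolidation step can be expressed as a small check for nodes the autoscaler could remove. This sketch uses the placements from the tables above:

```python
# Find nodes left empty after consolidation; the node group autoscaler
# can remove such nodes. Placements taken from the tables above.
def empty_nodes(placements):
    """Return the names of nodes with no VMs scheduled on them."""
    return [node for node, vms in placements.items() if not vms]

before = {"node-1": [8], "node-2": [72]}
after = {"node-1": [], "node-2": [72, 8]}  # 8 vCPU VM migrated to Node 2

print(empty_nodes(before))  # []
print(empty_nodes(after))   # ['node-1'] -> eligible for removal
```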

The following figure summarizes this process:

Manual bin packing of VMs to consolidate on one node. Then, the autoscaler can remove the empty node.
Figure 2: Manual bin packing of VMs to consolidate nodes.

Limitations

General limitations
Requests for live migrating VMs are best-effort and might fail if there are incompatible scheduling properties or other competing live migration requests.
Managed instance group (MIG) limitations
You cannot manually live migrate VMs that are in MIGs to another sole-tenant node.
Maintenance policy limitations

You cannot manually live migrate a VM into or out of a sole-tenant node group if the node group's maintenance policy is set to Migrate within node group.

You cannot manually live migrate a VM onto a specific node within a node group if the node group's maintenance policy is Migrate within node group.

You cannot manually live migrate a VM that has onHostMaintenance=Migrate into a node group that has the Restart in place maintenance policy because that maintenance policy requires onHostMaintenance=Terminate.

VM instance life cycle limitations

You cannot manually live migrate a VM from sole-tenancy to multi-tenancy. To move a VM out of sole-tenancy, you must stop the VM, clear the node affinity labels, and then restart the VM. For more information, see Updating VM tenancy.

You cannot update some properties of a VM, such as the size, without restarting the VM. Also, you cannot update these fields at the same time as you update node affinities. For information about these fields, see Updating instance properties.

Pricing

There are no additional charges for manually live migrating sole-tenant VMs. For more information about how you are billed for sole-tenant nodes, see Sole-tenant node pricing.

Manually live migrating sole-tenant VMs might lower your charges if, after the migration, a sole-tenant node is empty and you have enabled the sole-tenant node autoscaler. For more information about the sole-tenant node autoscaler, see Autoscaling node groups.

API rate limits

For the API rate limit, requests for manual live migration are categorized as Queries.

Manually live migrate sole-tenant VMs

Manually live migrate sole-tenant VMs by using the gcloud tool or the Compute Engine API.

Permissions required for this task

To perform this task, you must have the following permissions:

  • compute.instances.update permission on the VM.

gcloud

Manually live migrate sole-tenant VMs by using the following gcloud beta compute instances update command.

gcloud beta compute instances update VM_NAME \
    ( --node=NODE \
      | --node-group=NODE_GROUP \
      | --node-affinity-file=NODE_AFFINITY_FILE )

Replace the following:

  • VM_NAME: the name of the VM to update the node affinity labels for.

Replace exactly one of the following:

  • NODE: the name of the node to live migrate the VM to.

  • NODE_GROUP: the name of the node group to live migrate the VM to.

  • NODE_AFFINITY_FILE: the name of a JSON file containing a configuration of nodes on which this VM could be scheduled. For more information, see Configure node affinity labels.
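As a hedged example, a node affinity file that targets a specific node group might look like the following; the group name my-node-group is a placeholder, and the authoritative file format is described in Configure node affinity labels:

```json
[
  {
    "key": "compute.googleapis.com/node-group-name",
    "operator": "IN",
    "values": ["my-node-group"]
  }
]
```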

API

Manually live migrate sole-tenant VMs by using the following instances.update method.

PUT https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/instances/VM_NAME

{
  ...
  "scheduling": {
    "nodeAffinities": [
      {
        "key": "NODE_AFFINITY_LABEL_KEY",
        "operator": "IN",
        "values": [
          "[NODE_AFFINITY_LABEL_VALUE]"
        ]
      }
    ]
  }
  ...
}

Replace the following:

  • PROJECT_ID: the ID of the project that has the VM to update the node affinity labels for.

  • ZONE: the zone of the VM to update the node affinity labels for.

  • VM_NAME: the name of the VM to update the node affinity labels for.

  • NODE_AFFINITY_LABEL_KEY: one of the following strings that specifies whether to live migrate the VM to a node group or node:

    VM destination   Key to specify
    Node group       compute.googleapis.com/node-group-name
    Node             compute.googleapis.com/node-name
  • NODE_AFFINITY_LABEL_VALUE: the name of the node group or node to live migrate the VM to.
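To illustrate, the scheduling fragment shown above can be assembled programmatically before you send the PUT request. This Python sketch only builds the JSON body; how you send it (for example, with a Google API client library) and the name my-node-group are assumptions for illustration:

```python
import json

# Build the scheduling.nodeAffinities fragment for a manual live migration.
# The node affinity keys below are the ones documented for sole-tenancy;
# the destination name is a placeholder.
def node_affinity_body(destination, name):
    """destination is 'node-group' or 'node'; name is the target's name."""
    keys = {
        "node-group": "compute.googleapis.com/node-group-name",
        "node": "compute.googleapis.com/node-name",
    }
    return {
        "scheduling": {
            "nodeAffinities": [
                {"key": keys[destination], "operator": "IN", "values": [name]}
            ]
        }
    }

body = node_affinity_body("node-group", "my-node-group")
print(json.dumps(body, indent=2))
```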

What's next