Updating Managed Instance Groups

This page describes how to make updates to your managed instance groups. To learn about managed instance groups, read the Instance Groups documentation.

A managed instance group contains one or more virtual machine instances that are controlled using an instance template. To update instances in a managed instance group, you can make update requests to the group as a whole, using the Instance Group Updater feature.

Use the Instance Group Updater feature to perform rolling updates across all the instances in the group, or perform canary updates so that the update is rolled out to only a select number of instances for testing and monitoring before you commit to a full update. You can also tune each update by setting options on the update request.

Before you begin

Alpha restrictions

  • Currently, Instance Group Updater does not support regional managed instance groups.
  • During Alpha, you can perform an update in the API or in the gcloud command-line tool but not in the Google Cloud Platform Console.

Starting a basic rolling update

A rolling update is an update that is gradually applied to all instances in an instance group until all instances have been updated. You can control various aspects of a rolling update, such as how many instances can be taken offline for the update, how long to wait between updating instances, whether the update affects all or just a portion of instances, and so on.

Here are things to keep in mind when making a rolling update:

  • Updates are intent based. When you make the initial update request, the API returns a successful response to confirm that the request was valid, but that response does not indicate that the update succeeded. Check the status of the managed instance group to determine whether your update was deployed successfully.

  • The Instance Group Updater API is a declarative API. The API expects a request to specify the desired post update configuration of the managed instance group, rather than an explicit function call.

  • The Updater feature supports up to two instance template versions in your managed instance group at a time, which is useful for performing canary updates.

To start a basic rolling update where the update is applied to 100% of the instances in the group, follow the instructions below.

gcloud

Using the gcloud tool, run the rolling-action start-update command:

gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP] \
    --version template=[INSTANCE_TEMPLATE] --zone [ZONE]

where:

  • [INSTANCE_GROUP] is the name of the instance group to update.
  • [INSTANCE_TEMPLATE] is the new instance template to update the instance group to.
  • [ZONE] is the zone of the managed instance group.

API

In the API, make a PATCH request to the following URI:

https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instanceGroupManagers/[MANAGED_INSTANCE_GROUP_NAME]

The request payload contains:

  • The instance template to update the group to.
  • An update policy for the request and other update options.

The following is the minimal configuration necessary for initiating an update in the API. If you don't specify otherwise, the maxSurge and maxUnavailable properties will be set to 1, which means that the Updater will only make 1 instance unavailable at any given time, and will only create 1 additional instance above the target size of the instance group during the update. The request will update 100% of instances to the new instance template.

For example:

{
  "instanceTemplate": "global/instanceTemplates/example-template",
  "updatePolicy": {
    "type": "PROACTIVE"
  }
}

After you make a request, you can monitor the update to know when the update has finished.
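
As a rough illustration, the minimal request above could be assembled in Python as follows. This is only a sketch: the helper name build_minimal_update is hypothetical, the project, zone, and group values are placeholders, and the code builds the request without sending it.

```python
import json

API_ROOT = "https://www.googleapis.com/compute/alpha"


def build_minimal_update(project, zone, group, template):
    """Return (url, body) for the minimal PATCH request described above.

    Hypothetical helper for illustration; it only builds the request,
    it does not send it.
    """
    url = (
        f"{API_ROOT}/projects/{project}/zones/{zone}"
        f"/instanceGroupManagers/{group}"
    )
    body = {
        "instanceTemplate": f"global/instanceTemplates/{template}",
        # maxSurge and maxUnavailable are omitted, so both default to 1.
        "updatePolicy": {"type": "PROACTIVE"},
    }
    return url, body


url, body = build_minimal_update(
    "my-project", "us-central1-f", "example-group", "example-template"
)
print(url)
print(json.dumps(body, indent=2))
```

Keeping the payload construction in a pure function makes it easy to inspect the body before passing it to whichever HTTP client you use.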

Configuring options for your update

For more complex updates, you can configure additional options for a specific update request. These options are described below.

Max surge

Set the maxSurge property to allow the Updater to temporarily create new instances above the targetSize during the update. For example, if you set maxSurge to 5, the managed instance group will create up to 5 new instances above your target size, with the new instance template. Setting a higher maxSurge value will speed up your update, at the cost of additional instances, which are billed according to the Compute Engine price sheet. If you do not set the maxSurge value, the default value of 1 max surge instance is used.

This option applies only when the minimal action is REPLACE; it is not supported with the RESTART minimal action. You can specify either a fixed number, or a percentage if the managed instance group has 10 or more instances. If you set a percentage, the updater rounds the number of instances up, if necessary.

maxSurge will only work if you have enough quota or resources to support the additional resources.

Max unavailable

Set the maxUnavailable configuration so that only a certain number of instances are unavailable at any time during the update. For example, if you set maxUnavailable to 5, then only 5 instances will be taken offline for updating at a time. Use this parameter to control how disruptive the update is to your service and to control the rate at which the update is deployed.

This number also includes any instances that are unavailable for other reasons. For example, if the instance group is in the process of being resized up, instances in the middle of being created might be unavailable; these instances would count towards the maxUnavailable number. You can specify either a fixed number, or specify a percentage if the managed instance group has 10 or more instances. If you set a percentage, the updater will round down the number of instances, if necessary.

If not specified, the default value of 1 max unavailable instance is used.
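
The rounding rules for percentages can be sketched in Python. This is an illustration of the arithmetic described above, not the Updater's actual implementation; the function name resolve_instance_count and its parameters are hypothetical.

```python
def resolve_instance_count(value, group_size, round_up):
    """Turn a fixed number or a percentage string into an instance count.

    value: an int such as 5, or a percentage string such as "10%".
    round_up: True for maxSurge (rounds up), False for maxUnavailable
    (rounds down), following the rounding rules described above.
    """
    if isinstance(value, int):
        return value
    if group_size < 10:
        raise ValueError("percentages require a group of 10 or more instances")
    percent = int(value.rstrip("%"))
    scaled = group_size * percent
    # Integer arithmetic avoids floating-point surprises.
    return -(-scaled // 100) if round_up else scaled // 100


print(resolve_instance_count("10%", 25, round_up=True))   # maxSurge: 3
print(resolve_instance_count("10%", 25, round_up=False))  # maxUnavailable: 2
```

With a group of 25 instances, 10% is 2.5 instances, so maxSurge resolves to 3 and maxUnavailable resolves to 2.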

Minimum wait time

Set the minReadySeconds property to control the amount of time the Updater service should wait before considering a newly created or restarted instance as updated. The timer starts after a successful health check. Use this feature to control the rate at which the update is deployed. The maximum value for the minReadySeconds property is 3600 seconds (1 hour).

Minimal action

Set the minimal action property to control the minimum action that Updater must perform to update the instances in the group. For example, if you set REPLACE as the minimum action, then all instances that are affected will be deleted and recreated, whether or not it is necessary.

Setting a minimal action guarantees that the updater will perform that action, at a minimum. However, if the updater determines that the minimal action you specify is not enough to perform the update, it might perform a more disruptive action. For example, if you set RESTART as the minimal action, the updater will attempt to restart instances to apply the update. However, if the update requires a more disruptive action, the updater will perform that action. Changing an instance's OS, for example, cannot be done by restarting the instance so the updater will delete and recreate the instances in the instance group.

Applicable actions are REPLACE or RESTART:

  • RESTART: Restarts the instance (performs a stop and start request). Note that if your update requires the instance to be replaced to pick up the changes (for example, changing the image requires that the instance be deleted and recreated), the updater is forced to perform REPLACE instead.

  • REPLACE: Deletes the existing instance and creates a new instance from the target template.
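
One way to picture the minimal action rule is as "pick the more disruptive of two actions". The sketch below models that idea in Python; the function and the required_action input are hypothetical, since the Updater decides internally which action a change requires.

```python
# Actions ordered from least to most disruptive.
DISRUPTION_ORDER = ["RESTART", "REPLACE"]


def effective_action(minimal_action, required_action):
    """Return the action performed for one instance.

    minimal_action: the minimal action set on the update request.
    required_action: the least disruptive action that can apply the change
    (hypothetical input; the Updater determines this internally).
    """
    return max(minimal_action, required_action, key=DISRUPTION_ORDER.index)


# A new OS image cannot be applied by a restart, so REPLACE wins.
print(effective_action("RESTART", "REPLACE"))
# REPLACE as the minimal action is performed even if a restart would do.
print(effective_action("REPLACE", "RESTART"))
```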

The following diagram visualizes how these options affect your instances.

Diagram explaining how different updater options affect your request

Additional update examples

Here are some command-line examples with common configuration options.

Perform a rolling update of all virtual machine instances, but create up to 5 new instances above the target size at a time

gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP_NAME] \
    --version template=[NEW_TEMPLATE] --max-surge 5 --zone [ZONE]

Perform a rolling update with at most 3 unavailable machines and a minimum wait time of 3 minutes before marking a new instance as available

gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP_NAME] \
    --version template=[NEW_TEMPLATE] --min-ready 3m \
    --max-unavailable 3 --zone [ZONE]

Perform a rolling update of all virtual machine instances, but create up to 10% new instances above the target size at a time

gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP_NAME] \
    --version template=[NEW_TEMPLATE] --max-surge 10% --zone [ZONE]

For example, if you have 1000 instances and you run this command, the updater creates up to 100 new instances before it starts to remove instances running the previous instance template.

Starting a canary update

The Instance Group Updater feature allows you to perform canary updates, so that you can test your updates on a random subset of instances before fully committing to the update.

A canary update is an update that is applied to a partial number of instances in the instance group. Canary updates let you test new features or upgrades on a subset of instances, instead of rolling out a potentially disruptive update to all your instances. If an update is not going well, you only need to roll back a small number of instances, minimizing the disruption for your users. From the perspective of the server, a canary update is the same as a standard rolling update, except that the number of instances that should be updated is less than the total size of the instance group.

For a canary update:

  • You can specify up to two instance template versions. For example, you can specify that 20% of your instances are created based on new-instance-template while the rest of the instances continue to run on old-instance-template. You cannot specify more than two instance templates at a time.

  • You must always specify a target size (targetSize) for the canary version. You cannot specify two instance template versions and omit the target size for the canary version.

To start a canary update, make an update request with the new instance template and provide a targetSize for the new template to indicate how many instances should be updated. You can express targetSize as either a fixed number or a percentage if the managed instance group has 10 or more instances. If you set a percentage, the number of instances is rounded up, if necessary.

gcloud

Using the gcloud command-line tool, provide both the current template and the new template to explicitly express how many instances should use each template:

gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP_NAME] \
    --version template=[CURRENT_TEMPLATE] \
    --canary-version template=[NEW_TEMPLATE],target-size=[SIZE] \
    --zone [ZONE]

where:

  • [CURRENT_TEMPLATE] is the current template that the instance group is running.
  • [NEW_TEMPLATE] is the new template you want to canary.
  • [SIZE] is the number or percentage of instances you want to apply this update to. You must apply the target-size property to the --canary-version template. You can only set a percentage if the instance group contains 10 or more instances.
  • [ZONE] is the zone of the managed instance group.

For example, the following command performs a canary update that rolls out my-template-B to 10% of instances in the instance group:

gcloud alpha compute instance-groups managed rolling-action start-update my-ig1 \
        --version template=my-template-A --canary-version template=my-template-B,target-size=10%

API

In the API, make a PATCH request to the following URI:

https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instanceGroupManagers/[INSTANCE_GROUP_NAME]

The request payload should contain both the current instance template and the new instance template that you want to canary. For example:

{
 "versions": [
  {
   "instanceTemplate": "global/instanceTemplates/[NEW_TEMPLATE]",
   "targetSize": {
    "[percent|fixed]": [NUMBER|PERCENTAGE] # Use `fixed` for a specific number of instances
   }
  },
  {
   "instanceTemplate": "global/instanceTemplates/[CURRENT_TEMPLATE]"
  }
 ]
}

where:

  • [NEW_TEMPLATE] is the name of the new template you want to canary.
  • [NUMBER|PERCENTAGE] is the fixed number or percentage of instances to canary this update. You can only set a percentage if the instance group contains 10 or more instances. Otherwise, provide a fixed number instead.
  • [CURRENT_TEMPLATE] is the name of the current template that the instance group is running.
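
The canary payload above can also be generated programmatically. The following Python sketch builds the same "versions" structure; the helper name build_canary_body and the template names in the usage lines are hypothetical.

```python
import json


def build_canary_body(new_template, current_template, target_size):
    """Build the "versions" payload for a canary update.

    target_size: an int for a fixed number of instances, or a string
    such as "20%" for a percentage.
    """
    if isinstance(target_size, str) and target_size.endswith("%"):
        size = {"percent": int(target_size[:-1])}
    else:
        size = {"fixed": int(target_size)}
    return {
        "versions": [
            {
                "instanceTemplate": f"global/instanceTemplates/{new_template}",
                "targetSize": size,
            },
            # The current template carries no targetSize.
            {"instanceTemplate": f"global/instanceTemplates/{current_template}"},
        ]
    }


print(json.dumps(build_canary_body("my-template-B", "my-template-A", "10%"), indent=1))
```

Only the canary version carries a targetSize, which matches the rule that the current template is listed without one.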

Rolling forward a canary update

After running a canary update, you can decide if you want to commit the update to 100% of the instance group or roll back. If you want to commit to your canary update, roll forward the update by making the same update request, but set only the --version flag and omit the --canary-version flag. Using the gcloud command-line tool:

gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP_NAME] \
    --version template=[NEW_TEMPLATE] --zone [ZONE]

In the API, make a PATCH request to the following URI:

https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instanceGroupManagers/[INSTANCE_GROUP_NAME]

In the request body, specify the new instance template as a version and omit the old instance template from your request body. Omit the target size specification to roll the update out to 100% of instances. For example, your request body would look like this:

{
 "versions": [
  {
   "instanceTemplate": "global/instanceTemplates/[NEW_TEMPLATE]" # New instance template
  }
 ]
}

Replace [NEW_TEMPLATE] with the name of the new instance template you want to roll forward.

Starting an opportunistic or proactive update

By default, updates made using the gcloud command-line tool are proactive and updates initiated in the API are opportunistic.

For proactive updates, Compute Engine actively schedules actions to apply the requested updates to instances as necessary. In many cases, this means deleting and recreating instances proactively.

Alternatively, you can choose to perform an opportunistic update, if a proactive update is potentially too disruptive. An opportunistic update is only applied when new instances are created by the managed instance group. This typically happens when the managed instance group is resized either by another service, such as an autoscaler, or manually by a user. Compute Engine does not actively initiate requests to apply updates.

In certain scenarios, an opportunistic update is useful because you don't want to cause instability to the system if it can be avoided. For example, if you have a non-critical update that can be applied as necessary without any urgency and you have a managed instance group that is actively being autoscaled, perform an opportunistic update so that Compute Engine does not actively tear down your existing instances to apply the update.

To choose whether an update is opportunistic or proactive, set the type property to OPPORTUNISTIC or PROACTIVE using either the gcloud command-line tool or the API.

gcloud

Using the gcloud command-line tool:

gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP_NAME] \
    --version template=[INSTANCE_TEMPLATE] \
    --type [opportunistic|proactive] --zone [ZONE]

API

In the request payload to start an update, include the type property in your updatePolicy:

{
 "updatePolicy": {
  "type": "PROACTIVE" # Performs a proactive update
 },
 "versions": [
  {
   "instanceTemplate": "global/instanceTemplates/[NEW_TEMPLATE]"
  }
 ]
}

where [NEW_TEMPLATE] is the name of the new template you want to update the group to. If you want an opportunistic update, replace PROACTIVE with OPPORTUNISTIC.

Performing a rolling recreate or restart

Alternatively, you can use the restart or recreate commands to perform a rolling restart or a rolling recreate of VM instances in the managed instance group. Similar to the start-update command, you can specify any of the configuration options for a restart or a recreate.

To perform a simple rolling recreate:

gcloud alpha compute instance-groups managed rolling-action recreate [INSTANCE_GROUP]

This command performs a rolling recreate of all instances in the managed instance group, one at a time, so each instance is deleted and recreated. If a recreate is too disruptive, you can specify a rolling restart instead, which does not delete any instances and just restarts each instance.

gcloud alpha compute instance-groups managed rolling-action restart [INSTANCE_GROUP]

You can further customize each of these commands with the same options available for updates (for example, --max-surge, --max-unavailable, and --min-ready).

Additional recreate/restart examples

Perform a rolling restart of all virtual machines, two at a time

This command restarts all virtual machines in the instance group, two at a time. Notice that no new instance template is specified.

gcloud alpha compute instance-groups managed rolling-action restart [INSTANCE_GROUP_NAME] \
    --max-unavailable 2 --zone [ZONE]

Rolling restart all VMs as quickly as possible

gcloud alpha compute instance-groups managed rolling-action restart [INSTANCE_GROUP_NAME] \
    --max-unavailable 100% --zone [ZONE]

Rolling recreate all VMs as quickly as possible

gcloud alpha compute instance-groups managed rolling-action recreate [INSTANCE_GROUP_NAME] \
    --max-unavailable 100% --zone [ZONE]

Monitoring rolling updates

After you initiate a rolling update, it can take some time for the update to finish. You can monitor the state of the update by getting a list of instances that are part of the managed instance group. The Compute Engine API returns the instance status along with a list of instances.

gcloud

gcloud alpha compute instance-groups managed list-instances [INSTANCE_GROUP_NAME] --zone [ZONE]

gcloud returns a response with a list of instances in the instance group and their respective statuses. For example:

NAME               ZONE           STATUS   ACTION    INSTANCE_TEMPLATE  LAST_ERROR
vm-instances-9pk4  us-central1-f           CREATING  my-new-template
vm-instances-h2r1  us-central1-f           DELETING  my-old-template
vm-instances-j1h8  us-central1-f  RUNNING  NONE      my-old-template
vm-instances-ngod  us-central1-f  RUNNING  NONE      my-old-template

API

In the API, make a POST request to the following URI:

https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instanceGroupManagers/[INSTANCE_GROUP_NAME]/listManagedInstances

The API returns a list of instances for the group and their respective status and current action.

{
 "managedInstances": [
  {
   "instance": "https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instances/vm-instances-j701",
   "id": "4251656203855893170",
   "instanceStatus": "RUNNING",
   "instanceTemplate": "https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/global/instanceTemplates/[INSTANCE_TEMPLATE_NAME]",
   "currentAction": "REFRESHING"
  },
  {
   "instance": "https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instances/vm-instances-prvp",
   "id": "5317605642920955957",
   "instanceStatus": "RUNNING",
   "instanceTemplate": "https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/global/instanceTemplates/[INSTANCE_TEMPLATE_NAME]",
   "currentAction": "REFRESHING"
  },
  {
   "instance": "https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instances/vm-instances-pz5j",
   "currentAction": "DELETING"
  },
  {
   "instance": "https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instances/vm-instances-w2t5",
   "id": "2800161036826218547",
   "instanceStatus": "RUNNING",
   "instanceTemplate": "https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/global/instanceTemplates/[INSTANCE_TEMPLATE_NAME]",
   "currentAction": "REFRESHING"
  }
 ]
}

Group status and pending and current actions

An instance's current status is described by the instanceStatus field of the instance. If the instance is undergoing some type of change, the currentAction field will also be populated to help you track the status of updates. When the instance is successfully updated, the instanceStatus field will reflect the instance's current state. To see a list of valid instanceStatus fields, see the documentation for Checking Instance Status.

If an instance is undergoing some type of change, the currentAction field will be populated with one of the following statuses. Otherwise, the currentAction field will be empty.

  • CREATING: The instance is in the process of being created.
  • CREATING_WITHOUT_RETRIES: The instance is being created without retries; if the instance fails to be created on the first try, the managed instance group will not try to recreate the instance again.
  • DELETING: The instance is in the process of being deleted.
  • RESTARTING: The instance is in the process of being restarted using the stop and start methods.
  • REFRESHING: The instance is being removed from its current target pools and being readded to the list of current target pools (this list might be the same or different from existing target pools).
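
To track progress from a script, you could tally the currentAction values in a listManagedInstances response. The Python sketch below assumes the response has already been parsed into a dict and treats an empty or missing currentAction, or the value NONE, as "no action in progress"; the function names and the shortened sample data are hypothetical.

```python
from collections import Counter


def summarize_actions(response):
    """Count instances per currentAction in a listManagedInstances response."""
    return Counter(
        item.get("currentAction") or "NONE"
        for item in response.get("managedInstances", [])
    )


def is_update_finished(response, template_name):
    """Return True when every instance runs template_name and is idle."""
    instances = response.get("managedInstances", [])
    return bool(instances) and all(
        item.get("instanceTemplate", "").endswith("/" + template_name)
        and (item.get("currentAction") or "NONE") == "NONE"
        for item in instances
    )


# Shortened, made-up sample in the shape of the response shown above.
sample = {
    "managedInstances": [
        {
            "instance": "zones/us-central1-f/instances/vm-instances-j701",
            "instanceStatus": "RUNNING",
            "instanceTemplate": "global/instanceTemplates/my-new-template",
            "currentAction": "REFRESHING",
        },
        {
            "instance": "zones/us-central1-f/instances/vm-instances-pz5j",
            "currentAction": "DELETING",
        },
    ]
}
print(dict(summarize_actions(sample)))
print(is_update_finished(sample, "my-new-template"))
```

Polling these two helpers until is_update_finished returns True is one simple way to wait for a rolling update to settle.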

At the managed instance group level, Compute Engine populates a field called pendingActions that describes the number of instances currently awaiting a specific action. For example, the pendingActions field might return the following counts:

CREATING: 3
DELETING: 3
RECREATING: 2
RESTARTING: 1

This indicates that three instances will be created, three instances are pending deletion, two instances will be recreated, and one instance will be restarted.

Rolling back an update

There is no explicit command for rolling back an update to a previous version. If you decide that you want to roll back an update (either a fully committed or a canary update), make a new update request and pass in the instance template that you want to roll back to.

For example, the following command rolls back an update as fast as possible:

gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP_NAME] \
    --version template=[OLD_INSTANCE_TEMPLATE] --max-unavailable 100% --zone [ZONE]

Replace [OLD_INSTANCE_TEMPLATE] with the name of the old instance template you want to roll back to.

In the API, make a PATCH request to the following URI:

https://www.googleapis.com/compute/alpha/projects/[PROJECT_ID]/zones/[ZONE]/instanceGroupManagers/[INSTANCE_GROUP_NAME]

In the request body, specify the old instance template as a version:

{
 "updatePolicy": {
  "maxUnavailable": {
   "percent": 100
  }
 },
 "versions": [
  {
   "instanceTemplate": "global/instanceTemplates/[OLD_INSTANCE_TEMPLATE]" # Old instance template
  }
 ]
}

The Instance Group Updater service treats this as a regular update request so all of the update options described in this document can be specified with your request.

Controlling the speed of an update

By default, when you make an update request, the service performs the update as fast as possible. If you aren't sure you want to apply an update fully or are tentatively testing your changes, you can apply the update slowly using the following methods:

  1. Start a canary update rather than a full update.
  2. Set a large minReadySeconds value. Setting this value causes the service to wait this number of seconds before considering the instance successfully updated and proceeding to the next instance.
  3. Set low maxUnavailable and maxSurge values. This ensures that only a minimal number of instances are updated at a time.

You can also use a combination of these parameters to control the rate of your update.
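
As a sketch of how these settings might be combined, the Python helper below assembles a cautious updatePolicy. The helper name is hypothetical, and the payload shape is partly assumed: this page shows maxUnavailable with a "percent" key and targetSize with a "fixed" key, so using "fixed" for maxUnavailable and maxSurge, and placing minReadySeconds inside updatePolicy, are assumptions rather than confirmed API details.

```python
MAX_MIN_READY_SECONDS = 3600  # upper limit stated for minReadySeconds


def build_slow_update_policy(min_ready_seconds, max_unavailable=1, max_surge=1):
    """Build an updatePolicy that favors a slow, cautious rollout."""
    if not 0 <= min_ready_seconds <= MAX_MIN_READY_SECONDS:
        raise ValueError("minReadySeconds must be between 0 and 3600")
    return {
        "type": "PROACTIVE",
        # Assumed location: minReadySeconds is placed inside updatePolicy.
        "minReadySeconds": min_ready_seconds,
        # Assumed shape: "fixed" key by analogy with targetSize.
        "maxUnavailable": {"fixed": max_unavailable},
        "maxSurge": {"fixed": max_surge},
    }


# Wait 10 minutes per instance and touch one instance at a time.
print(build_slow_update_policy(600))
```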

Stopping an update

There is no explicit method or command to stop an update. However, the gcloud tool provides a command to convert an update from proactive to opportunistic. If the managed instance group is not being resized by other services, such as an autoscaler, the change to opportunistic effectively "stops" the update.

To change an update from proactive to opportunistic, run the following command:

gcloud alpha compute instance-groups managed rolling-action stop-proactive-update [INSTANCE_GROUP_NAME] \
    --zone [ZONE]

If you decide you want to stop the update completely after converting it from proactive to opportunistic, you can stop it using these steps:

  1. Make a request to determine how many instances have been updated.

    gcloud alpha compute instance-groups managed list-instances [INSTANCE_GROUP_NAME] \
        --zone [ZONE]

    The gcloud tool returns a response with a list of instances in the instance group and their current statuses:

    NAME               ZONE           STATUS   ACTION    INSTANCE_TEMPLATE  LAST_ERROR
    vm-instances-9pk4  us-central1-f  RUNNING  NONE      my-new-template
    vm-instances-j1h8  us-central1-f  RUNNING  NONE      my-new-template
    vm-instances-ngod  us-central1-f  RUNNING  NONE      my-old-template
    

    In this example, two instances have already been updated.

  2. Next, make a request to perform a new "update" but pass in the number of instances that have already been updated as the target size:

    gcloud alpha compute instance-groups managed rolling-action start-update [INSTANCE_GROUP_NAME] \
        --version template=my-old-template \
        --canary-version template=my-new-template,target-size=2 \
        --zone [ZONE]

    To the Updater service, this update appears "complete", so no other instances are updated, effectively stopping the update.
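
The two steps above can be scripted. The Python sketch below counts the instances already on the new template and assembles the gcloud arguments for step 2; it assumes the list-instances output has been parsed into dicts keyed by column name, and both function names and the group name are hypothetical. It builds the command without running it.

```python
def count_on_template(rows, template):
    """Count instances already running the given template.

    rows: dicts with the columns printed by list-instances.
    """
    return sum(1 for row in rows if row["INSTANCE_TEMPLATE"] == template)


def freeze_update_args(group, zone, old_template, new_template, rows):
    """Return the gcloud argument list for step 2 above."""
    size = count_on_template(rows, new_template)
    return [
        "gcloud", "alpha", "compute", "instance-groups", "managed",
        "rolling-action", "start-update", group,
        "--version", f"template={old_template}",
        "--canary-version", f"template={new_template},target-size={size}",
        "--zone", zone,
    ]


rows = [
    {"NAME": "vm-instances-9pk4", "INSTANCE_TEMPLATE": "my-new-template"},
    {"NAME": "vm-instances-j1h8", "INSTANCE_TEMPLATE": "my-new-template"},
    {"NAME": "vm-instances-ngod", "INSTANCE_TEMPLATE": "my-old-template"},
]
print(" ".join(freeze_update_args(
    "example-group", "us-central1-f", "my-old-template", "my-new-template", rows
)))
```

Because the target size is computed from the current instance list, run the count immediately before issuing the command so that it reflects the latest state.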

Relationship between instanceTemplate properties for a managed instance group

A managed instance group could potentially have up to two instanceTemplate properties: a top-level instanceTemplate property, and an instanceTemplate for performing a canary update. For example, the following managed instance group has two instance template properties because it is in the middle of a canary update:

..[snip]..
instanceGroup: https://www.googleapis.com/compute/alpha/projects/myproject/zones/us-east1-b/instanceGroups/example-instance-group
instanceTemplate: https://www.googleapis.com/compute/alpha/projects/myproject/global/instanceTemplates/example-template
versions:
- instanceTemplate: https://www.googleapis.com/compute/alpha/projects/myproject/global/instanceTemplates/example-template-2
  targetSize:
    calculated: 3
    fixed: 3
- instanceTemplate: https://www.googleapis.com/compute/alpha/projects/myproject/global/instanceTemplates/example-template
zone: https://www.googleapis.com/compute/alpha/projects/myproject/zones/us-east1-b
..[snip]..

In general, the top-level instanceTemplate property reflects what the Updater considers the "current" template. So:

  • If an instance group does not have the versions field set, then the instanceTemplate property reflects the current instance template being used by the instance group manager.

  • If there is a versions field set, the instanceTemplate property reflects the non-canary template. That is, it reflects the template that has no targetSize set because the Updater assumes that the template with the target size set is a canary template that is still being tested.
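
The two rules above can be expressed as a small lookup. This Python sketch assumes the instance group manager resource has been loaded into a dict with the field names shown in the example; the function name is hypothetical, and the final fallback is an assumption for the case where every version has a targetSize.

```python
def current_template(group):
    """Return the template the Updater treats as "current".

    group: a dict holding the instance group manager fields shown above.
    """
    versions = group.get("versions")
    if not versions:
        return group["instanceTemplate"]
    for version in versions:
        # The non-canary version is the one without a targetSize.
        if "targetSize" not in version:
            return version["instanceTemplate"]
    # Fallback assumption: defer to the top-level property.
    return group["instanceTemplate"]


group = {
    "instanceTemplate": "global/instanceTemplates/example-template",
    "versions": [
        {
            "instanceTemplate": "global/instanceTemplates/example-template-2",
            "targetSize": {"calculated": 3, "fixed": 3},
        },
        {"instanceTemplate": "global/instanceTemplates/example-template"},
    ],
}
print(current_template(group))
```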

Feedback and Questions

We welcome your feedback and questions! Please send your questions to the Alpha discussion group.
