Performing one-click OS image upgrades in MIGs

By using a combination of image families and rolling updates, you can enable one-click OS image upgrades on your managed instance group (MIG).

Using the one-click OS image upgrade provides a number of benefits, including:

  • Works with all VM machine types and all instance group sizes.
  • Supports both Windows and Linux images and containers.
  • Preserves custom startup scripts and metadata.
  • Recreates instances based on their current instance template or, optionally, on a new template.
  • Rolls out the new OS version automatically, with no further user input after the initial request.
  • Supports batch updates with an optional health check.

How does one-click OS image upgrade work?

When you invoke an update, the MIG replaces the boot disks for all VMs in the group with the latest available OS image version. All metadata and startup scripts that you set up in the instance template are preserved for each VM in the group. To limit application disruption, you can perform updates in batches, keeping a specified percentage of VMs running during the update. To increase reliability, you can configure an application-based health check for your MIG: the group waits for a healthy response from the application on updated VMs before it proceeds to update other VMs.
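
To make the batching behavior concrete, the following sketch works out the bounds implied by fixed-number surge and unavailability limits. The helper name update_bounds is hypothetical and not part of gcloud. It assumes both limits are given as fixed numbers; percentage values are resolved by the MIG itself and are not modeled here.

```shell
# Sketch only: update_bounds is a hypothetical helper, not a gcloud command.
# Arguments: group size, max surge, max unavailable (all fixed numbers).
update_bounds() {
  local size="$1" max_surge="$2" max_unavailable="$3"

  # Most VMs that can exist at once: target size plus temporary surge VMs.
  local peak=$(( size + max_surge ))

  # Fewest VMs kept available: target size minus VMs allowed to be unavailable.
  local floor=$(( size - max_unavailable ))
  if [ "$floor" -lt 0 ]; then
    floor=0
  fi

  echo "peak_vms=${peak} min_available=${floor}"
}
```

For a group of 3 VMs with a surge of 1 and no unavailable VMs allowed, update_bounds 3 1 0 prints peak_vms=4 min_available=3: the group temporarily grows to four VMs and never drops below three available ones.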

Before you begin

  1. Install or update to the latest version of the gcloud command-line tool.

  2. Make sure you have created an instance template that points to an image family. You can use image families offered by Compute Engine or create your own custom image families with custom images.

    When your instance template points to an image family, the MIG always creates instances from the latest image in the family, for example:

    • When the MIG adds new instances because you or the MIG's autoscaler increased the MIG's size.
    • When the MIG recreates an instance, triggered manually or by autohealing.
  3. Check if a new image is available. You can use the gcloud compute images list command to see a list of images:

    gcloud compute images list --uri
    
  4. Test the new image with your app before rolling it out.

  5. Optionally, create an application-based health check for your MIG. An application-based health check verifies that your application is responding as expected on each of the VMs in the MIG. You can configure your update to allow no more than one unavailable VM. If an application does not respond as expected, then the MIG marks that VM as unavailable, and your rollout does not proceed.
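
For step 3, listing every image can be noisy when you only care about one family. The following is a minimal sketch that asks for the newest image in a single family by using the gcloud compute images describe-from-family command. The helper names latest_image_in_family and new_image_available are hypothetical, and the sketch assumes you keep a record of the image name that you last rolled out.

```shell
# Sketch only: both function names are hypothetical helpers.

# Print the name of the newest image in a family.
# Arguments: image family, image project.
latest_image_in_family() {
  local family="$1" project="$2"
  gcloud compute images describe-from-family "$family" \
      --project "$project" \
      --format "value(name)"
}

# Succeed (exit status 0) if the family's newest image differs from the
# image name that you recorded at your last rollout.
# Arguments: known image name, image family, image project.
new_image_available() {
  local known="$1" family="$2" project="$3"
  local latest
  latest="$(latest_image_in_family "$family" "$project")" || return 2
  [ "$latest" != "$known" ]
}
```

For example, latest_image_in_family debian-9 debian-cloud prints the image that a new VM in this family would currently boot from.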

Performing one-click OS image upgrades for MIGs

To update all the VMs in a MIG to the latest image from an image family, follow these steps:

  1. Start a rolling replace with the following command.

    gcloud compute instance-groups managed rolling-action replace instance-group-name \
        [--max-surge=max-surge] [--max-unavailable=max-unavailable]

    Replace the following:

    • instance-group-name: The name of the MIG to operate on.
    • max-surge: The maximum additional number of VMs that can be temporarily created during the update process. This can be a fixed number (for example, 5) or a percentage of the size of the MIG (for example, 10%).
    • max-unavailable: The maximum number of VMs that can be unavailable during the update process. This can be a fixed number (5) or a percentage of the size of the MIG (10%).

    You can combine health checks with the --max-unavailable and --max-surge options to stop further updates if they cause VMs to become unavailable.

  2. Monitor the update by using the wait-until subcommand to check that the MIG's status.versionTarget.isReached field is set to true.

    gcloud compute instance-groups managed wait-until instance-group-name --version-target-reached

    Replace the following:

    • instance-group-name: The name of the MIG to operate on.

    The command returns when the group is updated.

    You can also list instances to see each instance's status.

    gcloud beta compute instance-groups managed list-instances instance-group-name

    The command returns a list of instances and their details, including status, health state, and current actions for each VM. When all VMs are RUNNING and have no current action, then the MIG is up-to-date and stable.

  3. In case you need to roll back to a previous OS image, you must create an instance template and specify the image you want to use. Then start a rolling update to update all managed instances to use that template. For more information, see Rolling back an update.
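
The stability rule from step 2 (every VM is RUNNING and has no current action) can be checked in a script. The following is a rough sketch: mig_is_stable is a hypothetical helper that reads one "STATUS ACTION" pair per line from standard input. The field names in the usage comment (instanceStatus and currentAction) are assumptions about the list-instances output format, so verify them against your gcloud version.

```shell
# Sketch only: mig_is_stable is a hypothetical helper, not a gcloud command.
# Reads whitespace-separated "STATUS ACTION" pairs, one VM per line, and
# succeeds only if at least one VM is listed and every VM is RUNNING with
# the action NONE.
mig_is_stable() {
  local status action seen=0
  # The "|| [ -n ... ]" part handles a final line without a trailing newline.
  while read -r status action || [ -n "$status" ]; do
    if [ -z "$status" ]; then
      continue  # Skip blank lines.
    fi
    seen=1
    if [ "$status" != "RUNNING" ] || [ "$action" != "NONE" ]; then
      return 1
    fi
  done
  [ "$seen" -eq 1 ]
}

# Possible usage (field names are assumed, see the note above):
#   gcloud beta compute instance-groups managed list-instances instance-group-name \
#       --format "value(instanceStatus,currentAction)" | mig_is_stable \
#     && echo "MIG is up-to-date and stable"
```

Reading from standard input keeps the check separate from the gcloud call, so you can try it on saved output before wiring it into automation.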

Example

This example covers the following tasks:

  1. Create an instance template for easy OS image updates:
    • Specify an image family in the instance template.
    • Specify a start-up script that starts your app when each VM starts.
  2. Create a MIG based on the template.
  3. Set up a health check to limit disruption by an image update.
  4. Invoke the OS update with a single command.
  5. Monitor the update.

Use the following steps to enable and perform one-click OS upgrades on a MIG:

  1. Create an instance template that specifies an image family and a metadata startup script to start your app. In the following example, the template specifies the debian-9 image family and a startup script that gets and launches an app from GitHub. Each VM that the MIG creates based on this template uses the latest available image from this family and runs the script on startup.

    gcloud compute instance-templates create example-template \
        --machine-type n1-standard-4 \
        --image-family debian-9 \
        --image-project debian-cloud \
        --tags=http-server \
        --metadata startup-script='sudo apt-get update && sudo apt-get install git gunicorn3 python3-pip -y
    git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
    cd python-docs-samples/compute/managed-instances/demo
    sudo pip3 install -r requirements.txt
    sudo gunicorn3 --bind 0.0.0.0:80 app:app --daemon
    '
    
  2. Create a MIG based on the instance template. This example starts the MIG with three instances based on example-template. Because the instance template specifies an image family, the MIG creates each VM with the latest image from the family.

    gcloud compute instance-groups managed create example-group \
      --base-instance-name example \
      --size 3 \
      --zone us-east1-b \
      --template example-template
    
  3. Optional: Configure and enable an application-based health check. If your app doesn't respond after an image update, you can use the health check status combined with the maxUnavailable setting to stop the MIG from further rollouts.

    1. Create a health check that looks for an HTTP 200 response on the request path /health. The GitHub app that is on each instance serves that path.

      gcloud compute health-checks create http example-autohealer-check \
          --check-interval 10 \
          --timeout 5 \
          --healthy-threshold 2 \
          --unhealthy-threshold 3 \
          --request-path "/health"
      
    2. Create a firewall rule to allow the health checker probes to access the instances. The health checker probes come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16.

      gcloud compute firewall-rules create default-allow-http-health-check \
          --network default \
          --allow tcp:80 \
          --source-ranges 130.211.0.0/22,35.191.0.0/16
      
    3. Add the health check to your MIG.

      gcloud compute instance-groups managed update example-group \
          --zone us-east1-b --health-check example-autohealer-check
      
  4. When a new image is available, invoke a rolling replace to replace all VMs in the MIG. The MIG will replace each VM with a new VM that has the latest image in the family, per the instance template. You can configure the level of disruption that the update causes. In this example, the MIG creates one additional VM above the group's target size, and it does not remove any existing VMs until that one VM is up and running.

    gcloud compute instance-groups managed rolling-action replace example-group \
        --max-surge 1 --max-unavailable 0
    
  5. If you want to monitor the status of the updates, use the wait-until command with the --version-target-reached flag. The command returns when the group is updated.

    gcloud compute instance-groups managed wait-until --version-target-reached example-group \
        --zone us-east1-
    Waiting for group to reach version target
    ...
    Version target is reached
    

    You can also use the list-instances command to see the status, health state, current actions, instance template, and version for each VM.

    gcloud beta compute instance-groups managed list-instances example-group \
        --zone us-east1-b
    NAME       ZONE        STATUS   HEALTH_STATE  ACTION     INSTANCE_TEMPLATE  VERSION_NAME                        LAST_ERROR
    test-211p  us-east1-b  RUNNING  HEALTHY       NONE       example-template   0/2020-01-30 13:34:28.843377+00:00
    test-t5qb  us-east1-b  RUNNING  UNKNOWN       VERIFYING  example-template   0/2020-01-30 13:34:28.843377+00:00
    test-x331  us-east1-b  RUNNING  HEALTHY       NONE       example-template   0/2020-01-20 20:39:51.819399+00:00

  6. If you need to roll back to a previous image, use the following steps:

    1. Create a new instance template that specifies the image that you want.
    2. Start a rolling update to apply the instance template.
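
The two rollback steps can be sketched as a single helper. This is an illustration under stated assumptions, not a definitive procedure: rollback_mig is a hypothetical function name, the template and image names that you pass in are your own, and any additional arguments are forwarded to the template creation command so that you can repeat settings such as the machine type, tags, and startup script from the original template.

```shell
# Sketch only: rollback_mig is a hypothetical helper, not a gcloud command.
# Arguments: group, zone, new template name, image name, image project,
# followed by any extra flags for "instance-templates create".
rollback_mig() {
  local group="$1" zone="$2" template="$3" image="$4" image_project="$5"
  shift 5

  # 1. Create a template pinned to one specific image instead of a family.
  gcloud compute instance-templates create "$template" \
      --image "$image" \
      --image-project "$image_project" \
      "$@" || return 1

  # 2. Start a rolling update that moves all managed instances to the template.
  gcloud compute instance-groups managed rolling-action start-update "$group" \
      --version "template=${template}" \
      --zone "$zone"
}
```

Pinning the template to an image name rather than an image family matters here: a template that points to the family would resolve to the newest image again, which is the one you are rolling back from.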

What's next