Create a MIG with autoscaling enabled

This document describes how to create an autoscaled managed instance group (MIG) that automatically adds and removes VMs based on average CPU utilization across the group. For example, if the group's CPU utilization is low, the group automatically removes VMs to save on costs.

You can automatically scale a MIG based on various kinds of autoscaling signals. For more information, see the autoscaler overview.

You can also read about other basic scenarios for creating a MIG.

Before you begin

Limitations

To see the full list of MIG limitations, which varies based on the configuration that you use, see MIG limitations.

Create a MIG and enable autoscaling

Use the Cloud console, the gcloud CLI, or the Compute Engine API.

Console

  1. In the console, go to the Instance groups page.

    Go to Instance groups

  2. If you have an instance group, select it and click Edit. If you don't have an instance group, click Create instance group.

  3. If no autoscaling configuration exists, under Autoscaling, click Configure autoscaling.

  4. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.

  5. Specify the minimum and maximum numbers of instances that you want the autoscaler to create in this group.

  6. In the Autoscaling metrics section, if a CPU utilization metric does not yet exist, add one:

    1. Click Add metric.
    2. Under Metric type, select CPU utilization.
    3. Enter the Target CPU utilization that you want. This value is treated as a percentage. For example, for 75% CPU utilization, enter 75.
    4. Click Done.
  7. Under Predictive autoscaling, select Off. To learn more about predictive autoscaling, and whether it is suitable for your workload, see Scaling based on predictions.

  8. You can use the Cool down period to tell the autoscaler how long it takes for your application to initialize. Specifying an accurate cool down period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default cool down period is 60 seconds.

  9. Click Save.
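
The console takes the target CPU utilization as a percentage (75 for 75%), while the gcloud CLI and API examples in this document express the same kind of target as a fraction (0.60 for 60%). The following is a minimal sketch of that conversion. The helper name is hypothetical and is not part of any Google Cloud library.

```python
def percent_to_utilization_target(percent: float) -> float:
    """Convert a console-style percentage (for example, 75) to a fraction (0.75)."""
    if not 0 < percent <= 100:
        raise ValueError(f"expected a percentage in (0, 100], got {percent}")
    return percent / 100


# A console value of 75 corresponds to a fractional target of 0.75.
print(percent_to_utilization_target(75))
```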

gcloud

Before you can enable autoscaling, you must create a MIG. Follow the instructions to create a MIG with VMs confined to a single zone or create a MIG with VMs spread across multiple zones in a region.

Then use the set-autoscaling sub-command to enable autoscaling for the group. For example, the following command creates an autoscaler that has a target CPU utilization of 60%. Specify the target as a fraction between 0 and 1, so 0.60 means 60%. When you create an autoscaler, the --max-num-replicas flag is required in addition to the --target-cpu-utilization flag:

gcloud compute instance-groups managed set-autoscaling example-managed-instance-group \
  --max-num-replicas 20 \
  --target-cpu-utilization 0.60 \
  --cool-down-period 90

You can use the --cool-down-period flag to tell the autoscaler how long it takes for your application to initialize. Specifying an accurate cool down period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default cool down period is 60 seconds.
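
If you run this command from a script, it can help to assemble the arguments in one place and validate them before calling gcloud. The following sketch only builds and prints the command line. The function name is hypothetical, and the default of 60 seconds mirrors the default cool down period stated above.

```python
import shlex
from typing import List


def build_set_autoscaling_command(
    group: str,
    max_replicas: int,
    target_cpu: float,
    cool_down_sec: int = 60,
) -> List[str]:
    """Build the argument list for `gcloud ... set-autoscaling` (does not run it)."""
    if not 0 < target_cpu <= 1:
        raise ValueError("target_cpu must be a fraction in (0, 1], for example 0.6")
    if max_replicas < 1:
        raise ValueError("max_replicas must be at least 1")
    return [
        "gcloud", "compute", "instance-groups", "managed", "set-autoscaling",
        group,
        "--max-num-replicas", str(max_replicas),
        "--target-cpu-utilization", str(target_cpu),
        "--cool-down-period", str(cool_down_sec),
    ]


cmd = build_set_autoscaling_command("example-managed-instance-group", 20, 0.6, 90)
# Print a shell-safe version of the command for review before running it.
print(shlex.join(cmd))
```

To execute the command, you could pass the returned list to subprocess.run, provided that the gcloud CLI is installed and authenticated.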

Optionally, you can enable predictive autoscaling to scale out ahead of predicted load. To learn whether predictive autoscaling is suitable for your workload, see Scaling based on predictions.

You can verify that autoscaling is successfully enabled by using the instance-groups managed describe command, which describes the corresponding MIG and provides information about any autoscaling features for that group:

gcloud compute instance-groups managed describe example-managed-instance-group
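
To check the result in a script rather than by eye, you can add --format=json to the describe command and inspect the parsed output. The following sketch assumes that the autoscaling configuration appears under an autoscaler key that contains an autoscalingPolicy object. That layout is an assumption to verify against your own output, and the sample input is hand-written for illustration.

```python
import json


def autoscaling_summary(describe_json: str) -> dict:
    """Summarize the autoscaling policy found in `describe --format=json` output."""
    group = json.loads(describe_json)
    # Assumption: the autoscaler, when present, is nested under the "autoscaler" key.
    policy = (group.get("autoscaler") or {}).get("autoscalingPolicy") or {}
    return {
        "enabled": bool(policy),
        "max_replicas": policy.get("maxNumReplicas"),
        "cpu_target": (policy.get("cpuUtilization") or {}).get("utilizationTarget"),
        "cool_down_sec": policy.get("coolDownPeriodSec"),
    }


# Hand-written sample input, not real command output.
sample = json.dumps({
    "name": "example-managed-instance-group",
    "autoscaler": {
        "autoscalingPolicy": {
            "maxNumReplicas": 20,
            "cpuUtilization": {"utilizationTarget": 0.6},
            "coolDownPeriodSec": 90,
        }
    },
})
print(autoscaling_summary(sample))
```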

API

Before you can enable autoscaling, you must create a MIG with VMs confined to a single zone or create a MIG with VMs spread across multiple zones in a region.

If you have a zonal MIG, make a POST request to the autoscalers.insert method. If you have a regional MIG, use the regionAutoscalers.insert method.

For example, for a zonal MIG:

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers/
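
The zonal and regional methods differ only in the scope segment of the URL. The following sketch builds the collection URL for either case. The function name is hypothetical, and the project, zone, and region values are placeholders.

```python
from typing import Optional

BASE_URL = "https://compute.googleapis.com/compute/v1"


def autoscalers_url(
    project: str, zone: Optional[str] = None, region: Optional[str] = None
) -> str:
    """Return the autoscalers collection URL for a zonal or a regional MIG."""
    if (zone is None) == (region is None):
        raise ValueError("specify exactly one of zone or region")
    # Zonal MIGs use autoscalers.insert; regional MIGs use regionAutoscalers.insert.
    scope = f"zones/{zone}" if zone is not None else f"regions/{region}"
    return f"{BASE_URL}/projects/{project}/{scope}/autoscalers"


print(autoscalers_url("PROJECT_ID", zone="ZONE"))
print(autoscalers_url("PROJECT_ID", region="REGION"))
```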

Your request body must contain the name, target, and autoscalingPolicy fields. The autoscalingPolicy field must define your target cpuUtilization value and maxNumReplicas value.

You can use the coolDownPeriodSec field to tell the autoscaler how long it takes for your application to initialize. Specifying an accurate cool down period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default cool down period is 60 seconds.

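Optionally, you can enable predictive autoscaling to scale out ahead of predicted load. To learn whether predictive autoscaling is suitable for your workload, see Scaling based on predictions.

The following example request body sets a target CPU utilization of 60%, a maximum of 10 VMs, and a cool down period of 90 seconds: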

{
  "name": "example-autoscaler",
  "target": "https://www.googleapis.com/compute/v1/projects/myproject/zones/us-central1-f/instanceGroupManagers/example-managed-instance-group",
  "autoscalingPolicy": {
    "maxNumReplicas": 10,
    "cpuUtilization": {
      "utilizationTarget": 0.6
    },
    "coolDownPeriodSec": 90
  }
}
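
If you generate the request body in code, you can validate the required fields before sending the request. The following sketch builds the same body as the JSON example. The function name is hypothetical, and the code does not send the request. Sending it requires an authenticated HTTP client, which is outside the scope of this sketch.

```python
import json


def build_autoscaler_body(
    name: str,
    target_url: str,
    max_replicas: int,
    cpu_target: float,
    cool_down_sec: int = 60,
) -> dict:
    """Build a request body with the name, target, and autoscalingPolicy fields."""
    if not 0 < cpu_target <= 1:
        raise ValueError("cpu_target must be a fraction in (0, 1], for example 0.6")
    if max_replicas < 1:
        raise ValueError("max_replicas must be at least 1")
    return {
        "name": name,
        "target": target_url,
        "autoscalingPolicy": {
            "maxNumReplicas": max_replicas,
            "cpuUtilization": {"utilizationTarget": cpu_target},
            "coolDownPeriodSec": cool_down_sec,
        },
    }


body = build_autoscaler_body(
    name="example-autoscaler",
    target_url=(
        "https://www.googleapis.com/compute/v1/projects/myproject/zones/"
        "us-central1-f/instanceGroupManagers/example-managed-instance-group"
    ),
    max_replicas=10,
    cpu_target=0.6,
    cool_down_sec=90,
)
print(json.dumps(body, indent=2))
```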

For more information about enabling autoscaling based on CPU utilization, see Scaling based on CPU utilization.

What's next