This document describes how to create an autoscaled managed instance group (MIG) that automatically adds and removes VMs based on average CPU utilization across the group. For example, if the group's CPU utilization is low, the group automatically removes VMs to save on costs.
You can automatically scale a MIG based on various kinds of autoscaling signals. For more information, see the autoscaler overview.
You can also read about other basic scenarios for creating a MIG.
Before you begin
- Create an instance template, which is required in order to create a managed instance group.
- If you haven't already, then set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
- Install the Google Cloud CLI, then initialize it by running the following command:
  gcloud init
- Set a default region and zone.
Terraform
To use the Terraform samples on this page in a local development environment, install and initialize the gcloud CLI, and then set up Application Default Credentials with your user credentials.
- Install the Google Cloud CLI.
- To initialize the gcloud CLI, run the following command:
  gcloud init
- If you're using a local shell, then create local authentication credentials for your user account:
  gcloud auth application-default login
  You don't need to do this if you're using Cloud Shell.
For more information, see Set up authentication for a local development environment.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI, then initialize it by running the following command:
gcloud init
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Limitations
To see the full list of MIG limitations, which varies based on the configuration that you use, see MIG limitations.
Create a MIG and enable autoscaling
Use the Google Cloud console, the gcloud CLI, Terraform, or REST.
Console
- In the console, go to the Instance groups page.
- If you have an instance group, select it and click Edit. If you don't have an instance group, click Create instance group.
- For a new instance group, assign a name, then choose an instance template for the instance group or create a new one.
- If no autoscaling configuration exists, under Autoscaling, click Configure autoscaling.
- Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
- Specify the minimum and maximum numbers of instances that you want the autoscaler to create in this group.
- In the Autoscaling metrics section, if a CPU utilization metric does not yet exist, add one:
  - Click Add metric.
  - Under Metric type, select CPU utilization.
  - Enter the Target CPU utilization that you want. This value is treated as a percentage. For example, for 75% CPU utilization, enter 75.
  - Under Predictive autoscaling, select Off. To learn more about predictive autoscaling, and whether it is suitable for your workload, see Scaling based on predictions.
  - Click Done.
- Use Initialization period to set the initialization period, which tells the autoscaler how long it takes for your application to initialize. Specifying an accurate initialization period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default initialization period is 60 seconds.
- To create the MIG, click Create.
gcloud
Before you can enable autoscaling, you must create a MIG. Follow the instructions to create a MIG with VMs confined to a single zone or create a MIG with VMs spread across multiple zones in a region.
Then use the set-autoscaling sub-command to enable autoscaling for the group. For example, the following command creates an autoscaler that has a target CPU utilization of 60%. Along with the --target-cpu-utilization parameter, the --max-num-replicas parameter is also required when creating an autoscaler.
Optionally, you can set the --min-num-replicas flag to indicate the minimum number of VMs that you want in the group. If you don't set the minimum, by default, MIG sets this value to 2.
You can use the --cool-down-period flag to set the initialization period, which tells the autoscaler how long it takes for your application to initialize. Specifying an accurate initialization period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default initialization period is 60 seconds.
gcloud compute instance-groups managed set-autoscaling example-managed-instance-group \
    --max-num-replicas 20 \
    --target-cpu-utilization 0.60 \
    --cool-down-period 90
If you want, you can enable predictive autoscaling to scale out ahead of predicted load. To learn whether predictive autoscaling is suitable for your workload, see Scaling based on predictions.
You can verify that autoscaling is successfully enabled by using the instance-groups managed describe command, which describes the corresponding MIG and provides information about any autoscaling features for that group:
gcloud compute instance-groups managed describe example-managed-instance-group
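If you script these steps instead of typing them, you can assemble the command line programmatically. The following Python sketch builds the same set-autoscaling invocation as an argument list and only prints it. The helper name set_autoscaling_args and its validation rules are illustrative assumptions, not part of the gcloud CLI; the flags are the ones used in the command above.

```python
import shlex


def set_autoscaling_args(group, max_replicas, target_cpu,
                         min_replicas=None, cool_down_sec=None):
    """Build the argv for the set-autoscaling sub-command.

    Hypothetical helper: it only assembles arguments and never runs gcloud.
    """
    if not 0.0 < target_cpu <= 1.0:
        raise ValueError("target_cpu is a fraction, for example 0.60 for 60%")
    if max_replicas < 1:
        raise ValueError("max_replicas must be at least 1")
    args = [
        "gcloud", "compute", "instance-groups", "managed", "set-autoscaling",
        group,
        "--max-num-replicas", str(max_replicas),
        "--target-cpu-utilization", f"{target_cpu:.2f}",
    ]
    # Optional flags are added only when the caller supplies a value.
    if min_replicas is not None:
        args += ["--min-num-replicas", str(min_replicas)]
    if cool_down_sec is not None:
        args += ["--cool-down-period", str(cool_down_sec)]
    return args


args = set_autoscaling_args("example-managed-instance-group", 20, 0.60,
                            cool_down_sec=90)
print(shlex.join(args))
```

To run the command for real, you could pass the list to subprocess.run(args, check=True), assuming the gcloud CLI is installed and a default zone or region is set.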
Terraform
Before you can enable autoscaling, you must create a MIG. Follow the instructions to create a MIG with VMs confined to a single zone or create a MIG with VMs spread across multiple zones in a region.
To configure autoscaling in a MIG, you can use the google_compute_autoscaler resource.
The following sample configures autoscaling based on CPU utilization in a zonal MIG.
To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.
REST
Before you can enable autoscaling, you must create a MIG with VMs confined to a single zone or create a MIG with VMs spread across multiple zones in a region.
If you have a zonal MIG, make a POST request to the autoscalers.insert method. If you have a regional MIG, use the regionAutoscalers.insert method.
For example:
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers/
Your request body must contain the name, target, and autoscalingPolicy fields. The autoscalingPolicy field must define your target cpuUtilization value and maxNumReplicas value.
Optionally, you can set the minNumReplicas field to indicate the minimum number of VMs that you want in the group. If you don't set the minimum, by default, MIG sets this value to 2.
You can use the coolDownPeriodSec field to set the initialization period, which tells the autoscaler how long it takes for your application to initialize. Specifying an accurate initialization period improves autoscaler decisions. For example, when scaling out, the autoscaler ignores data from VMs that are still initializing because those VMs might not yet represent normal usage of your application. The default initialization period is 60 seconds.
{
  "name": "example-autoscaler",
  "target": "https://www.googleapis.com/compute/v1/projects/myproject/zones/us-central1-f/instanceGroupManagers/example-managed-instance-group",
  "autoscalingPolicy": {
    "maxNumReplicas": 10,
    "cpuUtilization": {
      "utilizationTarget": 0.6
    },
    "coolDownPeriodSec": 90
  }
}
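As a rough illustration of how the URL and the request body fit together, the following Python sketch assembles both for either a zonal or a regional MIG without sending anything. The helper name autoscaler_insert_request and its parameters are hypothetical. The field names come from the request body above, and the regional path (regions/REGION/autoscalers) is assumed to mirror the zonal one for the regionAutoscalers.insert method.

```python
import json

API_ROOT = "https://compute.googleapis.com/compute/v1"


def autoscaler_insert_request(project, location, group, name, max_replicas,
                              target_cpu, regional=False, min_replicas=None,
                              cool_down_sec=None):
    """Return (url, body) for an autoscaler insert request. Nothing is sent."""
    # Zonal MIGs live under zones/ZONE, regional MIGs under regions/REGION.
    scope = "regions" if regional else "zones"
    parent = f"{API_ROOT}/projects/{project}/{scope}/{location}"
    policy = {
        "maxNumReplicas": max_replicas,
        "cpuUtilization": {"utilizationTarget": target_cpu},
    }
    # Optional fields are included only when a value is supplied.
    if min_replicas is not None:
        policy["minNumReplicas"] = min_replicas
    if cool_down_sec is not None:
        policy["coolDownPeriodSec"] = cool_down_sec
    body = {
        "name": name,
        "target": f"{parent}/instanceGroupManagers/{group}",
        "autoscalingPolicy": policy,
    }
    return f"{parent}/autoscalers", body


url, body = autoscaler_insert_request(
    "myproject", "us-central1-f", "example-managed-instance-group",
    "example-autoscaler", max_replicas=10, target_cpu=0.6, cool_down_sec=90)
print("POST", url)
print(json.dumps(body, indent=2))
```

Sending the request additionally requires an OAuth 2.0 access token in the Authorization header, for example one obtained from gcloud auth print-access-token.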
If you want, you can enable predictive autoscaling to scale out ahead of predicted load. To learn whether predictive autoscaling is suitable for your workload, see Scaling based on predictions.
For more information about enabling autoscaling based on CPU utilization, see Scaling based on CPU utilization.
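To build intuition for what a CPU utilization target means, the following Python sketch estimates a group size that would bring average utilization back to the target. It is a simplified proportional model under stated assumptions, not the autoscaler's actual algorithm, and it ignores any other autoscaler behavior. The function recommended_size and the sample numbers are illustrative. VMs that are still within the initialization period are excluded, mirroring the behavior described earlier on this page.

```python
def recommended_size(vm_stats, target_percent, min_replicas, max_replicas,
                     init_period_sec=60):
    """Estimate a group size from per-VM CPU utilization.

    vm_stats: list of (cpu_percent, age_sec) tuples with integer values.
    Simplified model: load of ready VMs divided by the target,
    rounded up and clamped to [min_replicas, max_replicas].
    """
    # Ignore VMs that are still initializing; their usage may not be typical.
    ready = [cpu for cpu, age in vm_stats if age >= init_period_sec]
    if not ready:
        # No usable data yet: keep the current size within the allowed range.
        return max(min_replicas, min(len(vm_stats), max_replicas))
    total_load = sum(ready)
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = -(-total_load // target_percent)
    return max(min_replicas, min(wanted, max_replicas))


busy = [(90, 300)] * 4                 # four warmed-up VMs at 90% CPU
idle = [(10, 300)] * 4                 # four warmed-up VMs at 10% CPU
warming = busy + [(5, 10), (5, 10)]    # two extra VMs still initializing

print(recommended_size(busy, 60, 2, 20))     # 6
print(recommended_size(idle, 60, 2, 20))     # 2 (clamped to the minimum)
print(recommended_size(warming, 60, 2, 20))  # 6 (new VMs are ignored)
```

In this model, a lower target keeps more headroom per VM at the cost of running more VMs, while the minimum and maximum bound the result regardless of load.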
What's next
- Learn more about autoscaling and the different kinds of scaling signals that you can add to an autoscaling policy.
- Read about Managing autoscalers.
- Set up application-based autohealing, which periodically verifies that your application responds as expected on each of the MIG's VMs and automatically recreates unresponsive VMs.
- Learn how to apply a new configuration to all or to a subset of the VMs in a MIG by setting and applying a new instance template, all-instances configuration, or per-instance configuration.
- Learn how to add an external HTTP(S) load balancer frontend to your instance group. For information about other types of load balancers, see the Load balancing overview.