If you use managed instance groups (MIGs), read this document to learn how to create, configure, and delete your MIG's autoscaler.
Before you begin
- If you haven't already, set up authentication. Authentication is the process by which your identity is verified for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine as follows.
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
- Install the Google Cloud CLI, then initialize it by running the following command:
gcloud init
- Set a default region and zone.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI, then initialize it by running the following command:
gcloud init
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Creating an autoscaler
Creating an autoscaler is slightly different depending on which autoscaling policy you want to use. For instructions on creating an autoscaler, see:
- Scaling based on CPU utilization
- Scaling based on load balancing serving capacity
- Scaling based on Cloud Monitoring metrics
- Scaling based on schedules
Getting information about an autoscaler
To get more information about a particular autoscaler, use the console, the gcloud compute instance-groups managed describe sub-command, or the get method for a zonal or regional autoscaler REST resource.
Console
- In the Google Cloud console, go to the Instance groups page.
- Click the name of a MIG from the list to open that group's overview page.
- Click Details to view the group's details, including its autoscaling settings.
gcloud
Use the instance-groups managed describe command:
gcloud compute instance-groups managed describe INSTANCE_GROUP_NAME
If an autoscaler is attached to the group, the command returns details about the autoscaler:
...
autoscaler:
  autoscalingPolicy:
    coolDownPeriodSec: 60
    cpuUtilization:
      utilizationTarget: 0.6
    maxNumReplicas: 20
    minNumReplicas: 10
    mode: ON
    scaleInControl:
      timeWindowSec: 300
      maxScaledInReplicas:
        fixed: 3
        calculated: 3
...
REST
Use the instanceGroupManagers.get method. For a regional MIG, replace zones/ZONE with regions/REGION.
GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/INSTANCE_GROUP_NAME
If an autoscaler is attached to the group, the request returns a link to the autoscaler resource.
200 OK
{
  ...
  "status": {
    ...
    "autoscaler": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-east1-c/autoscalers/example-group"
  },
}
To retrieve details about the autoscaler resource, use the autoscalers.get method for a zonal MIG or the regionAutoscalers.get method for a regional MIG.
GET https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers/example-autoscaler
200 OK
{
  "kind": "compute#autoscaler",
  "id": "8744945839459481093",
  "creationTimestamp": "2018-09-28T13:02:50.553-07:00",
  "name": "example-group",
  "target": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-east1-c/instanceGroupManagers/example-group",
  "autoscalingPolicy": {
    "minNumReplicas": 10,
    "maxNumReplicas": 20,
    "mode": "ON",
    "scaleInControl": {
      "timeWindowSec": 60,
      "maxScaledInReplicas": {
        "calculated": 3,
        "percent": 15
      }
    },
    "coolDownPeriodSec": 60,
    "cpuUtilization": {
      "utilizationTarget": 0.6
    }
  },
  "zone": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-east1-c",
  "selfLink": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-east1-c/autoscalers/example-group",
  "status": "ACTIVE"
}
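The link returned in the MIG's status.autoscaler field tells you both the autoscaler's name and whether it is zonal or regional, which in turn decides which get method applies. The following is a minimal Python sketch of that lookup. The function name parse_autoscaler_link is hypothetical and is not part of any Google Cloud library; it only splits the URL string and makes no API calls.

```python
from urllib.parse import urlparse


def parse_autoscaler_link(link: str) -> dict:
    """Split an autoscaler resource URL into its components.

    Hypothetical helper for illustration; it performs no network requests.
    """
    parts = urlparse(link).path.strip("/").split("/")
    # Expected tail: projects/<project>/(zones|regions)/<location>/autoscalers/<name>
    start = parts.index("projects")
    project, scope, location, collection, name = parts[start + 1 : start + 6]
    if scope not in ("zones", "regions") or collection != "autoscalers":
        raise ValueError(f"not an autoscaler link: {link}")
    # Zonal autoscalers use autoscalers.get; regional ones use regionAutoscalers.get.
    method = "autoscalers.get" if scope == "zones" else "regionAutoscalers.get"
    return {
        "project": project,
        "scope": scope,
        "location": location,
        "name": name,
        "method": method,
    }


info = parse_autoscaler_link(
    "https://www.googleapis.com/compute/v1/projects/my-project"
    "/zones/us-east1-c/autoscalers/example-group"
)
print(info["method"], info["name"])
```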
Updating an autoscaler
When you update an autoscaler, the changes can take a couple of minutes to propagate before your new autoscaler settings are reflected.
Console
- In the Google Cloud console, go to the Instance groups page.
- Click the name of a MIG from the list to open that group's overview page.
- Click Edit to view and update the group's current configuration, including its autoscaling settings.
- Click Save when you are done.
gcloud
Use the update-autoscaling command.
gcloud compute instance-groups managed update-autoscaling INSTANCE_GROUP_NAME \
    --max-num-replicas MAX_NUM ...
For instructions on how to create an autoscaler, see Creating an autoscaler.
REST
To update an autoscaler resource, use the autoscalers.patch method for a zonal MIG or the regionAutoscalers.patch method for a regional MIG.
Provide a request body that contains the new configuration.
PATCH https://compute.googleapis.com/compute/v1/projects/my-project/zones/us-central1-f/autoscalers/example-autoscaler
{
  "autoscalingPolicy": {
    "maxNumReplicas": 20
  }
}
200 OK
{
  "kind": "compute#operation",
  "id": "4244494732310423322",
  "name": "operation-1556912627871-58800f8216ed7-74ab1720-7d360603",
  "zone": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-f",
  "operationType": "compute.autoscalers.patch",
  "targetLink": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-f/autoscalers/example-autoscaler",
  "targetId": "340775527929467142",
  "status": "RUNNING",
  ...
}
When you make a request that modifies data, a zoneOperations or regionOperations resource is returned, and you can query the operation to check the status of your change.
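Checking the operation usually means polling it until its status becomes DONE. The following is a minimal Python sketch of such a loop. The function wait_for_operation and its fetch_operation argument are hypothetical: fetch_operation stands in for whatever code you use to fetch the operation resource and return it as a dictionary, and the timeout and interval defaults are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch_operation: Callable[[], dict],
    timeout_sec: float = 120.0,
    poll_interval_sec: float = 2.0,
) -> dict:
    """Poll an operation until its status is DONE, then return it.

    fetch_operation is a caller-supplied function (an assumption of this
    sketch) that returns the current operation resource as a dict.
    """
    deadline = time.monotonic() + timeout_sec
    while True:
        operation = fetch_operation()
        if operation.get("status") == "DONE":
            # A finished operation that failed carries an "error" field.
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        time.sleep(poll_interval_sec)


# Usage with canned responses in place of real API calls.
responses = iter([{"status": "RUNNING"}, {"status": "DONE", "name": "op-1"}])
print(wait_for_operation(lambda: next(responses), poll_interval_sec=0)["name"])
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client or authentication setup.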
Using predictive autoscaling
Predictive autoscaling uses historical data to scale out your group ahead of anticipated load. It works best if your workload meets the following criteria:
- Your application takes a long time to initialize, for example, if you configure an initialization period of more than 2 minutes.
- Your workload varies predictably with daily or weekly cycles.
For more information, see Scaling based on predictions.
Turning off or restricting an autoscaler
Turn off an autoscaler to temporarily prevent it from scaling your MIG, or restrict your autoscaler so that it can only scale out your MIG. This feature is useful when you want to:
- Investigate VM instances without interference from scaling in.
- Reconfigure multiple properties of your MIG without scaling actions being triggered while your group is only partially reconfigured.
- Maintain MIG capacity for a fast rollback while redirecting a workload to a new MIG.
- Enable predictive autoscaling later. Predictive autoscaling requires an autoscaling policy in order to start gathering load history on which to base predictions. The autoscaler detects this history even when its mode is set to OFF.
When you re-enable the autoscaler, it automatically returns to normal operation.
Use the instructions provided in this section to set the autoscaler's mode. The following modes are available:
- Off: Temporarily disables autoscaling. Use this mode to prevent automatic changes of the MIG's size. The autoscaling configuration remains intact so you can re-enable autoscaling later.
- Only scale out: Restricts autoscaling to adding new VM instances. Use this mode to protect the group from shrinking while still allowing it to provision extra VMs when load increases.
- On: Enables all autoscaling operations according to the autoscaling policy.
Console
- In the Google Cloud console, go to the Instance groups page.
- Click the name of a MIG from the list to open that group's overview page.
- Click Edit to view the group's current configuration, including its autoscaling settings.
- Under Autoscaling, set the Autoscaling mode to disable or restrict autoscaling for the group, or to turn the autoscaler back on.
- Click Save when you are done.
gcloud
To disable, restrict, or re-enable an autoscaler, use the update-autoscaling command with the --mode flag.
gcloud compute instance-groups managed update-autoscaling INSTANCE_GROUP_NAME \
    --mode MODE
Replace the following:
MODE: one of the following:
- off to disable the autoscaler but maintain its configuration
- only-scale-out to restrict the autoscaler to adding VM instances only
- on to re-enable all autoscaler activities according to its policy
REST
To update the mode of an autoscaler resource, use the autoscalers.patch method for a zonal MIG or the regionAutoscalers.patch method for a regional MIG.
Provide a request body that includes the autoscalingPolicy.mode property.
PATCH https://compute.googleapis.com/compute/v1/projects/my-project/zones/us-central1-f/autoscalers?autoscaler=my-autoscaler
{
  "autoscalingPolicy": {
    "mode": "MODE"
  }
}
Replace the following:
MODE: one of the following:
- OFF to disable the autoscaler but maintain its configuration
- ONLY_SCALE_OUT to restrict the autoscaler to adding instances only
- ON to re-enable all autoscaler activities according to its policy
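The gcloud --mode values and the REST autoscalingPolicy.mode values listed above name the same three modes in different spellings. The following Python sketch maps one to the other and builds the PATCH request body. The names GCLOUD_TO_REST_MODE and mode_patch_body are hypothetical helpers for illustration, and the sketch only produces the JSON string; it does not send a request.

```python
import json

# Mapping between the gcloud --mode flag values and the REST enum values,
# as listed in the gcloud and REST instructions.
GCLOUD_TO_REST_MODE = {
    "off": "OFF",
    "only-scale-out": "ONLY_SCALE_OUT",
    "on": "ON",
}


def mode_patch_body(gcloud_mode: str) -> str:
    """Return the JSON body for a PATCH that sets autoscalingPolicy.mode."""
    try:
        rest_mode = GCLOUD_TO_REST_MODE[gcloud_mode]
    except KeyError:
        raise ValueError(f"unknown mode: {gcloud_mode!r}") from None
    return json.dumps({"autoscalingPolicy": {"mode": rest_mode}})


print(mode_patch_body("only-scale-out"))
```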
When you set the autoscaling mode to ONLY_SCALE_OUT, the autoscaler behaves as follows:
- The autoscaler does not decrease the MIG's targetSize value regardless of decreases in load or changes to the autoscaler configuration.
- If you manually change the target size of a zonal MIG, the autoscaler overrides your manually provided size if it is smaller than the autoscaler's recommended size.
- You cannot manually change the target size of a regional MIG.
- If you set the autoscalingPolicy.maxNumReplicas field to a lower value than the group's current targetSize value while the autoscaler's mode is set to ONLY_SCALE_OUT, the autoscaler does not reduce the number of instances in the group. As usual, the autoscaler continuously recomputes the group's recommended size and might decrease the group's recommended size to comply with the new maximum, but the group is not scaled in.
- The autoscalers.status field reports a warning: "Autoscaling operates in a restricted mode: ONLY_SCALE_OUT."
When you set the autoscaler's mode to OFF, the autoscaler behaves as follows:
- The autoscaler does not change the MIG's targetSize value in response to changes in load or in autoscaler configuration. As usual, the autoscaler continuously recomputes the group's recommended size and might decrease the group's recommended size to comply with the new maximum, but the group is not scaled in.
- You can manually change the target size of a zonal or a regional MIG. The minNumReplicas and maxNumReplicas values of the autoscaling policy do not affect the size you set.
- If you turn off autoscaling for a regional MIG in which proactive instance redistribution is enabled, and if the MIG has an uneven distribution of instances across zones, then the group proactively deletes or creates instances in its zones to reestablish an even distribution.
- The autoscalers.status field reports a warning: "Autoscaling operates in a restricted mode: OFF."
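The effect of the three modes on a group's target size can be summarized in a small model. The following Python sketch is a simplification of the behavior described in this section, not the autoscaler's actual implementation: it ignores regional redistribution and scale-in controls, and it treats the recommended size as a given input. The function name next_target_size is hypothetical.

```python
def next_target_size(mode: str, current_target: int, recommended: int) -> int:
    """Return the target size the group ends up with under a simplified model.

    mode uses the REST values: "ON", "ONLY_SCALE_OUT", or "OFF".
    """
    if mode == "OFF":
        # The autoscaler keeps computing a recommendation but never applies it.
        return current_target
    if mode == "ONLY_SCALE_OUT":
        # The group can grow to the recommendation but is never scaled in.
        return max(current_target, recommended)
    if mode == "ON":
        # Normal operation: the recommendation is applied in both directions.
        return recommended
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("ON", "ONLY_SCALE_OUT", "OFF"):
    print(mode, next_target_size(mode, current_target=12, recommended=8))
```

With a current size of 12 and a recommendation of 8, only the ON mode shrinks the group in this model.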
Controlling the scale-in rate of an autoscaler
If your workloads take many minutes to initialize, configure scale-in controls to reduce the risk of response latency and outages due to abrupt scale-in events. Specifically, if you routinely expect a load spike to follow soon after a decline in load, you can limit the scale-in rate. Limiting the scale-in rate prevents the autoscaler from reducing a MIG's size by more VM instances than your workload can tolerate losing.
Configuring scale-in controls
Configuring scale-in controls is optional. By default, scale-in controls are not configured. When not configured, the autoscaler still relies on its default stabilization mechanism. That is, it maintains the recommended size at a level required to serve peak load, observed during the stabilization period.
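The default stabilization mechanism can be pictured as a sliding-window maximum over the sizes that recent load required. The following Python sketch illustrates that idea only. The function stabilized_sizes is hypothetical, and expressing the window as a count of samples is an assumption of this sketch; the actual length of the stabilization period and how the autoscaler samples load are not described here.

```python
from collections import deque


def stabilized_sizes(required_sizes: list[int], window: int) -> list[int]:
    """For each sample, return the peak required size over the last `window` samples."""
    recent: deque[int] = deque(maxlen=window)
    recommended = []
    for size in required_sizes:
        recent.append(size)  # the oldest sample drops out automatically
        recommended.append(max(recent))
    return recommended


# Load peaks at 8 VMs, then declines; the recommendation follows with a delay.
print(stabilized_sizes([5, 8, 6, 4, 3, 3], window=3))  # [5, 8, 8, 8, 6, 4]
```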
Console
To configure scale-in controls for an autoscaled MIG:
- In the Google Cloud console, go to the Instance groups page.
- Click the name of an autoscaled MIG from the list to open that group's overview page.
- Click Edit to view the group's current configuration, including its autoscaling settings.
- Under Autoscaling, click Scale-in controls, then select Enable scale-in controls.
- Under Don't scale in by more than, specify the maximum number or percent of instances that can be removed from the group at a time.
- Under Over the course of, specify how often instances can be removed from the group.
- Click Save.
gcloud
You can configure scale-in controls when creating an autoscaler or when updating an autoscaler.
Configuring scale-in controls when creating an autoscaler
Set scale-in controls when creating an autoscaler for a MIG by using the --scale-in-control flag with the gcloud compute instance-groups managed set-autoscaling command.
For example, use the following command to configure autoscaling with scale-in controls for a MIG:
gcloud compute instance-groups managed set-autoscaling INSTANCE_GROUP_NAME \
    --target-cpu-utilization 0.6 \
    --max-num-replicas 50 \
    --scale-in-control max-scaled-in-replicas=MAX_SCALE_IN_REPLICAS,time-window=TIME_WINDOW
Configuring scale-in controls when updating an autoscaler
Update scale-in controls in a MIG's existing autoscaler by using the --scale-in-control flag with the gcloud compute instance-groups managed update-autoscaling command.
For example, use the following command to set scale-in controls in the existing autoscaling configuration of a MIG:
gcloud compute instance-groups managed update-autoscaling INSTANCE_GROUP_NAME \
    --scale-in-control max-scaled-in-replicas=MAX_SCALE_IN_REPLICAS,time-window=TIME_WINDOW
Replace the following:
- INSTANCE_GROUP_NAME: the name of the MIG to update.
- MAX_SCALE_IN_REPLICAS: the maximum number of VMs allowed to be deducted from the peak size, taken from the specified trailing time window. The specified number of VM instances can be scaled in all at once, so your service should be able to afford losing this many VMs all at once. You can specify either a number of VMs or a percentage. Use the % sign for percentages; for example: 50%.
- TIME_WINDOW: trailing time window to take the peak size from. Autoscaling won't scale in by more than the maximum allowed number of replicas from the peak size taken during this trailing time window. Specify this value in seconds within a [60, 3600] interval.
For example, say you set the time window to 1800 seconds (30 minutes). When calculating the current recommended size for the MIG, the autoscaler uses the following logic:
- Take the peak size from the last 30 minutes (for example, 100 VMs)
- Take max-scaled-in-replicas (for example, 10 VMs)
- Set the lower bound of the recommended size to: peak size minus max-scaled-in-replicas (100 - 10 = 90 VMs)
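The lower-bound logic above can be written as a short calculation. The following Python sketch accepts the same value forms as the max-scaled-in-replicas setting, either a number such as 10 or a percentage such as 50%. The function names are hypothetical, and two details are assumptions of this sketch rather than documented behavior: the percentage is applied to the peak size, and fractional results are rounded down.

```python
def scale_in_floor(peak_size: int, max_scaled_in: str) -> int:
    """Return the lowest recommended size allowed by scale-in controls.

    max_scaled_in is a count ("10") or a percentage ("50%").
    """
    if max_scaled_in.endswith("%"):
        percent = int(max_scaled_in[:-1])
        # Assumption: percentage of the peak size, rounded down.
        allowed = peak_size * percent // 100
    else:
        allowed = int(max_scaled_in)
    return max(peak_size - allowed, 0)


def bounded_recommendation(raw_recommended: int, peak_size: int, max_scaled_in: str) -> int:
    """Clamp a load-based recommendation so it never drops below the floor."""
    return max(raw_recommended, scale_in_floor(peak_size, max_scaled_in))


# Peak of 100 VMs in the window and at most 10 VMs removed: floor is 90.
print(scale_in_floor(100, "10"))              # 90
print(bounded_recommendation(60, 100, "10"))  # 90
```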
REST
Configure scale-in controls by setting the maxScaledInReplicas and timeWindowSec fields within the autoscalingPolicy.scaleInControl structure in a zonal or regional autoscaler resource. These fields have no default values, so you must provide values for both.
You can configure scale-in controls when creating an autoscaler or when updating an autoscaler.
Configuring scale-in controls when creating an autoscaler
For a zonal MIG, use the autoscalers.insert method. For a regional MIG, use the regionAutoscalers.insert method.
POST https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/autoscalers
{
  "name": "AUTOSCALER_NAME",
  "target": "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/instanceGroupManagers/INSTANCE_GROUP_NAME",
  "autoscalingPolicy": {
    "minNumReplicas": 1,
    "maxNumReplicas": 5,
    "coolDownPeriodSec": 60,
    "cpuUtilization": {
      "utilizationTarget": 0.8
    },
    "scaleInControl": {
      "maxScaledInReplicas": {
        "fixed": MAX_SCALE_IN_REPLICAS
      },
      "timeWindowSec": TIME_WINDOW
    }
  }
}
For more information about creating an autoscaler, see Creating an autoscaler.
Configuring scale-in controls when updating an autoscaler
For a zonal MIG, use the autoscalers.patch method. For a regional MIG, use the regionAutoscalers.patch method.
PATCH https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/autoscalers?autoscaler=AUTOSCALER_NAME
{
  "autoscalingPolicy": {
    "minNumReplicas": 1,
    "maxNumReplicas": 5,
    "coolDownPeriodSec": 60,
    "cpuUtilization": {
      "utilizationTarget": 0.8
    },
    "scaleInControl": {
      "maxScaledInReplicas": {
        "fixed": MAX_SCALE_IN_REPLICAS
      },
      "timeWindowSec": TIME_WINDOW
    }
  }
}
Replace the following:
- AUTOSCALER_NAME: the name of the autoscaler to create. You can name your autoscaler after the MIG that will use it or name it something else.
- INSTANCE_GROUP_NAME: the name of the MIG to add the autoscaler to. For a regional MIG, replace zones/ZONE with regions/REGION.
- MAX_SCALE_IN_REPLICAS: the maximum number of VMs allowed to be deducted from the peak recommended target size, taken from the specified trailing time window. The specified number of VM instances can be scaled in all at once, so your service should be able to afford to lose this many VMs all at once. You can specify either a number of VMs or a percentage. Use the maxScaledInReplicas.percent field to specify a percent value.
- TIME_WINDOW: the trailing time window to take the peak recommended size from. Autoscaling won't scale in by more than the maximum allowed number of replicas from the peak recommended size taken during this trailing time window. Specify this value in seconds within a [60, 3600] interval; for example: 1800.
For example, say you set the time window to 1800 seconds (30 minutes). When calculating the current recommended size for the MIG, the autoscaler uses the following logic:
- Take the peak size from the last 30 minutes (for example, 100 VMs)
- Take max-scaled-in-replicas (for example, 10 VMs)
- Set the lower bound of the recommended size to: peak size minus max-scaled-in-replicas (100 - 10 = 90 VMs)
For more information about how scale-in controls work, see Understanding autoscaler decisions.
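When you assemble the scaleInControl structure programmatically, it can help to validate the two required fields before sending the request. The following Python sketch builds the structure with either a fixed or a percent value and checks that the time window falls in the [60, 3600] interval. The function name scale_in_control is hypothetical; the field names fixed, percent, and timeWindowSec are the ones that appear in the request and response examples on this page.

```python
import json
from typing import Optional


def scale_in_control(
    time_window_sec: int,
    *,
    fixed: Optional[int] = None,
    percent: Optional[int] = None,
) -> dict:
    """Build the autoscalingPolicy.scaleInControl structure."""
    if not 60 <= time_window_sec <= 3600:
        raise ValueError("time_window_sec must be within [60, 3600]")
    if (fixed is None) == (percent is None):
        raise ValueError("provide exactly one of fixed or percent")
    replicas = {"fixed": fixed} if fixed is not None else {"percent": percent}
    return {"maxScaledInReplicas": replicas, "timeWindowSec": time_window_sec}


body = {"autoscalingPolicy": {"scaleInControl": scale_in_control(1800, fixed=10)}}
print(json.dumps(body, indent=2))
```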
Getting current configuration of scale-in controls
To get the current configuration of scale-in controls, see Getting information about an autoscaler.
Removing scale-in controls
You can remove scale-in controls to lift restrictions on the timing and magnitude of scale-in operations by using the Google Cloud console, the Google Cloud CLI, or the Compute Engine API.
Without scale-in controls, the autoscaler still relies on its default stabilization mechanism. Specifically, it maintains a recommended size at a level required to serve peak load, observed during the stabilization period.
Console
To remove scale-in controls for an autoscaled MIG:
- In the Google Cloud console, go to the Instance groups page.
- Click the name of an autoscaled MIG from the list to open that group's overview page.
- Click Edit to view the group's current configuration, including its autoscaling settings.
- Under Autoscaling, click Scale-in controls, then clear the Enable scale-in controls checkbox.
- Click Save.
gcloud
Remove scale-in controls by using the --clear-scale-in-control flag with the gcloud compute instance-groups managed update-autoscaling command.
For example, use the following command to remove scale-in controls from the autoscaling configuration for example-group:
gcloud compute instance-groups managed update-autoscaling example-group \
    --clear-scale-in-control
REST
To remove scale-in controls, use the autoscalers.patch method for a zonal MIG or the regionAutoscalers.patch method for a regional MIG, and provide an empty configuration for scale-in controls.
PATCH https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers?autoscaler=AUTOSCALER_NAME
{
  "autoscalingPolicy": {
    "scaleInControl": null
  }
}
Replace the following:
- AUTOSCALER_NAME: the name of the autoscaler to update. To get a list of existing autoscalers and their target MIGs, use the autoscalers.aggregatedList method.
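If you build the request body in code, note that the scaleInControl field must be sent with an explicit null value rather than left out. The following Python sketch shows one way to produce that body with the standard json module, which serializes None as null. The function name clear_scale_in_control_body is hypothetical.

```python
import json


def clear_scale_in_control_body() -> str:
    """Return the JSON body that clears scale-in controls in a PATCH request."""
    # json.dumps writes Python's None as JSON null.
    return json.dumps({"autoscalingPolicy": {"scaleInControl": None}})


print(clear_scale_in_control_body())
```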
Deleting an autoscaler
You can permanently delete your autoscaler resource and its history. If you want to temporarily stop autoscaling and keep your autoscaler resource, its configuration, and its history, disable your autoscaler instead.
Console
- In the Google Cloud console, go to the Instance groups page.
- Click the name of a MIG from the list to open that group's overview page.
- Click Edit to view the group's current configuration, including its autoscaling settings.
- Under Autoscaling, from the Autoscaling mode drop-down list, select Delete autoscaling configuration to stop the autoscaler and delete its configuration.
- Click Save when you are done.
gcloud
Use the stop-autoscaling command to stop an autoscaler and delete its configuration.
gcloud compute instance-groups managed stop-autoscaling INSTANCE_GROUP_NAME
Stopping an autoscaler deletes it from the MIG. If you want to restart the autoscaler, you must recreate it by using the set-autoscaling command.
If you delete a MIG using the gcloud CLI, any autoscalers attached to the MIG are also deleted.
REST
To stop an autoscaler and delete its configuration, use the autoscalers.delete method for a zonal MIG or the regionAutoscalers.delete method for a regional MIG.
DELETE https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers/AUTOSCALER_NAME
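The zonal and regional methods differ only in the location segment of the resource URL. The following Python sketch builds the autoscaler URL for either case, following the URL pattern used in the request above. The function name autoscaler_url is hypothetical, and the sketch only formats a string; it does not send the DELETE request.

```python
from typing import Optional

BASE_URL = "https://compute.googleapis.com/compute/v1"


def autoscaler_url(
    project: str,
    name: str,
    *,
    zone: Optional[str] = None,
    region: Optional[str] = None,
) -> str:
    """Return the resource URL of a zonal or regional autoscaler."""
    if (zone is None) == (region is None):
        raise ValueError("provide exactly one of zone or region")
    scope = f"zones/{zone}" if zone is not None else f"regions/{region}"
    return f"{BASE_URL}/projects/{project}/{scope}/autoscalers/{name}"


print(autoscaler_url("my-project", "example-autoscaler", zone="us-central1-f"))
print(autoscaler_url("my-project", "example-autoscaler", region="us-central1"))
```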
Feedback
We want to learn about your use cases, challenges, and feedback about autoscaling. Share your feedback with our team at mig-discuss@google.com.
What's next
- Learn how autoscalers make decisions.
- Learn how to use multiple autoscaling signals to scale your group.