Scale based on Monitoring metrics

Autoscaling based on Cloud Monitoring metrics lets you adjust the capacity needed according to measurements from your app. When you autoscale a MIG based on a metric, the autoscaler creates VMs when the metric value increases and deletes VMs when the value decreases.

For example, you can define how many VMs you need per user count, latency, or the number of messages in a Pub/Sub subscription. You can use either the built-in metrics provided by the Monitoring service, or the custom metrics that you export from your application.

This document describes how to autoscale a managed instance group (MIG) based on Monitoring metrics.

You can also autoscale a MIG based on CPU utilization, serving capacity of an external HTTP(S) load balancer, or schedules.

Before you begin

Limitations

Scaling based on Monitoring metrics is subject to the limitations for all autoscalers as well as the following limitations:

  • You can configure autoscaling based on up to 5 Monitoring metrics per MIG.
  • You cannot autoscale based on logs-based metrics.

Configure autoscaling based on Monitoring metrics

You can use a Monitoring metric value for autoscaling in two different ways:

  • Utilization target: If you want the autoscaler to maintain a metric at a specific value, configure a utilization target. The autoscaler creates VMs when the metric value is above the target and deletes VMs when the metric value is below the target. This is useful for metrics like network traffic, memory/disk usage, or average latency of your application. The following diagram shows how an autoscaler adds and removes VMs in response to a metric value to maintain a utilization target.

    Autoscaler adding and removing VMs to maintain a utilization target.

  • Single instance assignment: If you want to autoscale based on how much work is available to assign to each VM, configure a single instance assignment. The single instance assignment that you specify represents how much work you expect each VM to handle. The autoscaler divides the metric value by the single instance assignment value to calculate how many VMs are needed. For example, if the metric is equal to 100 and the single instance assignment is 5, then the autoscaler creates 20 VMs in the MIG. This is useful for metrics that reflect the amount of work like Pub/Sub queue length or batch jobs count. The following diagram shows the proportional relationship between the metric value and the number of VMs when scaling with single instance assignment.

    The proportional relationship between metric value and number of instances.
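The division described for single instance assignment can be sketched as follows. This is a minimal illustration, not the autoscaler's implementation: the function name `vms_for_assignment` is hypothetical, and rounding a fractional result up to a whole VM is an assumption that the text above does not state.

```python
import math


def vms_for_assignment(metric_value: float, single_instance_assignment: float) -> int:
    """Estimate the VM count as the metric value divided by the work per VM.

    Rounding up to a whole VM is an assumption made for this sketch.
    """
    if single_instance_assignment <= 0:
        raise ValueError("single instance assignment must be positive")
    if metric_value < 0:
        raise ValueError("metric value must not be negative")
    return math.ceil(metric_value / single_instance_assignment)


# The example from the text: a metric value of 100 with an assignment of 5.
print(vms_for_assignment(100, 5))  # 20
```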

Autoscale to maintain a metric at a target value

When you want to maintain a metric at a target value, specify a utilization target. The autoscaler creates VMs when the metric value is above the target and deletes VMs when the metric value is below the target.

  • If the metric comes from each VM in your MIG, then the autoscaler takes the average metric value across all VMs in the MIG and compares it with the utilization target. For example, if you want to autoscale using the tcp_connections metric that gives the number of TCP connections on a VM, then the autoscaler takes an average number of TCP connections across all VMs in the MIG to compare with the target. When you use such metrics that originate from a VM, the MIG cannot scale in to 0 because the autoscaler requires at least one VM to publish a metric value.

  • If the metric applies to the whole MIG and does not come from the VMs in your MIG, then the autoscaler compares the metric value with the utilization target. For example, you can use a custom metric that measures the latency of your application.
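The two cases above differ only in which value is compared with the target. The sketch below illustrates that comparison under simplifying assumptions; it is not the autoscaler's algorithm, and `scaling_direction` and its sample inputs are hypothetical.

```python
from statistics import mean
from typing import Sequence


def scaling_direction(values: Sequence[float], utilization_target: float) -> str:
    """Compare a metric with the utilization target.

    For a per-VM metric, pass one value per VM; the values are averaged.
    For a group-level metric, pass a single value.
    """
    if not values:
        # Mirrors the text: a per-VM metric needs at least one VM publishing it.
        raise ValueError("at least one metric value is required")
    observed = mean(values)
    if observed > utilization_target:
        return "scale out"
    if observed < utilization_target:
        return "scale in"
    return "hold"


# Per-VM metric: TCP connections reported by three VMs, target of 100 per VM.
print(scaling_direction([120, 90, 150], 100))  # scale out (the average is 120)
# Group-level metric: one latency value for the whole MIG, target of 250.
print(scaling_direction([180], 250))  # scale in
```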

When your metric has multiple values, apply a filter to autoscale using an individual value from the metric. For more details about metric filters and other fields that you can use in your configuration, see Monitoring metrics concepts.

Configure autoscaling using the Google Cloud console, the gcloud CLI, or the Compute Engine API.

Console

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. If you do not have a managed instance group, create one. Otherwise, click the name of a MIG from the list to open the instance group overview page.

  3. On the instance group overview page, click Edit.

  4. If no autoscaling configuration exists:

    1. Under Autoscaling, click Configure autoscaling.
    2. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
  5. Under Autoscaling, in the Autoscaling metrics section, if a metric exists, click it to edit it, or click Add metric to add another metric.

  6. Set the Metric type to Cloud Monitoring metric.

  7. In the Metric export scope field:

    1. If you want to scale using a metric that comes from each VM in the MIG, select Time series per instance. For this scope, the Monitored resource type is gce_instance.
    2. If you want to scale using a metric that is not specific to individual VMs, select Single time series per group.

    The Monitoring service records the monitored data as a time series, which includes a set of time-stamped metric values, and information about the metric and its monitored resource.

  8. In the Metric identifier, enter the metric name in the following format: example.googleapis.com/path/to/metric. If you use a custom metric, it must meet the custom metric requirements.

  9. If the source of the metric that you use is a VM, the Monitored resource type field shows gce_instance by default. For a metric that is not specific to a VM, you must specify the monitored resource type.

  10. If your metric has multiple values, you can provide an Additional filter expression to use an individual value for autoscaling. To learn more about metric filters, see Monitoring metrics concepts.

  11. If your metric export scope is Single time series per group, make sure that the Scaling policy is selected as Utilization target.

  12. In the Utilization target field, specify the value that the autoscaler must maintain. This must be a positive number. For example, both 24.5 and 100 are acceptable values.

  13. In the Utilization target type field, verify that the target type corresponds to the metric's kind of measurement. For accurate comparisons, if the utilization target is expressed per second, use DELTA_PER_SECOND as the target type. Likewise, use DELTA_PER_MINUTE for a utilization target expressed per minute.

    • Gauge: The autoscaler computes the average value of the data collected in the last couple of minutes and compares that to the utilization target.
    • Delta / min: The autoscaler calculates the average rate of growth per minute and compares that to the utilization target.
    • Delta / second: The autoscaler calculates the average rate of growth per second and compares that to the utilization target.
  14. When you are finished adding the metric, click Done.

  15. To close the Instance groups page, click Save.
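The three target types describe how raw samples become the value that is compared with the target. The following is a rough sketch, assuming samples arrive as (timestamp in seconds, value) pairs and that a delta metric is read as a cumulative counter. The function names and sample data are hypothetical, and the autoscaler's actual sampling window is not specified here.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, metric value)


def gauge_value(samples: List[Sample]) -> float:
    """Gauge: average of the values collected in the window."""
    return sum(value for _, value in samples) / len(samples)


def delta_per_second(samples: List[Sample]) -> float:
    """Delta / second: average rate of growth between first and last sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def delta_per_minute(samples: List[Sample]) -> float:
    """Delta / minute: the same rate expressed per minute."""
    return delta_per_second(samples) * 60


# Hypothetical counter of received bytes, sampled once a minute.
counter = [(0, 0), (60, 6000), (120, 12000)]
print(delta_per_second(counter))  # 100.0
print(delta_per_minute(counter))  # 6000.0
# Hypothetical gauge readings, for example percent of memory used.
print(gauge_value([(0, 40.0), (60, 50.0), (120, 60.0)]))  # 50.0
```

In this sketch the same traffic corresponds to a target of 100 per second or 6000 per minute, which is why the target value and the target type need to use the same time unit.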

gcloud

To configure autoscaling based on Monitoring metrics, use the set-autoscaling command.

Use the following command to autoscale based on a Monitoring metric with a utilization target.

gcloud compute instance-groups managed set-autoscaling MIG_NAME \
  --max-num-replicas=MAX_INSTANCES \
  --min-num-replicas=MIN_INSTANCES \
  --update-stackdriver-metric=METRIC_URL \
  --stackdriver-metric-utilization-target=TARGET_VALUE \
  --stackdriver-metric-utilization-target-type=TARGET_TYPE

If your metric has multiple values and you want to use an individual value for autoscaling, then use the --stackdriver-metric-filter flag as given in the following command.

gcloud compute instance-groups managed set-autoscaling MIG_NAME \
  --max-num-replicas=MAX_INSTANCES \
  --min-num-replicas=MIN_INSTANCES \
  --update-stackdriver-metric=METRIC_URL \
  --stackdriver-metric-utilization-target=TARGET_VALUE \
  --stackdriver-metric-utilization-target-type=TARGET_TYPE \
  --stackdriver-metric-filter="METRIC_FILTER"

Replace the following:

  • MIG_NAME: the MIG in which you want to add an autoscaler.
  • MAX_INSTANCES: the maximum number of VMs that the MIG can have.
  • MIN_INSTANCES: the minimum number of VMs that the MIG needs to have.
  • METRIC_URL: a protocol-free URL of a Monitoring metric. For example, 'compute.googleapis.com/instance/uptime'. If you use a custom metric, it must meet the custom metric requirements.
  • TARGET_VALUE: the metric value that the autoscaler attempts to maintain.
  • TARGET_TYPE: the value type for the metric.
    • gauge: the autoscaler computes the average value of the data collected in the last couple of minutes and compares that to the utilization target.
    • delta-per-minute: the autoscaler calculates the average rate of growth per minute and compares that to the utilization target.
    • delta-per-second: the autoscaler calculates the average rate of growth per second and compares that to the utilization target. For accurate comparisons, if you set the utilization target in seconds, use delta-per-second as the target type. Likewise, use delta-per-minute for a utilization target in minutes.
  • METRIC_FILTER: a filter that selects an individual value from a metric that has multiple values and that specifies the monitored resource type. If you use a metric that comes from each VM, you do not have to specify the monitored resource type because gce_instance is used by default. For other metrics, use resource.type in the filter expression to specify the monitored resource. To learn more about metric filters, see Monitoring metrics concepts.
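Filter expressions contain double quotes, so they need care when embedded in a shell string or a JSON body. As an illustration only, the following sketch assembles a filter from a resource type and label equality conditions. `build_metric_filter` is a hypothetical helper, not part of the gcloud CLI or any Google client library, and it covers only the `=` and `AND` forms; the subscription name is a made-up value.

```python
from typing import Mapping, Optional


def build_metric_filter(labels: Mapping[str, str],
                        resource_type: Optional[str] = None) -> str:
    """Join equality selectors with AND, wrapping each value in double quotes."""
    def quote(value: str) -> str:
        # Escaping rules are out of scope for this sketch, so reject quotes.
        if '"' in value:
            raise ValueError("values containing double quotes are not supported")
        return f'"{value}"'

    parts = []
    if resource_type is not None:
        parts.append(f"resource.type = {quote(resource_type)}")
    for selector, value in labels.items():
        parts.append(f"{selector} = {quote(value)}")
    return " AND ".join(parts)


print(build_metric_filter(
    {"resource.labels.subscription_id": "our-subscription"},
    resource_type="pubsub_subscription",
))
# resource.type = "pubsub_subscription" AND resource.labels.subscription_id = "our-subscription"
```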

To see a full list of available commands and flags for the gcloud CLI, see the gcloud reference.

API

To configure autoscaling based on Monitoring metrics for a zonal MIG, use the autoscalers resource or, for a regional MIG, use the regionAutoscalers resource.

Make the following call to autoscale a zonal MIG based on a Monitoring metric with a utilization target.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers
{
 "name": "AUTOSCALER_NAME",
 "target": "zones/ZONE/instanceGroupManagers/MIG_NAME",
 "autoscalingPolicy": {
  "maxNumReplicas": MAX_INSTANCES,
  "minNumReplicas": MIN_INSTANCES,
  "customMetricUtilizations": [
    {
      "metric": "METRIC_URL",
      "utilizationTarget": TARGET_VALUE,
      "utilizationTargetType": "TARGET_TYPE"
    }
  ]
 }
}

If your metric has multiple values and you want to use an individual value for autoscaling, then use the filter parameter as given in the following API call.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers
{
 "name": "AUTOSCALER_NAME",
 "target": "zones/ZONE/instanceGroupManagers/MIG_NAME",
 "autoscalingPolicy": {
  "maxNumReplicas": MAX_INSTANCES,
  "minNumReplicas": MIN_INSTANCES,
  "customMetricUtilizations": [
    {
      "metric": "METRIC_URL",
      "utilizationTarget": TARGET_VALUE,
      "utilizationTargetType": "TARGET_TYPE",
      "filter": "METRIC_FILTER"
    }
  ]
 }
}

Replace the following:

  • PROJECT_ID: your project ID.
  • ZONE: the zone where the MIG is located.
  • AUTOSCALER_NAME: the name of the autoscaler.
  • MIG_NAME: the MIG in which you want to add an autoscaler.
  • MAX_INSTANCES: the maximum number of VMs that the MIG can have.
  • MIN_INSTANCES: the minimum number of VMs that the MIG needs to have.
  • METRIC_URL: a protocol-free URL of a Monitoring metric. For example, 'compute.googleapis.com/instance/uptime'. If you use a custom metric, it must meet the custom metric requirements.
  • TARGET_VALUE: the metric value that the autoscaler attempts to maintain.
  • TARGET_TYPE: the value type for the metric.
    • GAUGE: The autoscaler computes the average value of the data collected in the last couple of minutes and compares that to the utilization target.
    • DELTA_PER_SECOND: The autoscaler calculates the average rate of growth per second and compares that to the utilization target.
    • DELTA_PER_MINUTE: The autoscaler calculates the average rate of growth per minute and compares that to the utilization target. For accurate comparisons, if you set the utilization target in seconds, use DELTA_PER_SECOND as the target type. Likewise, use DELTA_PER_MINUTE for a utilization target in minutes.
  • METRIC_FILTER: a filter that selects an individual value from a metric that has multiple values and that specifies the monitored resource type. If you use a metric that comes from each VM, you do not have to specify the monitored resource type because gce_instance is used by default. For other metrics, you must specify the monitored resource by using the resource.type selector. To learn more about metric filters, see Monitoring metrics concepts.
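The request body shown above can also be assembled programmatically. The following sketch only builds and prints the JSON; it does not send a request. The helper name `autoscaler_body` and the example values are hypothetical, while the field names are the ones used in the API call above.

```python
import json
from typing import Optional

VALID_TARGET_TYPES = {"GAUGE", "DELTA_PER_SECOND", "DELTA_PER_MINUTE"}


def autoscaler_body(name: str, zone: str, mig_name: str,
                    min_instances: int, max_instances: int,
                    metric: str, target_value: float, target_type: str,
                    metric_filter: Optional[str] = None) -> dict:
    """Build the body for a zonal autoscaler that uses a utilization target."""
    if target_type not in VALID_TARGET_TYPES:
        raise ValueError(f"unknown target type: {target_type}")
    if target_value <= 0:
        raise ValueError("utilization target must be a positive number")
    utilization = {
        "metric": metric,
        "utilizationTarget": target_value,
        "utilizationTargetType": target_type,
    }
    if metric_filter:
        # The filter is optional; include it only when one is provided.
        utilization["filter"] = metric_filter
    return {
        "name": name,
        "target": f"zones/{zone}/instanceGroupManagers/{mig_name}",
        "autoscalingPolicy": {
            "minNumReplicas": min_instances,
            "maxNumReplicas": max_instances,
            "customMetricUtilizations": [utilization],
        },
    }


body = autoscaler_body(
    name="example-autoscaler", zone="us-central1-a", mig_name="example-mig",
    min_instances=1, max_instances=10,
    metric="custom.googleapis.com/example_average_latency",
    target_value=250, target_type="GAUGE",
)
print(json.dumps(body, indent=2))
```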

Autoscale based on work available for each VM in a MIG

When you want to autoscale based on the quantity of work that is available for each VM in a MIG, specify a single instance assignment. The value of the single instance assignment that you set indicates how much work you expect each VM to handle.

A metric value of 0 indicates that there is no work for your MIG to complete. If your MIG's minimum number of instances is set to 0 and your metric value drops to 0, then the MIG scales in to 0 until the metric value increases.

When your metric has multiple values, apply a filter to autoscale using an individual value from the metric. For more details about metric filters and other fields that you can use in your configuration, see Monitoring metrics concepts.

You can configure autoscaling using the Google Cloud console, the gcloud CLI, or the Compute Engine API.

Console

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. If you do not have a managed instance group, create one. Otherwise, click the name of a MIG from the list to open the instance group overview page.

  3. On the instance group overview page, click Edit.

  4. If no autoscaling configuration exists:

    1. Under Autoscaling, click Configure autoscaling.
    2. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
  5. In the Autoscaling section, if a metric already exists, click it to edit it, or click Add metric to add another metric.

  6. Set the Metric type to Cloud Monitoring metric.

  7. In the Metric export scope field, select Single time series per group.

    The Monitoring service records the monitored data as a time series, which includes a set of time-stamped metric values, and information about the metric and its monitored resource.

  8. In the Metric identifier field, specify the metric name in the following format: example.googleapis.com/path/to/metric. If you use a custom metric, it must meet the custom metric requirements.

  9. Specify the Monitored resource type.

  10. If your metric has multiple values, you can provide an Additional filter expression to use an individual value for autoscaling. To learn more about metric filters, see Monitoring metrics concepts.

  11. In the Scaling policy field, select Single instance assignment.

  12. Provide a Single instance assignment value that represents the amount of work to assign to each VM in the MIG.

  13. When you are finished adding the metric, click Done.

  14. To close the Instance groups page, click Save.

gcloud

To configure autoscaling based on Monitoring metrics, use the set-autoscaling command.

In the command, specify the --stackdriver-metric-single-instance-assignment flag to indicate the amount of work that you expect each VM in the group to handle.

The following command creates an autoscaler based on work assignment for each VM.

gcloud compute instance-groups managed set-autoscaling MIG_NAME \
    --max-num-replicas=MAX_INSTANCES \
    --min-num-replicas=MIN_INSTANCES \
    --update-stackdriver-metric=METRIC_URL \
    --stackdriver-metric-filter="METRIC_FILTER" \
    --stackdriver-metric-single-instance-assignment=INSTANCE_ASSIGNMENT

Replace the following:

  • MIG_NAME: the name of the MIG where you want to add an autoscaler.
  • MAX_INSTANCES: the maximum number of VMs that the MIG can have.
  • MIN_INSTANCES: the minimum number of VMs that the MIG needs to have.
  • METRIC_URL: a protocol-free URL of a Monitoring metric. For example, compute.googleapis.com/instance_group/size. If you use a custom metric, it must meet the custom metric requirements.
  • METRIC_FILTER: a filter that selects an individual value from a metric that has multiple values and that specifies the monitored resource type. To learn more about metric filters, see Monitoring metrics concepts.
  • INSTANCE_ASSIGNMENT: the amount of work to assign to each VM instance in the MIG.

API

To configure autoscaling based on Monitoring metrics for a zonal MIG, use the autoscalers resource or, for a regional MIG, use the regionAutoscalers resource.

Use the singleInstanceAssignment parameter to specify the amount of work that you expect each VM to handle.

For example, make the following call to create an autoscaler that scales a zonal MIG based on the instance assignment.

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers

{
 "name": "AUTOSCALER_NAME",
 "target": "zones/ZONE/instanceGroupManagers/MIG_NAME",
 "autoscalingPolicy": {
  "maxNumReplicas": MAX_INSTANCES,
  "minNumReplicas": MIN_INSTANCES,
  "customMetricUtilizations": [
    {
      "metric": "METRIC_URL",
      "filter": "METRIC_FILTER",
      "singleInstanceAssignment": INSTANCE_ASSIGNMENT
    }
  ]
 }
}

Replace the following:

  • PROJECT_ID: your project ID.
  • ZONE: the zone where the MIG is located.
  • AUTOSCALER_NAME: the name of the autoscaler.
  • MIG_NAME: the name of the MIG where you want to add an autoscaler.
  • MAX_INSTANCES: the maximum number of VMs that the MIG can have.
  • MIN_INSTANCES: the minimum number of VMs that the MIG needs to have.
  • METRIC_URL: a protocol-free URL of a Monitoring metric. For example, compute.googleapis.com/instance_group/size. If you use a custom metric, it must meet the custom metric requirements.
  • METRIC_FILTER: a filter that selects an individual value from a metric that has multiple values and that specifies the monitored resource type. To learn more about metric filters, see Monitoring metrics concepts.
  • INSTANCE_ASSIGNMENT: the amount of work to assign to each VM instance in the MIG.

Examples for autoscaling based on metrics

This section provides some examples of metrics used for autoscaling. For a complete list of metrics, see Google Cloud metrics.

Autoscale based on a custom metric

Sometimes the metric that provides the relevant signal is not a total amount of available work or of another resource that applies to the group, but an average, a percentile, or some other statistical property. This example scales based on the group's average processing latency.

Assume the following setup:

  • A zonal MIG named our-instance-group is assigned to perform a particular task. The group is located in zone us-central1-a.
  • You have a Monitoring custom metric that exports a value that you would like to maintain at a particular level. For this example, assume the metric represents the average latency of processing queries assigned to the group.
    • The custom metric is named: custom.googleapis.com/example_average_latency.
    • The custom metric has a label with a key named group_name and value equal to the MIG's name, our-instance-group.
    • The custom metric exports data for the global monitored resource, that is, it is not associated with any specific VM.

You have determined that when the metric value is above some specific value, you need to add more VMs to the group to handle the load, while when it is below that value, you can free up some resources. You want the autoscaler to gradually add or remove VMs at a rate that is proportional to how much the metric is above or below the target. For this example, assume that you have determined your target value to be 250.

You can configure autoscaling for the group using a utilization target of 250, which represents the metric value that the autoscaler will attempt to maintain:

Console

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. Click the name of your MIG from the list to open the instance group overview page.

  3. On the instance group overview page, click Edit.

  4. If no autoscaling configuration exists:

    1. Under Autoscaling, click Configure autoscaling.
    2. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
  5. Under Autoscaling, in the Autoscaling metrics section, click Add metric.

  6. Set the Metric type to Cloud Monitoring metric.

  7. Set the following fields:

    • Metric export scope: Single time series per group
    • Metric identifier: custom.googleapis.com/example_average_latency
    • Monitored resource type: global
    • Additional filter expression: metric.labels.group_name= "our-instance-group"
    • Scaling policy: Utilization target
    • Utilization target: 250
    • Utilization target type: Gauge
  8. Click Done.

  9. When you are finished, click Save.

gcloud

gcloud compute instance-groups managed set-autoscaling our-instance-group \
  --zone=us-central1-a \
  --max-num-replicas=50 \
  --min-num-replicas=0 \
  --update-stackdriver-metric=custom.googleapis.com/example_average_latency \
  --stackdriver-metric-filter="metric.labels.group_name = \"our-instance-group\" AND resource.type = \"global\"" \
  --stackdriver-metric-utilization-target=250 \
  --stackdriver-metric-utilization-target-type=gauge

API

POST https://compute.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a/autoscalers
{
"name": "our-instance-group-autoscaler",
"target": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a/instanceGroupManagers/our-instance-group",
"autoscalingPolicy": {
  "maxNumReplicas": 50,
  "minNumReplicas": 0,
  "customMetricUtilizations": [
    {
      "filter": "metric.labels.group_name=\"our-instance-group\" AND resource.type = \"global\"",
      "utilizationTargetType": "GAUGE",
      "utilizationTarget": 250.0,
      "metric": "custom.googleapis.com/example_average_latency"
    }
  ]
}
}

Autoscale based on unacknowledged messages in Pub/Sub

Assume the following setup:

  • An active Pub/Sub topic receives messages from some source.
  • An active Pub/Sub subscription is connected to the topic in a pull configuration. The subscription is named our-subscription.
  • A pool of workers is pulling messages from that subscription and processing them. The pool is a zonal MIG named our-instance-group and is located in zone us-central1-a. The pool must not exceed 100 workers, and should scale in to 0 workers when there are no unacknowledged messages.
  • On average, a worker processes a single message in one minute.

To determine the optimal instance assignment value, consider several approaches:

  • To process all unacknowledged messages as fast as possible, you can choose 1 as the instance assignment value. This creates one VM instance for each unacknowledged message (limited to the maximum number of VMs in our group). However, this can cause overprovisioning. In the worst case, a VM is created to process just one message before the autoscaler shuts it down, which consumes resources for much longer than doing actual work.
    • Note that if the workers are able to process multiple messages concurrently, it makes sense to increase the value to the number of concurrent processes.
    • Note that, in this example, it does not make sense to set the value below 1 because one message cannot be processed by more than one worker.
  • Alternatively, if processing latency is less important than resource utilization and overhead costs, you can calculate how many messages each VM must process within its lifetime to be considered efficiently utilized. Take into account startup and shutdown time and the fact that autoscaling does not immediately delete VMs. For example, assuming that startup and shutdown take about 5 minutes in total and given that the autoscaler deletes VMs only after the 10-minute stabilization period, you calculate that it is efficient to create an additional VM in the group as long as it can process at least 15 messages before the autoscaler shuts it down, which results in, at most, 25% overhead due to the total time it takes to create, start, and shut down the VM. In this case, you can choose 15 as the instance assignment value.
  • Both approaches can be balanced out, resulting in a number between 1 and 15, depending on which factor takes priority, processing latency versus resource utilization.
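The trade-off between the approaches above can be checked with a short calculation. This sketch is one way to read the arithmetic, assuming one minute per message and 5 minutes of combined startup and shutdown time as in the example. The function name is hypothetical, and the calculation ignores any time a VM sits idle during the stabilization period.

```python
def overhead_fraction(messages_per_vm: int,
                      minutes_per_message: float = 1.0,
                      startup_shutdown_minutes: float = 5.0) -> float:
    """Share of a VM's lifetime spent starting up and shutting down."""
    working_minutes = messages_per_vm * minutes_per_message
    return startup_shutdown_minutes / (startup_shutdown_minutes + working_minutes)


# Compare a few candidate instance assignment values.
for assignment in (1, 5, 15):
    print(assignment, f"{overhead_fraction(assignment):.0%}")
# 1 83%
# 5 50%
# 15 25%
```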

Looking at the available Pub/Sub metrics, we find a metric that represents the number of unacknowledged messages in a subscription: subscription/num_undelivered_messages.

Note that this metric exports the total number of messages in the subscription, including messages that are currently being processed but that are not yet acknowledged. Using a metric that does not include the messages being processed is not recommended because such a metric can drop down to 0 when there is still work being done, which prompts autoscaling to scale in and possibly interrupt the actual work.

You can configure autoscaling for the unacknowledged messages in the subscription:

Console

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. Click the name of your MIG from the list to open the instance group overview page.

  3. On the instance group overview page, click Edit.

  4. If no autoscaling configuration exists:

    1. Under Autoscaling, click Configure autoscaling.
    2. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
  5. Under Autoscaling, in the Autoscaling metrics section, click Add metric.

  6. Set the Metric type to Cloud Monitoring metric.

  7. Set the following fields:

    • Metric export scope: Single time series per group
    • Metric identifier: pubsub.googleapis.com/subscription/num_undelivered_messages
    • Monitored resource type: pubsub_subscription
    • Additional filter expression: resource.labels.subscription_id= "our-subscription"
    • Scaling policy: Single instance assignment
    • Single instance assignment: 15
  8. Click Done.

  9. When you are finished, click Save.

gcloud

gcloud compute instance-groups managed set-autoscaling our-instance-group \
  --zone=us-central1-a \
  --max-num-replicas=100 \
  --min-num-replicas=0 \
  --update-stackdriver-metric=pubsub.googleapis.com/subscription/num_undelivered_messages \
  --stackdriver-metric-filter="resource.type=\"pubsub_subscription\" AND resource.labels.subscription_id=\"our-subscription\"" \
  --stackdriver-metric-single-instance-assignment=15

API

POST https://compute.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a/autoscalers
{
"name": "our-instance-group-autoscaler",
"target": "https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a/instanceGroupManagers/our-instance-group",
"autoscalingPolicy": {
  "maxNumReplicas": 100,
  "minNumReplicas": 0,
  "customMetricUtilizations": [
    {
      "singleInstanceAssignment": 15.0,
      "filter": "resource.type = \"pubsub_subscription\" AND resource.labels.subscription_id=\"our-subscription\"",
      "metric": "pubsub.googleapis.com/subscription/num_undelivered_messages"
    }
  ]
}
}

Autoscale based on incoming network traffic

Configure autoscaling based on the incoming network traffic to VMs in your MIG:

Console

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. Click the name of your MIG from the list to open the instance group overview page.

  3. On the instance group overview page, click Edit.

  4. If no autoscaling configuration exists:

    1. Under Autoscaling, click Configure autoscaling.
    2. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
  5. Under Autoscaling, in the Autoscaling metrics section, click Add metric.

  6. Set the Metric type to Cloud Monitoring metric.

  7. Set the following fields:

    • Metric export scope: Time series per instance
    • Metric identifier: compute.googleapis.com/instance/network/received_bytes_count
    • Utilization target: TARGET_VALUE
    • Utilization target type: TARGET_TYPE
  8. Click Done.

  9. When you are finished, click Save.

gcloud

gcloud compute instance-groups managed set-autoscaling MIG_NAME \
  --max-num-replicas=MAX_INSTANCES \
  --min-num-replicas=MIN_INSTANCES \
  --update-stackdriver-metric=compute.googleapis.com/instance/network/received_bytes_count \
  --stackdriver-metric-utilization-target=TARGET_VALUE \
  --stackdriver-metric-utilization-target-type=TARGET_TYPE

API

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers
{
"name": "AUTOSCALER_NAME",
"target": "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME",
"autoscalingPolicy": {
  "maxNumReplicas": MAX_INSTANCES,
  "minNumReplicas": MIN_INSTANCES,
  "customMetricUtilizations": [
    {
      "utilizationTargetType": "TARGET_TYPE",
      "utilizationTarget": TARGET_VALUE,
      "metric": "compute.googleapis.com/instance/network/received_bytes_count"
    }
  ]
}
}

Autoscale based on memory usage

To configure autoscaling based on the percent of used memory, specify the memory/percent_used metric and filter it to use only the used memory state. If you do not specify the filter, then the autoscaler takes the sum of memory usage by all memory states labeled as buffered, cached, free, slab, and used.
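To see why the filter matters, consider a single VM that reports one value per memory state. The readings below are hypothetical, and `memory_signal` is an illustrative helper rather than anything the autoscaler or the Monitoring agent exposes.

```python
from typing import Dict, Optional

# Hypothetical percent_used readings for one VM, one time series per state.
readings: Dict[str, float] = {
    "buffered": 5.0,
    "cached": 20.0,
    "free": 30.0,
    "slab": 3.0,
    "used": 42.0,
}


def memory_signal(by_state: Dict[str, float], state: Optional[str] = None) -> float:
    """Return the value for one state, or the sum of all states if unfiltered."""
    if state is None:
        return sum(by_state.values())
    return by_state[state]


print(memory_signal(readings))          # 100.0
print(memory_signal(readings, "used"))  # 42.0
```

Without the filter, the summed value in this sketch stays at 100 regardless of load, so it carries no useful scaling signal.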

Console

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. Click the name of your MIG from the list to open the instance group overview page.

  3. On the instance group overview page, click Edit.

  4. If no autoscaling configuration exists:

    1. Under Autoscaling, click Configure autoscaling.
    2. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
  5. Under Autoscaling, in the Autoscaling metrics section, click Add metric.

  6. Set the Metric type to Cloud Monitoring metric.

  7. Set the following fields:

    • Metric export scope: Time series per instance
    • Metric identifier: agent.googleapis.com/memory/percent_used
    • Additional filter expression: metric.labels.state="used"
    • Utilization target: TARGET_VALUE
    • Utilization target type: Gauge
  8. Click Done.

  9. When you are finished, click Save.

gcloud

gcloud compute instance-groups managed set-autoscaling MIG_NAME \
  --max-num-replicas=MAX_INSTANCES \
  --min-num-replicas=MIN_INSTANCES \
  --update-stackdriver-metric=agent.googleapis.com/memory/percent_used \
  --stackdriver-metric-filter="metric.labels.state = \"used\"" \
  --stackdriver-metric-utilization-target-type=gauge \
  --stackdriver-metric-utilization-target=TARGET_VALUE

API

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers
{
"name": "AUTOSCALER_NAME",
"target": "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME",
"autoscalingPolicy": {
  "maxNumReplicas": MAX_INSTANCES,
  "minNumReplicas": MIN_INSTANCES,
  "customMetricUtilizations": [
  {
    "filter": "metric.labels.state=\"used\"",
    "utilizationTargetType": "GAUGE",
    "utilizationTarget": TARGET_VALUE,
    "metric": "agent.googleapis.com/memory/percent_used"
  }
  ]
}
}

Autoscale based on disk I/O

Configure autoscaling based on the total count of disk I/O operations:

Console

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. Click the name of your MIG from the list to open the instance group overview page.

  3. On the instance group overview page, click Edit.

  4. If no autoscaling configuration exists:

    1. Under Autoscaling, click Configure autoscaling.
    2. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
  5. Under Autoscaling, in the Autoscaling metrics section, click Add metric.

  6. Set the Metric type to Cloud Monitoring metric.

  7. Set the following fields:

    • Metric export scope: Time series per instance
    • Metric identifier: agent.googleapis.com/disk/operation_count
    • Utilization target: TARGET_VALUE
    • Utilization target type: TARGET_TYPE
  8. Click Done.

  9. When you are finished, click Save.

gcloud

gcloud compute instance-groups managed set-autoscaling MIG_NAME \
  --max-num-replicas=MAX_INSTANCES \
  --min-num-replicas=MIN_INSTANCES \
  --update-stackdriver-metric=agent.googleapis.com/disk/operation_count \
  --stackdriver-metric-utilization-target=TARGET_VALUE \
  --stackdriver-metric-utilization-target-type=TARGET_TYPE

API

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers
{
"name": "AUTOSCALER_NAME",
"target": "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME",
"autoscalingPolicy": {
  "maxNumReplicas": MAX_INSTANCES,
  "minNumReplicas": MIN_INSTANCES,
  "customMetricUtilizations": [
  {
    "utilizationTargetType": "TARGET_TYPE",
    "utilizationTarget": TARGET_VALUE,
    "metric": "agent.googleapis.com/disk/operation_count"
  }
  ]
}
}

Autoscale based on size of another MIG

You can autoscale a MIG based on the size of another MIG. For example, you can have a multi-tier application with a frontend MIG that autoscales based on a load balancer and a backend MIG that autoscales proportionally to the frontend. Use a single instance assignment to define the ratio between frontend and backend VMs. Because the autoscaler divides the metric value (here, the number of frontend VMs) by the single instance assignment, a value of 0.25 keeps 4 backend VMs for every frontend VM, and a value of 4 keeps 1 backend VM for every 4 frontend VMs.
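
The proportional rule for single instance assignment, in which the autoscaler divides the metric value by the assignment, can be sketched in Python as follows. The function name `vms_needed` is hypothetical, and rounding up to a whole VM is an assumption made for illustration.

```python
import math


def vms_needed(metric_value, single_instance_assignment):
    """Divide the metric value by the work assigned to each VM.

    Assumption: fractional results are rounded up to a whole VM.
    """
    if single_instance_assignment <= 0:
        raise ValueError("single instance assignment must be positive")
    return math.ceil(metric_value / single_instance_assignment)


# A metric value of 100 with an assignment of 5 gives 20 VMs.
print(vms_needed(100, 5))
# When the metric is the size of another MIG: 8 frontend VMs with an
# assignment of 4 give 2 backend VMs (1 backend for every 4 frontend VMs).
print(vms_needed(8, 4))
```

The result is still bounded by the minimum and maximum number of instances configured for the MIG.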

To autoscale a MIG (MIG_1) based on the size of another MIG (MIG_2):

Console

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. Click the name of your MIG from the list to open the instance group overview page.

  3. On the instance group overview page, click Edit.

  4. If no autoscaling configuration exists:

    1. Under Autoscaling, click Configure autoscaling.
    2. Under Autoscaling mode, select On: add and remove instances to the group to enable autoscaling.
  5. Under Autoscaling, in the Autoscaling metrics section, click Add metric.

  6. Set the Metric type to Cloud Monitoring metric.

  7. Set the following fields:

    • Metric export scope: Single time series per group
    • Metric identifier: compute.googleapis.com/instance_group/size
    • Monitored resource type: instance_group
    • Additional filter expression: resource.labels.location = "ZONE|REGION" AND resource.labels.instance_group_name = "MIG_2"

      ZONE|REGION: The zone or region location of MIG_2

    • Scaling policy: Single instance assignment

    • Single instance assignment: 0.25

  8. Click Done.

  9. When you are finished, click Save.

gcloud

gcloud compute instance-groups managed set-autoscaling MIG_1 \
  --max-num-replicas=MAX_INSTANCES \
  --min-num-replicas=MIN_INSTANCES \
  --update-stackdriver-metric=compute.googleapis.com/instance_group/size \
  --stackdriver-metric-filter="resource.type = \"instance_group\" AND resource.labels.location = \"ZONE|REGION\" AND resource.labels.instance_group_name = \"MIG_2\"" \
  --stackdriver-metric-single-instance-assignment=0.25

API

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/autoscalers
{
"name": "AUTOSCALER_NAME",
"target": "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_1",
"autoscalingPolicy": {
  "maxNumReplicas": MAX_INSTANCES,
  "minNumReplicas": MIN_INSTANCES,
  "customMetricUtilizations": [
  {
    "singleInstanceAssignment": 0.25,
    "filter": "resource.type = \"instance_group\" AND resource.labels.location = \"ZONE|REGION\" AND resource.labels.instance_group_name = \"MIG_2\"",
    "metric": "compute.googleapis.com/instance_group/size"
  }
  ]
}
}
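
The API request bodies in the preceding examples share one shape, so they can be generated from a small helper. The following Python sketch only builds the JSON body and does not send the request. The helper names `custom_metric_entry` and `autoscaler_body`, the sample values, and the rule that each entry sets either a utilization target or a single instance assignment (as every example in this document does) are assumptions made for illustration.

```python
import json


def custom_metric_entry(metric, *, utilization_target=None, target_type=None,
                        single_instance_assignment=None, metric_filter=None):
    """Build one item for the customMetricUtilizations list."""
    has_target = utilization_target is not None
    has_assignment = single_instance_assignment is not None
    if has_target == has_assignment:
        # Assumption: an entry uses exactly one of the two scaling modes.
        raise ValueError(
            "set exactly one of utilization_target or single_instance_assignment")
    entry = {"metric": metric}
    if metric_filter:
        entry["filter"] = metric_filter
    if has_target:
        entry["utilizationTarget"] = utilization_target
        if target_type:
            entry["utilizationTargetType"] = target_type
    else:
        entry["singleInstanceAssignment"] = single_instance_assignment
    return entry


def autoscaler_body(name, target_url, min_instances, max_instances, entries):
    """Assemble the autoscaler resource with the fields used in this document."""
    return {
        "name": name,
        "target": target_url,
        "autoscalingPolicy": {
            "minNumReplicas": min_instances,
            "maxNumReplicas": max_instances,
            "customMetricUtilizations": list(entries),
        },
    }


# Hypothetical values: a memory-based entry with a target of 60.
memory_entry = custom_metric_entry(
    "agent.googleapis.com/memory/percent_used",
    utilization_target=60,
    target_type="GAUGE",
    metric_filter='metric.labels.state="used"',
)
body = autoscaler_body(
    "example-autoscaler",
    "https://www.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instanceGroupManagers/MIG_NAME",
    1, 10, [memory_entry],
)
print(json.dumps(body, indent=2))
```

Using `json.dumps` handles the escaping of the double quotes inside the filter, which is easy to get wrong when writing the body by hand.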

Monitoring metrics concepts

This section provides a brief description of the Monitoring metrics concepts that you need to know while configuring autoscaling based on Monitoring metrics.

  • Metric identifier or Metric URL: The metric name in the form of a protocol-free URL. You can find the URLs for built-in metrics in the metric list.

    For example, the URL of a Pub/Sub metric that gives the number of unacknowledged messages is pubsub.googleapis.com/subscription/num_undelivered_messages.

  • Monitored resource type: The source of the metric value. You can find the monitored resource type of a metric in the metric list.

    For example, the monitored resource type of the pubsub.googleapis.com/subscription/num_undelivered_messages metric is pubsub_subscription. For more details about each monitored resource type, see Monitored resource types.

  • Metric filter: When your metric has multiple values, a filter enables the autoscaler to identify a specific metric value from the set of possible metric values. Use the labels defined on a metric and a monitored resource type to filter the values. If you want to explore your metric values with different filters, you can try them in the metrics explorer.

    For example, the following screenshot shows the pubsub.googleapis.com/subscription/num_undelivered_messages metric, which gives the number of unacknowledged messages in all available subscriptions. Each line on the chart indicates a subscription.

    Metric explorer showing metric values without filter.

    Without a filter, the autoscaler takes the sum of metric values from all subscriptions. To autoscale based on a single subscription, apply a filter on the subscription_id label defined for the pubsub_subscription monitored resource type. The following screenshot shows a single subscription after applying the filter.

    Metric explorer showing filtered metric value.

Metric filtering requirements

When you use a metric that has multiple values (categorized using labels), you can apply a filter to autoscale based on specific values from the metric. If the filter returns multiple values, then the values are added together. For best results, create a filter that is specific enough to return a single value.
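
The effect of adding matching values together can be illustrated with a short Python sketch. The data, the subscription names, and the `aggregate` helper are hypothetical; the sketch only mimics the summing behavior described above and does not call the Monitoring service.

```python
def aggregate(series, label_filter=None):
    """Sum the values of all time series whose labels match the filter.

    `series` is a list of (labels, value) pairs. An empty filter matches
    every time series, so all values are added together.
    """
    label_filter = label_filter or {}
    return sum(
        value
        for labels, value in series
        if all(labels.get(key) == wanted for key, wanted in label_filter.items())
    )


# Hypothetical unacknowledged-message counts for three subscriptions.
series = [
    ({"subscription_id": "orders"}, 120),
    ({"subscription_id": "emails"}, 30),
    ({"subscription_id": "logs"}, 50),
]

print(aggregate(series))                                 # all subscriptions summed
print(aggregate(series, {"subscription_id": "orders"}))  # a single subscription
```

A filter that matches exactly one time series makes the autoscaling signal depend only on that one source of work.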

Autoscaler filtering for metrics is compatible with the Monitoring filter syntax. The filter must meet the following requirements:

  • You must wrap the value of a filter in double quotes.
  • You must use the direct equality comparison operator (=).
  • You must use the AND operator to join different filter criteria.

    For example: --stackdriver-metric-filter="resource.type=\"pubsub_subscription\" AND resource.labels.subscription_id=\"our-subscription\"".

  • You must use direct values. You cannot use wildcards or functions in the filter.

    For example, you cannot use resource.labels.zone = starts_with("us-").

  • You cannot use resource metadata labels that store metadata about a monitored resource.

For a full reference of metric labels and monitored resource labels that you can filter on, see metrics list and monitored resources list.
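
As a rough aid, the following Python sketch assembles a filter string that follows the requirements listed above: every value is a direct value wrapped in double quotes, every comparison uses `=`, and criteria are joined with `AND`. The helper `build_filter` is hypothetical, and the checks are assumptions: values containing quotes, parentheses, asterisks, or backslashes are rejected, and selectors that start with `metadata.` are treated as resource metadata labels.

```python
import re

# Characters that suggest a function call, a wildcard, or a nested quote.
_FORBIDDEN_IN_VALUE = re.compile(r'["()*\\]')


def build_filter(criteria):
    """Build a filter string from a mapping of selector to direct value."""
    parts = []
    for selector, value in criteria.items():
        if selector.startswith("metadata."):
            # Assumption: this prefix identifies resource metadata labels.
            raise ValueError(f"metadata labels are not supported: {selector}")
        if not isinstance(value, str) or _FORBIDDEN_IN_VALUE.search(value):
            raise ValueError(f"not a direct value: {value!r}")
        parts.append(f'{selector}="{value}"')
    return " AND ".join(parts)


print(build_filter({
    "resource.type": "pubsub_subscription",
    "resource.labels.subscription_id": "our-subscription",
}))
```

When the resulting string is passed on a shell command line, the inner double quotes still need to be escaped, as in the --stackdriver-metric-filter example above.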

Custom metric requirements

To use custom metrics, you must first create a custom metric. For information about creating a custom metric, see Using custom metrics.

A custom metric used for autoscaling must have the following properties:

  • If the autoscaling configuration uses data from each VM in the group, set up instances in your MIG so that each VM exports the custom metric. The exported values from each VM must be associated with a gce_instance monitored resource, which contains the following labels:
    • zone with the name of the zone the instance is in.
    • instance_id with the unique numerical ID assigned to the VM.
  • The metric must export data at least every 60 seconds. If you export data more often than every 60 seconds, the autoscaler can respond to load changes more quickly. If you export data less frequently than every 60 seconds, the autoscaler might not respond to load changes quickly enough.
  • The metric must export int64 or double data values.
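
The requirements above can be sketched as a small Python check. The payload shape loosely follows a Monitoring API time series and should be treated as an assumption to verify against the API reference; the helper names and the metric type used in the example are hypothetical. Only the two labels named in this section are included, and a real request might need additional labels.

```python
from datetime import datetime, timezone

MAX_EXPORT_GAP_SECONDS = 60  # data must be exported at least every 60 seconds


def per_instance_point(metric_type, zone, instance_id, value, unix_time):
    """Build one data point associated with a gce_instance monitored resource."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("value must be an int64 or double")
    end_time = datetime.fromtimestamp(unix_time, tz=timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%SZ")
    return {
        "metric": {"type": metric_type},
        "resource": {
            "type": "gce_instance",
            "labels": {"zone": zone, "instance_id": str(instance_id)},
        },
        # For simplicity, this sketch always exports the value as a double.
        "points": [{"interval": {"endTime": end_time},
                    "value": {"doubleValue": float(value)}}],
    }


def exports_often_enough(unix_times):
    """Return True if no gap between consecutive exports exceeds 60 seconds."""
    ordered = sorted(unix_times)
    return all(later - earlier <= MAX_EXPORT_GAP_SECONDS
               for earlier, later in zip(ordered, ordered[1:]))


point = per_instance_point(
    "custom.googleapis.com/example/queue_depth",  # hypothetical metric type
    "us-central1-a", 1234567890123456789, 42, 0)
print(point["resource"]["labels"])
print(exports_often_enough([0, 30, 60, 120]))
print(exports_often_enough([0, 90]))
```
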

What's next