Scaling Based on Stackdriver Monitoring Metrics

You can set up the autoscaler to scale based on a standard metric provided by the Stackdriver Monitoring service, or on a custom Stackdriver Monitoring metric.

Before you begin

If you haven't already, review the Before you begin section for important setup steps.

Standard metrics

Stackdriver Monitoring provides a set of standard metrics that you can use to monitor your virtual machine instances. However, not every standard metric is a valid utilization metric that the autoscaler can use for scaling.

A valid utilization metric for scaling meets the following criteria:

  1. The standard metric must contain data for a gce_instance monitored resource. You can use the timeSeries.list API call to verify whether a specific metric exports data for this resource.

  2. The standard metric describes how busy an instance is, and the metric value increases or decreases proportionally to the number of virtual machine instances in the group.

    The following is an invalid metric because the value does not change based on utilization and the autoscaler cannot use the value to scale proportionally:

Once you select a standard metric to use for autoscaling, you can enable autoscaling using that metric.
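To make the proportionality requirement concrete, here is a minimal Python sketch of a simplified scaling model. It is an assumption made for illustration, not the autoscaler's actual algorithm: the function name recommended_size and the rounding rule are hypothetical. It only shows why a metric that tracks per-instance load lets a target value translate into a group size.

```python
import math


def recommended_size(current_size: int, observed_avg: float, target: float) -> int:
    """Simplified proportional model (hypothetical, for illustration only).

    If the total load stays constant, resizing the group by the ratio
    observed_avg / target brings the per-instance average back to the target.
    """
    if target <= 0:
        raise ValueError("target must be a positive number")
    # Round up so the group is never undersized, and keep at least one instance.
    return max(1, math.ceil(current_size * observed_avg / target))


# Four instances averaging 15 against a target of 10: grow the group.
print(recommended_size(4, 15.0, 10.0))  # 6
# Four instances averaging 5 against a target of 10: shrink the group.
print(recommended_size(4, 5.0, 10.0))   # 2
```

A metric whose value does not move when instances are added or removed gives this kind of calculation nothing to converge on, which is why such metrics are not valid for scaling.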

Custom metrics

You can create custom metrics using Stackdriver Monitoring and write your own monitoring data to the Stackdriver Monitoring service. This gives you side-by-side access to standard Cloud Platform data and your custom monitoring data, with a familiar data structure and consistent query syntax. If you have a custom metric, you can choose to scale based on the data from that metric.


To use custom metrics, you must first do the following:

  1. Create a custom metric. For information on creating a custom metric, see the Custom Metrics documentation.
  2. Set up your managed instance group to export the custom metric from all instances in the managed instance group.

Choose a valid custom metric

Not all custom metrics can be used by the autoscaler. A valid custom metric must have all of the following properties:

  • The metric must be a per-instance metric. The metric must export data relevant to each specific Compute Engine instance separately.
  • The exported per-instance values must be associated with a gce_instance
    monitored resource, which contains the following labels:
    • zone with the name of the zone the instance is in.
    • instance_id with the unique numerical ID assigned to the instance.
  • The metric must export data at least every 60 seconds. If you export data more often than every 60 seconds, the autoscaler can respond faster to load changes. If you export data less often than every 60 seconds, the autoscaler might not be able to respond quickly enough to load changes.
  • The metric must be a valid utilization metric, which means that data from the metric can be used to proportionally scale up or down the number of virtual machines.
  • The metric must export int64 or double data values.

For the autoscaler to work with your custom metric, you must export data for this custom metric from all the instances in the managed instance group.
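The checklist above can be expressed as a small validation routine. The following Python sketch is illustrative only: MetricSummary and validation_problems are hypothetical names, not part of any Stackdriver or Compute Engine client library, and the fields simply mirror the properties listed above.

```python
from dataclasses import dataclass

# Labels the text above requires on the gce_instance monitored resource.
REQUIRED_LABELS = {"zone", "instance_id"}
VALID_VALUE_TYPES = {"INT64", "DOUBLE"}
MAX_EXPORT_INTERVAL_SEC = 60


@dataclass
class MetricSummary:
    """Hypothetical description of a custom metric (not a Stackdriver API type)."""
    resource_type: str
    resource_labels: frozenset
    value_type: str
    export_interval_sec: float
    per_instance: bool
    is_utilization: bool


def validation_problems(metric: MetricSummary) -> list:
    """Return the reasons the metric fails the checklist (empty if it passes)."""
    problems = []
    if not metric.per_instance:
        problems.append("metric is not exported per instance")
    if metric.resource_type != "gce_instance":
        problems.append("values are not tied to a gce_instance resource")
    missing = REQUIRED_LABELS - set(metric.resource_labels)
    if missing:
        problems.append("missing resource labels: " + ", ".join(sorted(missing)))
    if metric.export_interval_sec > MAX_EXPORT_INTERVAL_SEC:
        problems.append("data is exported less often than every 60 seconds")
    if not metric.is_utilization:
        problems.append("metric is not a utilization metric")
    if metric.value_type.upper() not in VALID_VALUE_TYPES:
        problems.append("value type must be int64 or double")
    return problems


good = MetricSummary("gce_instance", frozenset({"zone", "instance_id"}), "DOUBLE", 30, True, True)
slow = MetricSummary("gce_instance", frozenset({"zone"}), "STRING", 120, True, True)
print(validation_problems(good))  # []
print(validation_problems(slow))
```

Whether a metric is per-instance and whether it reflects utilization are judgments about your application, so the sketch takes them as inputs rather than trying to infer them.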

Note: You can get an instance's numerical ID by making a request for the metadata server's ID property from within the instance. For example, you can do this in curl:
curl -H Metadata-Flavor:Google
For more information on using the metadata server, see Metadata Server.

Enable autoscaling using monitoring metrics

The process of setting up an autoscaler for a standard or custom metric is the same. To create an autoscaler that uses Stackdriver Monitoring metrics, you must provide the desired target utilization level, the custom metric name, and the utilization target type. Each of these properties is described briefly below:

  • Target type: This defines how the autoscaler computes the data collected from the instances. The possible target types are:

    • GAUGE: The autoscaler computes the average value of the data collected in the last couple of minutes and compares that to the target utilization value of the autoscaler.
    • DELTA_PER_MINUTE: The autoscaler calculates the average rate of growth per minute and compares that to the target utilization.
    • DELTA_PER_SECOND: The autoscaler calculates the average rate of growth per second and compares that to the target utilization.

    If you express your desired target utilization per second, use DELTA_PER_SECOND; likewise, use DELTA_PER_MINUTE if you express your target utilization per minute, so that the autoscaler can perform accurate comparisons.

  • Custom metric name: The name of the custom metric to use. You defined this name when initially creating the metric, in the format:

    See Creating a metric descriptor for more information.

  • Target utilization level: The target utilization level that this autoscaler should maintain for this metric. This must be a positive number. For example, both 24.5 and 1100 are acceptable values. Note that this differs from CPU and load balancing utilization, which must be a float between 0.0 and 1.0.
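As a rough illustration of how the three target types differ, the following Python sketch derives one comparable number from a short series of samples. This is one plausible reading of the descriptions above, not the autoscaler's actual computation: the function observed_value and the (timestamp, value) sample format are assumptions made for the example.

```python
def observed_value(samples, target_type):
    """Reduce (timestamp_sec, value) samples to one number for comparison.

    Hypothetical helper: GAUGE averages the sampled values, while the DELTA
    types measure how fast the value grew between the first and last sample.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if target_type == "GAUGE":
        return sum(value for _, value in samples) / len(samples)

    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("DELTA types need samples spanning a positive interval")
    per_second = (v_last - v_first) / elapsed
    if target_type == "DELTA_PER_SECOND":
        return per_second
    if target_type == "DELTA_PER_MINUTE":
        return per_second * 60
    raise ValueError("unknown target type: " + target_type)


# A value that grows by 60 every minute, sampled three times.
samples = [(0, 100.0), (60, 160.0), (120, 220.0)]
print(observed_value(samples, "GAUGE"))             # 160.0
print(observed_value(samples, "DELTA_PER_SECOND"))  # 1.0
print(observed_value(samples, "DELTA_PER_MINUTE"))  # 60.0
```

The same data yields 1.0 per second or 60.0 per minute, which is why the target utilization and the target type need to use the same time unit.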


  1. Go to the Instance Groups page.
  2. If you do not have an instance group, create one. Otherwise, click on an instance group from the list.
  3. On the instance group details page, click the Edit Group button.
  4. Under Autoscaling, select On from the drop-down menu to turn on autoscaling.
  5. In the Autoscale based on section, select Monitoring metric.
  6. Enter the metric name, the target value, and the type of metric in the respective text boxes. Keep in mind that your metric name will be in the format:
  7. Save your changes once you're ready.


For example, in gcloud, the following command creates an autoscaler that uses the GAUGE target type. Along with the --custom-metric-utilization flag, the --max-num-replicas flag is required when you create an autoscaler. In this example, some/metric/name stands in for the name of your metric:

gcloud compute instance-groups managed set-autoscaling example-managed-instance-group \
    --custom-metric-utilization metric=some/metric/name,utilization-target-type=GAUGE,utilization-target=10 \
    --max-num-replicas 20 \
    --cool-down-period 90

Optionally, you can use the --cool-down-period flag, which tells the autoscaler how many seconds to wait after a new virtual machine has started before the autoscaler starts collecting usage information from it. This accounts for the amount of time it might take for the virtual machine to initialize, during which the collected usage is not reliable for autoscaling. The default cool-down period is 60 seconds.
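The effect of the cool-down period can be pictured as a filter on which instances contribute usage data. The Python sketch below is a simplified model under that assumption; the function instances_past_cool_down and the dictionary of start times are hypothetical and do not correspond to any Compute Engine API.

```python
def instances_past_cool_down(start_times, now, cool_down_period_sec=60):
    """Return the instances old enough for their usage data to be trusted.

    start_times maps an instance name to its start time in seconds; now is
    the current time in the same units. Both are illustrative inputs.
    """
    return [
        name
        for name, started in start_times.items()
        if now - started >= cool_down_period_sec
    ]


start_times = {"vm-a": 0, "vm-b": 50}
# With a 90-second cool-down, vm-b (50 seconds old) is still initializing.
print(instances_past_cool_down(start_times, now=100, cool_down_period_sec=90))  # ['vm-a']
```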

To see a full list of available gcloud commands and flags, see the gcloud reference.


Note: Although autoscaling is a feature of managed instance groups, it is a separate API resource. Keep that in mind when you construct API requests for autoscaling.

In the API, make a POST request to the following URL, replacing myproject with your own project ID and us-central1-f with the zone of your choice:


Your request body must contain the name, target, and autoscalingPolicy fields. In autoscalingPolicy, provide the maxNumReplicas and customMetricUtilizations properties.

Optionally, you can use the coolDownPeriodSec parameter, which tells the autoscaler how many seconds to wait after a new instance has started before it starts to collect usage. After the cool-down period passes, the autoscaler begins to collect usage information from the new instance and determines if the group requires additional instances. This accounts for the amount of time it might take for the instance to initialize, during which the collected usage is not reliable for autoscaling. The default cool-down period is 60 seconds.


{
  "name": "example-autoscaler",
  "target": "zones/us-central1-f/instanceGroupManagers/example-managed-instance-group",
  "autoscalingPolicy": {
    "maxNumReplicas": 10,
    "coolDownPeriodSec": 90,
    "customMetricUtilizations": [
      {
        "metric": "some/metric/name",
        "utilizationTarget": 10,
        "utilizationTargetType": "GAUGE"
      }
    ]
  }
}
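If you generate the request body programmatically, a small helper can keep the nesting correct and catch invalid values early. The following Python sketch uses only the standard library and the field names shown above; the helper build_autoscaler_body is a hypothetical name, and the sketch stops at producing JSON and does not send a request.

```python
import json

VALID_TARGET_TYPES = {"GAUGE", "DELTA_PER_MINUTE", "DELTA_PER_SECOND"}


def build_autoscaler_body(name, target, metric, utilization_target,
                          target_type, max_num_replicas,
                          cool_down_period_sec=60):
    """Assemble an autoscaler request body (hypothetical helper)."""
    if target_type not in VALID_TARGET_TYPES:
        raise ValueError("unknown utilization target type: " + target_type)
    if utilization_target <= 0:
        raise ValueError("utilization target must be a positive number")
    return {
        "name": name,
        "target": target,
        "autoscalingPolicy": {
            "maxNumReplicas": max_num_replicas,
            "coolDownPeriodSec": cool_down_period_sec,
            # A list, so each metric gets its own object.
            "customMetricUtilizations": [
                {
                    "metric": metric,
                    "utilizationTarget": utilization_target,
                    "utilizationTargetType": target_type,
                }
            ],
        },
    }


body = build_autoscaler_body(
    name="example-autoscaler",
    target="zones/us-central1-f/instanceGroupManagers/example-managed-instance-group",
    metric="some/metric/name",  # placeholder metric name from the example above
    utilization_target=10,
    target_type="GAUGE",
    max_num_replicas=10,
    cool_down_period_sec=90,
)
print(json.dumps(body, indent=2))
```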
