Regional managed instance groups

You can use a regional managed instance group (MIG) to increase the resilience of your MIG-based workload. A regional MIG distributes your virtual machine (VM) instances across multiple zones in a region, which protects you from extreme cases where all instances in a single zone fail.

This page contains conceptual information about regional MIGs.

To learn how to create a regional MIG, see Creating and managing regional MIGs.

Why choose regional managed instance groups?

Google recommends regional MIGs over zonal MIGs for the following reasons:

  • You can use regional MIGs to manage up to 2,000 instances, twice as many as zonal MIGs.
  • You can use regional MIGs to spread your application load across multiple zones, rather than confining your application to a single zone or managing multiple zonal MIGs across different zones.

Using multiple zones protects against zonal failures and unforeseen scenarios where an entire group of instances in a single zone malfunctions. If that happens, your application can continue serving traffic from instances running in another zone in the same region.

In the case of a zonal failure, or if a group of instances in a zone stops responding, a regional MIG continues supporting your instances as follows:

  • The instances that are part of the regional MIG in the remaining zones continue to serve traffic. No new instances are added and no instances are redistributed (unless you set up autoscaling).

  • After the failed zone has recovered, the MIG starts serving traffic again from that zone.

When designing for robust and scalable applications, use regional MIGs.

Limitations

  • With a regional MIG, you can create up to 2,000 VMs in a region, with a maximum of 1,000 VMs per zone. With a zonal MIG, you can create up to 1,000 VMs. If you need more, contact support.
  • When updating a MIG, you can specify up to 1,000 VMs in a single request.
  • If you want a stateful MIG, review the stateful MIG limitations.

  • If you want to use load balancing with a regional MIG, the following limitations apply:

    • You cannot use the maxRate balancing mode.
    • If you use an HTTP(S) load balancing scheme with a regional MIG, you must choose the maxRatePerInstance or maxUtilization balancing mode.
  • If you want to autoscale a regional MIG, the following limitations apply:

    • To scale in and out, you must enable proactive instance redistribution. If you set the autoscaler's mode to scale out only, then you don't need to enable proactive instance redistribution.
    • If you want to autoscale a regional MIG based on Cloud Monitoring metrics, the following limitations apply:

      • You cannot use per-group metrics.
      • You cannot apply filters to per-instance metrics.

Regional configuration options

Creating a regional MIG is similar to creating a zonal MIG, except that you have additional options:

  • You can select which zones within a region to create instances in.
  • You can choose how to distribute instances across selected zones.

These options are described below.

Zone selection

The default configuration for a regional MIG is to distribute its managed VMs evenly across three zones. For various reasons, you might want to select specific zones for your application. For example, if you require GPUs for your instances, you might select only zones that support GPUs. You might have persistent disks that are only available in certain zones, or you might want to start with instances in just a few zones, rather than in three random zones within a region.

If you want to choose the number of zones or choose the specific zones the group runs in, you must do that when you first create the group. After you choose specific zones during creation, you cannot change or update the zones later.

  • To select more than three zones within a region, you must explicitly specify the individual zones. For example, to select all four zones within a region, you must provide all four zones explicitly in your request. If you do not, Compute Engine selects three zones by default.

  • To select two or fewer zones in a region, you must explicitly specify the individual zones. Even if the region only contains two zones, you must still explicitly specify the zones in your request.

Regardless of whether you choose specific zones or select the region and allow Compute Engine to create instances in three zones within the region, by default, the new instances are distributed evenly across the zones.
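As a sketch of zone selection at creation time, assuming the standard gcloud CLI (the group name, template name, and region below are placeholders):

```shell
# Create a regional MIG and explicitly select its zones (placeholder names).
# Because more than three zones are requested, all of them must be listed
# explicitly; the zones cannot be changed after the group is created.
gcloud compute instance-groups managed create example-rmig \
    --template=example-template \
    --size=40 \
    --zones=us-central1-a,us-central1-b,us-central1-c,us-central1-f
```

The region is inferred from the zones you specify, and the 40 instances are distributed evenly across the four selected zones.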

To learn how to create a regional MIG and select specific zones, see Creating a regional MIG.

Proactive instance redistribution

By default, a regional MIG attempts to maintain an even distribution of instances across zones in the region to maximize the availability of your application in the event of a zone-level failure.

If you delete or abandon instances from your group, causing uneven distribution across zones, the group proactively redistributes instances to reestablish an even distribution.

To reestablish an even distribution across zones, the group deletes instances in zones with more instances, and adds instances to zones with fewer instances. The group automatically picks which instances to delete.

Proactive redistribution reestablishes even distribution across zones.
Example of proactive redistribution

For example, suppose you have a regional MIG with 12 instances spread across 3 zones: a, b, and c. If you delete 3 managed instances in c, the group attempts to rebalance so that the instances are again evenly distributed across the zones. In this case, the group deletes 2 instances (one from a and one from b) and creates 2 instances in zone c, so that each zone has 3 instances and even distribution is achieved. There is no way to selectively determine which instances are deleted. The group temporarily loses capacity while the new instances start up.
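The rebalancing arithmetic from the example above can be sketched as follows. This is illustrative only; the real MIG controller also decides which specific instances to delete:

```python
def rebalance(zone_counts):
    """Return per-zone deltas that restore an even distribution.

    Positive delta = instances to create, negative = instances to delete.
    Illustrative sketch; not the actual rebalancing implementation.
    """
    total = sum(zone_counts.values())
    target = total // len(zone_counts)
    return {zone: target - count for zone, count in zone_counts.items()}

# 12 instances across zones a, b, and c; then 3 instances in c are deleted,
# leaving 4, 4, and 1. Rebalancing deletes one instance each from a and b
# and creates two in c, so every zone ends up with 3 instances.
deltas = rebalance({"a": 4, "b": 4, "c": 1})
```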

To prevent automatic redistribution of your instances, you can turn off proactive instance redistribution.

Turning off proactive instance redistribution is useful when you need to:

  • Delete or abandon instances from the group without affecting other running instances. For example, you can delete a batch worker instance after job completion without affecting other workers.
  • Protect instances with stateful workloads from undesirable automatic deletion due to proactive redistribution.
Disabling proactive redistribution can affect capacity during a zonal failure.
Uneven distribution after disabling proactive redistribution

If you turn off proactive instance redistribution, a MIG does not proactively add or remove instances to achieve balance. However, the group still opportunistically converges toward balance, treating each resize operation as an opportunity to balance the group: when scaling in, it removes instances from bigger zones; when scaling out, it adds instances to smaller zones.
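Assuming the standard gcloud CLI, turning off proactive redistribution on an existing group might look like this (the group and region names are placeholders):

```shell
# Disable proactive instance redistribution on a regional MIG.
# The group then rebalances only opportunistically, during resize operations.
gcloud compute instance-groups managed update example-rmig \
    --region=us-central1 \
    --instance-redistribution-type=NONE
```

To restore the default behavior, set `--instance-redistribution-type=PROACTIVE`.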

Behavior differences from zonal MIGs

The main difference between a zonal MIG and a regional MIG is that a regional MIG can use more than one zone.

Because a regional MIG's managed instances are distributed across zones within a region, the following MIG features behave a bit differently.

Autoscaling a regional MIG

Compute Engine offers autoscaling for MIGs, which allows your groups to automatically add instances (scale out) or remove instances (scale in) based on increases or decreases in load.

If you enable autoscaling for a regional MIG, the feature behaves as follows:

  • An autoscaling policy is applied to the group as a whole. For example, if you configure the autoscaler to target 66% CPU utilization, the autoscaler tracks all instances in the group to maintain an average 66% utilization across all instances in all zones.

  • Autoscaling attempts to evenly distribute VMs across available zones. In general, the autoscaler keeps zones balanced in size by adding VMs to zones with fewer VMs and expecting that load will be redirected from zones with more VMs, for example, through a load balancer. We do not recommend configuring a custom load balancer that prefers one zone because this could cause unexpected behavior such as an uneven distribution of instances across zones or unutilized instances in other zones.

  • If your workload uses instances evenly in 3 zones and a zone experiences a failure, or a group of instances within a zone fails, 1/3 of the capacity might be lost, but 2/3 of the capacity remains in the other zones. We recommend that you overprovision your autoscaled regional MIG to avoid overloading the remaining instances while a zone is lost.

  • If resources (for example, preemptible instances) are temporarily unavailable in a zone, the group continues to try to create those instances in that zone. After the resources become available again, the group acquires the desired number of running instances.

  • If load balancing is enabled and resources are unavailable in a zone, causing higher utilization of existing resources in that zone, new instances might be created in zones with lower utilization rates, which can result in a temporarily uneven distribution.

The autoscaler only adds instances to a zone up to 1/n of the specified maximum for the group, where n is the number of provisioned zones. For example, if you are using the default of 3 zones, and if 15 is the maxNumReplicas configured for autoscaling, the autoscaler can only add up to 1/3 * 15 = 5 instances per zone for the group. If one zone fails, the autoscaler only scales out to 2/3 of the maxNumReplicas in the remaining two zones combined.
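The per-zone cap described above can be expressed as a small calculation. This is a sketch of the arithmetic, not the autoscaler itself:

```python
def max_per_zone(max_num_replicas, num_zones):
    """The autoscaler caps each zone at 1/n of the group maximum."""
    return max_num_replicas // num_zones

def surviving_scale_out_limit(max_num_replicas, num_zones, failed_zones=1):
    """After a zone failure, scale-out is limited to the remaining zones' caps."""
    return max_per_zone(max_num_replicas, num_zones) * (num_zones - failed_zones)

# Default 3 zones with maxNumReplicas=15: at most 5 VMs per zone, and at
# most 10 VMs in total if one zone is lost.
```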

Provisioning your autoscaler configuration

Similar to the advice on overprovisioning a regional MIG, you should overprovision your regional MIG's autoscaler configuration. Assuming your group uses 3 zones, configure autoscaling as follows:

  • The autoscaling utilization target is 2/3 of your desired utilization target.
  • To accommodate the lowered utilization target, the autoscaler adds more instances, so increase maxNumReplicas to 50% more than the number you would set without overprovisioning.

For example, if you expect that 20 instances can handle your peak loads and the target utilization is 80%, set the autoscaler to:

  • 2/3 * 0.8 = 0.53 or 53% for target utilization instead of 80%
  • 3/2 * 20 = 30 for max number of instances instead of 20

This setup helps ensure that in the case of a single-zone failure, your MIG will not run out of capacity because the remaining 2/3 of instances will be able to handle the increased load from the offline zone (since you lowered the target utilization well below its capacity). The autoscaler also adds new instances up to the maximum number of instances you specified to maintain the 2/3 utilization target.
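The overprovisioning rule for a 3-zone group can be sketched as a calculation (the function name is illustrative):

```python
def overprovisioned_autoscaler(desired_target, desired_max):
    """Apply the 3-zone overprovisioning rule from the text: scale the
    utilization target by 2/3 and maxNumReplicas by 3/2."""
    return {
        "target_utilization": desired_target * 2 / 3,
        "max_num_replicas": int(desired_max * 3 / 2),
    }

# 20 instances at 80% target becomes 30 instances at roughly 53% target.
config = overprovisioned_autoscaler(0.8, 20)
```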

However, you shouldn't rely solely on overprovisioning your MIG to handle increased load. As a best practice, Google recommends that you regularly load test your applications to make sure they can handle the increased utilization that might be caused by a zonal outage removing 1/3 of the instances.

For more information about autoscaling, see the Autoscaling overview.

Updating a regional MIG

If you want to roll out a new template to a regional MIG, see Updating a regional MIG.

If you want to add or remove instances in a MIG, the process is similar for regional and zonal MIGs. See Working with managed instances.

If you're interested in configuring stateful disks or stateful metadata in a MIG, see Configuring stateful MIGs.

How to increase availability by overprovisioning

A variety of events might cause one or more instances to become unavailable, and you can help mitigate this issue by using multiple Google Cloud services:

  • Use a regional MIG to distribute your application across multiple zones.
  • Use application-based autohealing to recreate instances with failed applications.
  • Use load balancing to automatically direct user traffic away from unavailable instances.

However, even if you use these services, your users might still experience issues if too many of your instances are simultaneously unavailable.

To be prepared for the extreme case where one zone fails or an entire group of instances stops responding, Google strongly recommends overprovisioning your MIG. Overprovisioning, sized to your application's needs, helps prevent your system from failing entirely if a zone or a group of instances becomes unresponsive.

Google's overprovisioning recommendations prioritize keeping your application available to your users. Following them means provisioning, and paying for, more instances than your application needs on a day-to-day basis. Base your overprovisioning decisions on your application's needs and your cost constraints.

You can set your MIG's size when creating it, and you can add or remove instances after you've created it.

Alternatively, you can configure an autoscaler to overprovision automatically as it adds and removes instances from the group based on load.

Estimating the recommended group size

We recommend that you provision enough instances so that, if all of the instances in any one zone become unavailable, your remaining instances would still meet the minimum number of instances that you require.

Use the following table to determine the minimum recommended size for your group:

Number of zones | Additional VM instances | Recommended total VM instances
2               | +100%                   | 200%
3               | +50%                    | 150%
4               | +33%                    | 133%
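The table follows from a single rule, sketched below: the zones that survive a single-zone failure must still provide your required minimum.

```python
import math

def recommended_total(required_instances, num_zones):
    # Losing one zone leaves (num_zones - 1) zones, which must still provide
    # required_instances, so scale the group by num_zones / (num_zones - 1).
    return math.ceil(required_instances * num_zones / (num_zones - 1))

# 20 required instances: 40 total in 2 zones, 30 in 3 zones, 27 in 4 zones.
```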

Provisioning a regional MIG in three or more zones

When you create a regional MIG in a region with at least three zones, Google recommends overprovisioning your group by at least 50%. By default, a regional MIG creates instances in three zones. Having instances in three zones already helps you preserve at least 2/3 of your serving capacity, and if a single zone fails, the other two zones in the region can continue to serve traffic without interruption. By overprovisioning to 150%, you can ensure that if 1/3 of the capacity is lost, 100% of traffic is supported by the remaining zones.

For example, if you need 20 instances in your MIG across three zones, we recommend, at a minimum, an additional 50% of instances. In this case, 50% of 20 is 10 more instances, for a total of 30 instances in the group. If you create a regional MIG with a size of 30, the group distributes your VMs across the three zones, like so:

Zone           | Number of VM instances
example-zone-1 | 10
example-zone-2 | 10
example-zone-3 | 10

If any single zone fails, you still have 20 instances serving traffic.

Provisioning a regional MIG in two zones

To provision your instances in two zones instead of three, Google recommends doubling the number of instances. For example, if you need 20 instances for your service, distributed across two zones, we recommend that you configure a regional MIG with 40 instances, so that each zone has 20 instances. If a single zone fails, you still have 20 instances serving traffic.

Zone           | Number of VM instances
example-zone-1 | 20
example-zone-2 | 20

If the number of instances in your group is not easily divisible across two zones, Compute Engine evenly divides the group of VMs and randomly puts the remaining instances in one of the zones.
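The division described above can be sketched as follows. For simplicity, this sketch always gives the remainder to the first zone, whereas Compute Engine picks a zone at random:

```python
def split_across_zones(group_size, zones):
    """Even split across zones; any remainder lands in one zone."""
    base, remainder = divmod(group_size, len(zones))
    counts = {zone: base for zone in zones}
    for zone in zones[:remainder]:
        counts[zone] += 1
    return counts

# 41 instances across two zones: one zone gets 21, the other gets 20.
```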

Provisioning a regional MIG in one zone

You can create a regional MIG with just one zone. This is similar to creating a zonal MIG.

Creating a single-zone regional MIG is not recommended because it provides no zone redundancy. If the zone fails, your entire MIG is unavailable, potentially disrupting your users.

What's next