Regional MIGs' target distribution shape

By default, a regional managed instance group (MIG) distributes its managed instances evenly across selected zones. But if you need hardware that is not available in all zones or that might be temporarily unavailable in selected zones, or if you need to prioritize the use of zonal reservations, you might prefer a different distribution.

To configure how a regional MIG distributes your managed instances across selected zones within a region, set the MIG's target distribution shape. The following options are available:

  • EVEN (default): the group schedules VM instance creation and deletion to achieve and maintain an even number of managed instances across the selected zones. The distribution is even when the number of managed instances does not differ by more than 1 between any two zones. Recommended for highly available serving workloads.

  • BALANCED: the group prioritizes acquisition of resources, scheduling VMs in zones where resources are available while distributing VMs as evenly as possible across selected zones to minimize the impact of zonal failure. Recommended for highly available serving or batch workloads that do not require autoscaling.

  • ANY: the group picks zones for creating VM instances to fulfill the requested number of VMs within present resource constraints and to maximize utilization of unused zonal reservations. Recommended for batch workloads that do not require high availability.

Choose an option based on your workload requirements and the MIG capabilities that you need. For details, see the comparison of shapes, the use cases, and how each distribution shape works.
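
As a rough aid for the choice described above, the following Python sketch encodes only the "Recommended for" statements from the list of options. The function name and its parameters are hypothetical, and the sketch does not account for hardware availability or reservations, so treat its result as a starting point rather than a rule.

```python
def candidate_shapes(workload: str, high_availability: bool, autoscaling: bool) -> list[str]:
    """Return the shapes whose stated recommendation matches the workload.

    workload is "serving" or "batch" (hypothetical labels for this sketch).
    """
    if workload not in ("serving", "batch"):
        raise ValueError("workload must be 'serving' or 'batch'")

    candidates = []
    # EVEN: recommended for highly available serving workloads.
    if workload == "serving" and high_availability:
        candidates.append("EVEN")
    # BALANCED: recommended for highly available serving or batch
    # workloads that do not require autoscaling.
    if high_availability and not autoscaling:
        candidates.append("BALANCED")
    # ANY: recommended for batch workloads that do not require high availability.
    if workload == "batch" and not high_availability:
        candidates.append("ANY")
    return candidates


print(candidate_shapes("serving", True, True))   # ['EVEN']
print(candidate_shapes("batch", True, False))    # ['BALANCED']
print(candidate_shapes("batch", False, False))   # ['ANY']
```

When more than one shape is returned, the comparison that follows helps to decide between them.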

Comparison of shapes

For each possible target shape, the following table describes its intended workloads, purpose, distribution of managed instances, feature support, and a brief description of MIG behavior when faced with unavailable resources.

Intended workloads:

  • EVEN (default): Highly available serving workloads (stateless or stateful).
  • BALANCED: Highly available serving workloads (stateless or stateful) and highly available batch workloads.
  • ANY: Batch workloads.

Purpose:

  • EVEN (default): Minimize the impact of zone-level failure, assuming sufficient availability of resources in each zone.
  • BALANCED: Minimize the impact of zone-level failure as much as possible considering availability of resources in each zone.
  • ANY: Prioritize resource acquisition and utilization of unused reservations.

Target distribution of managed instances across zones:

  • EVEN (default): Even. The number of managed instances does not differ by more than 1 between any two zones, regardless of resource availability.* Some managed instances might not be up and running in case of zonal capacity constraints.
  • BALANCED: As even as possible. No guarantees on discrepancies in the number of VMs across zones, which depends on current resource availability. When resources are available, the distribution is similar to EVEN. In the worst case of resource constraints, the distribution can take any shape.
  • ANY: Any. Each zone can have a different number of managed instances (including all or none).

MIG feature support

Autoscaling:

Canary updates:

Proactive instance redistribution:

  • ANY: Not applicable.

Reservations:

  • EVEN (default): Maximally utilized within each zone independently. Reservations do not impact how instances are distributed.
  • BALANCED: Maximally utilized within each zone independently. If reservations are present, they might help the group arrive at a balanced distribution.
  • ANY: Maximally utilized within the region. The group prioritizes using up reservations in the region.

Instance template hardware requirements (machine type, CPU, GPU, existing disks):

  • EVEN (default): Selected hardware must be available in all selected zones.
  • BALANCED: Selected hardware must be available in at least one selected zone.
  • ANY: Selected hardware must be available in at least one selected zone.

Handling failures

Temporary unavailability of resources in a zone:

  • EVEN (default): Exposed. Creates new managed instances in zones with fewer managed instances. Keeps retrying to create VM instances in a zone where resources are unavailable until it succeeds. Risk: Cannot create VMs in a zone with limited resources.
  • BALANCED: Resilient. Creates new managed instances in zones where resources are available, while distributing instances as evenly as possible across zones. Risk: VMs might not be distributed evenly across zones.
  • ANY: Resilient. Creates new managed instances in zones where resources are available, in a way that maximizes utilization of unused reservations. Risk: VMs might not be distributed evenly across zones.

Zone-level failure:

  • EVEN (default): Resilient. Impact is minimized because instances in healthy zones keep serving. Impact is further minimized if you provision extra instances, sufficient to tolerate losing one zone.
  • BALANCED: Resilient. Impact is minimized because instances in healthy zones keep serving. Impact is further minimized if you provision extra instances, sufficient to tolerate losing one zone.
  • ANY: Exposed. Outage might happen if the majority or all instances are concentrated in a failed zone.

*If you configure load balancing as well as autoscaling, and if a zone fails, you might see more VMs in zones where load grows. If you disable proactive instance redistribution and add or remove instances from zones, you might see an uneven distribution.

Use cases

Review the feature support, then choose a distribution shape based on your use case.

Prioritize workload resilience with an even distribution

If you run a highly available serving application that must survive zone-level failure without degraded performance, use the EVEN target distribution shape with an over-provisioned group size. Overprovisioning the number of instances in a group protects your workload from zone-level failure.

Depending on your workload, consider creating an autoscaler to automatically add or remove instances to your group when load increases or decreases.

To learn more about the EVEN target distribution shape, see the comparison of target shapes and read How the EVEN target shape works.

Balance resource acquisition with an even distribution

If you run a highly available serving or batch workload and need to balance acquisition of resources against an even distribution of VM instances across selected zones in a region, use the BALANCED target distribution shape.

The BALANCED shape prioritizes acquisition of resources—the group creates instances in zones where resources are available—while distributing instances as evenly as possible across zones to minimize the impact of zone-level failure.

If you run a batch workload that doesn't need to be protected against zone-level failure, use the ANY target shape instead. The ANY shape prioritizes acquisition of resources as well as use of zonal reservations.

With the shape set to BALANCED or to ANY, you don't need to manually verify whether specific hardware is available in a particular zone. You can select all zones in a region and the group automatically deploys instances in zones where your required hardware is available.

To learn more about the BALANCED target distribution shape, see the comparison of target shapes and read How the BALANCED target distribution shape works.

Prioritize resource acquisition

If you run batch workloads and getting the requested number of instances matters more to you than workload resilience to zone-level failures, use the ANY target distribution shape.

If you have matching reservations, set your target shape to ANY to prioritize the use of zones that contain the matching reservations. To learn how to configure reservations in an instance template, see How reservations work.

Similar to the BALANCED target shape, the ANY shape is useful when your batch workload requires any of the following features:

  • VMs with special hardware, such as a specific CPU platform or GPU model. The group will deploy instances to the zones that support the requested hardware, according to resource availability, and with a preference for zones that have matching reservations.
  • Preemptible VMs. You won't need to explore which zones have preemptible capacity available. The group will deploy to zones with preemptible capacity automatically.
  • VMs with a large number of cores. The group will get large machines where they are available, with a preference for zones that have matching reservations.

You don't need to manually verify whether specific hardware is available in a particular zone. You can select all zones in a region and the group automatically deploys instances in zones where your required hardware is available.

You can selectively delete batch job worker instances that have completed their calculations without affecting other workers. Unlike a group with an EVEN target shape and proactive redistribution, a group with an ANY target shape doesn't have to achieve an even balance and won't trigger redistribution.

To learn more about the ANY target distribution shape, see the comparison of target shapes and read How the ANY target distribution shape works.

How it works

This section describes how each target distribution shape works in the following situations:

  • When you resize the MIG
  • In case resources are temporarily unavailable in a zone
  • In case of zonal failure

The EVEN distribution shape

With a target distribution shape set to EVEN and proactive redistribution enabled, the number of managed instances in a regional MIG does not differ by more than 1 between any two zones, regardless of resource availability. But a managed instance might not be up and running if its zone lacks the resources to provision an actual VM.

Resizing a MIG that has an EVEN distribution

A group with an EVEN target shape picks zones for adding or deleting instances in a way that preserves or converges to an even balance of managed instances across zones.

For example, the following diagram shows how a group adds and removes managed instances.

Figure: Resizing a MIG that has an EVEN distribution. The EVEN target shape adds and removes instances evenly across zones.
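
The resizing behavior described above can be approximated in a few lines of Python. This is a minimal sketch, not the group's actual scheduling algorithm: it assumes that the group always adds to the zone with the fewest managed instances and removes from the zone with the most, and it breaks ties by zone name. The zone names and function names are hypothetical.

```python
def is_even(counts: dict[str, int]) -> bool:
    """True when no two zones differ by more than 1 managed instance."""
    return max(counts.values()) - min(counts.values()) <= 1


def resize_even(counts: dict[str, int], target_size: int) -> dict[str, int]:
    """Grow or shrink the per-zone counts toward target_size, one instance at a time."""
    if target_size < 0:
        raise ValueError("target_size must not be negative")
    result = dict(counts)
    # Scale up: add to the zone that currently has the fewest instances.
    while sum(result.values()) < target_size:
        zone = min(result, key=lambda z: (result[z], z))
        result[zone] += 1
    # Scale down: remove from the zone that currently has the most instances.
    while sum(result.values()) > target_size:
        zone = max(result, key=lambda z: (result[z], z))
        result[zone] -= 1
    return result


zones = {"zone-a": 2, "zone-b": 2, "zone-c": 1}
print(resize_even(zones, 8))  # {'zone-a': 3, 'zone-b': 3, 'zone-c': 2}
print(resize_even(zones, 4))  # {'zone-a': 2, 'zone-b': 1, 'zone-c': 1}
```

In both directions the result still satisfies `is_even`, which matches the definition of an even distribution given earlier.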

Impact of temporarily unavailable resources

Resources might be temporarily unavailable in a zone when you create the group or increase the number of instances. For example, if you request preemptible instances or specialized hardware in a limited supply, those resources might not be available at the time of your request.

To maintain an even distribution of instances across zones, the group continues attempting to create VM instances in zones where the resources are temporarily unavailable. After the resources become available, the group acquires the full number of running VM instances.

For example, the following diagram shows what happens if one of the zones cannot fulfill your request due to a temporary unavailability of resources.

Figure: Impact of temporarily unavailable resources on a MIG that has an EVEN distribution. With an EVEN target shape, if VMs are not available, autohealing continuously attempts to create them until they are available.

Impact of zone-level failure

If you use the EVEN (or BALANCED) target distribution shape, you can provision extra instances to minimize the impact of a zone-level failure.

In case of a zone-level failure, a regional MIG that is deployed to 3 zones with an EVEN (or BALANCED) target distribution shape might lose 1/3 of its instances. To keep enough capacity to serve your load during such a failure, provision enough VMs that the remaining 2/3 of the group can handle the full load, which means 1.5 times the number of instances that the load requires.

For example, if you require 8 instances to process requests across 3 zones and you want to protect your workload against zone-level failure, you should create a regional group with 12 instances. The following diagram shows what happens if one zone fails.

Figure: Impact of zonal failure on a MIG that has an EVEN distribution. With an EVEN target shape, overprovisioning the MIG maintains a sufficient number of VMs in case of zonal failure.
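
The sizing arithmetic in this example generalizes to other group sizes and zone counts. The following Python sketch assumes an even distribution, in which the largest zone holds the group size divided by the number of zones, rounded up, and it searches for the smallest group size that still leaves enough instances after that zone is lost. The function name is hypothetical.

```python
import math


def overprovisioned_size(required_instances: int, zones: int) -> int:
    """Smallest group size that keeps required_instances after losing one zone."""
    if zones < 2:
        raise ValueError("at least 2 zones are needed to survive a zone failure")
    if required_instances < 0:
        raise ValueError("required_instances must not be negative")
    size = required_instances
    # With an even distribution, the largest zone holds ceil(size / zones)
    # instances; that is the worst-case loss from a single zone failure.
    while size - math.ceil(size / zones) < required_instances:
        size += 1
    return size


print(overprovisioned_size(8, 3))  # 12, as in the example above
print(overprovisioned_size(7, 3))  # 11
```

For 8 required instances across 3 zones, the result is 12: each zone holds 4 instances, and 8 remain if one zone fails.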

The EVEN target distribution shape works well with autoscaling and load balancing under such circumstances. In case of a zone-level failure, the load balancer starts sending traffic to instances in the two remaining zones to accommodate traffic from the failed zone.

For more information about how a regional MIG works with an autoscaler, see Autoscaling a regional managed instance group.

The BALANCED distribution shape

A regional MIG with a BALANCED target shape might not achieve an even distribution across zones, specifically when the requested resources are not available in a zone.

The MIG prioritizes provisioning the requested number of VMs by creating VMs in zones where resources are available. When resources are available, the distribution is similar to EVEN. In the worst case of resource constraints, the distribution can take any shape.

Resizing a MIG that has a BALANCED distribution

Increasing group size

With a BALANCED target shape, the group chooses zones for creating new instances based on the current availability of the resources that you specified in the MIG's instance template.

  • When resources are sufficiently available in all selected zones, the group maintains an even distribution across zones on size increases, the same way as the EVEN target shape.
  • When zonal capacity constraints make it impossible to achieve an even distribution, the group creates instances in the zones where resources are available, while still trying to maximize balance.

For example, you might observe capacity constraints and an uneven distribution if you request a specialized CPU platform, GPU model, or preemptible VMs that are not uniformly available in all zones.

Figure: Resizing a MIG that has a BALANCED distribution. The BALANCED target shape adds and removes instances as evenly as possible across zones based on current capacity.
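
To illustrate how a size increase might play out under capacity constraints, the following Python sketch places each new instance in the least-populated zone that still has capacity. It is a simplified model under stated assumptions, not the group's actual algorithm: the per-zone `capacity` input, the zone names, and the function name are hypothetical, and instances that cannot be placed anywhere are only counted, whereas the real group still schedules them in a zone and keeps retrying.

```python
def scale_up_balanced(
    counts: dict[str, int], capacity: dict[str, int], new_instances: int
) -> tuple[dict[str, int], int]:
    """Return the new per-zone counts and the number of instances left unplaced."""
    result = dict(counts)
    remaining = dict(capacity)  # instances that each zone can still provide
    unplaced = 0
    for _ in range(new_instances):
        available = [z for z in result if remaining.get(z, 0) > 0]
        if not available:
            unplaced += 1
            continue
        # Prefer the zone with the fewest instances to stay as even as possible.
        zone = min(available, key=lambda z: (result[z], z))
        result[zone] += 1
        remaining[zone] -= 1
    return result, unplaced


empty = {"zone-a": 0, "zone-b": 0, "zone-c": 0}

# Plenty of capacity everywhere: same outcome as the EVEN shape.
print(scale_up_balanced(empty, {"zone-a": 10, "zone-b": 10, "zone-c": 10}, 9))
# ({'zone-a': 3, 'zone-b': 3, 'zone-c': 3}, 0)

# zone-c can provide only one VM: the group fills the other zones instead.
print(scale_up_balanced(empty, {"zone-a": 10, "zone-b": 10, "zone-c": 1}, 9))
# ({'zone-a': 4, 'zone-b': 4, 'zone-c': 1}, 0)
```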

Decreasing group size

When decreasing its size, a regional MIG with a BALANCED target shape removes instances in the following sequence in order to limit disruption to your workload:

  1. Instances that are not running; that is, instances that for any reason could not be created or are being created or autohealed.
  2. Instances in zones where the group has more VMs, to converge to an evenly distributed state.
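
The removal sequence above can be sketched as follows. This Python example is an illustration under assumptions: instances are modeled as `(name, zone, running)` tuples, the names are hypothetical, and the choice of which instance to remove within the fullest zone is arbitrary here (the first one found).

```python
from collections import Counter

Instance = tuple[str, str, bool]  # (name, zone, running)


def pick_for_removal_balanced(instances: list[Instance], count: int) -> list[str]:
    """Return the names of the instances to remove, in removal order."""
    remaining = list(instances)
    removed: list[Instance] = []

    # Step 1: instances that are not running go first.
    for inst in instances:
        if len(removed) == count:
            break
        if not inst[2]:
            removed.append(inst)
            remaining.remove(inst)

    # Step 2: take from the zone with the most VMs to converge to an even state.
    while len(removed) < count and remaining:
        per_zone = Counter(zone for _, zone, _ in remaining)
        fullest = max(per_zone, key=lambda z: (per_zone[z], z))
        victim = next(inst for inst in remaining if inst[1] == fullest)
        removed.append(victim)
        remaining.remove(victim)

    return [name for name, _, _ in removed]


group = [
    ("a1", "zone-a", True),
    ("a2", "zone-a", True),
    ("a3", "zone-a", True),
    ("b1", "zone-b", True),
    ("c1", "zone-c", False),  # could not be created
]
print(pick_for_removal_balanced(group, 2))  # ['c1', 'a1']
```

The non-running instance is removed first, and the second removal comes from zone-a because it holds the most VMs.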

Impact of temporarily unavailable resources or zonal failure

With a BALANCED target distribution shape, the group deploys instances to zones where capacity is available. During temporary zonal capacity constraints, this can lead to an uneven distribution of instances across zones.

If, in such a situation, the zone with the largest number of VM instances fails, your workload might lose a significant share of its serving capacity. If the healthy zones have temporary capacity constraints, the group tries to recreate the failed instances in their original location (the failed zone), and this attempt might fail.

To protect your workload against such an extreme case:

  • Overprovision the size of your regional MIG, so that your workload has sufficient serving capacity in the case of a zonal failure.
  • Reserve a sufficient amount of resources in each zone to cover peak load, to overprovision, and to maintain an even distribution across zones. This tactic helps ensure that you can get an even distribution of instances across zones, which minimizes capacity loss in case of a zonal failure.

The following diagram shows how a scenario with temporary zonal capacity constraints, followed by a zonal failure, might evolve.

Figure: Impact of temporarily unavailable resources, followed by a zonal failure, on a MIG that has a BALANCED distribution. With a BALANCED target shape, if VMs are not available, the distribution can be uneven. In case of a subsequent zonal failure, autohealing continuously attempts to create failed VMs until they are available.

If your request cannot be fulfilled in any zone in the region, the group schedules instance creation in zones with temporarily unavailable resources. The group continues attempting to create the scheduled instances within the zones where their creation was originally scheduled. If the resources become available in other zones sooner than in the original zone where a VM was scheduled, the group won't try creation in those other zones. You can schedule new instances in zones with available capacity manually by deleting the managed instances that failed to create and resizing the group up to its target size.

If instance creation is unsuccessful, you can list managed instances to review the error message in the corresponding managed instance or list recent errors.

The ANY distribution shape

With a target distribution shape set to ANY, a regional MIG prioritizes resource acquisition by creating managed instances in zones where resources are available. This means that all of the instances might be created in one zone, or evenly distributed across all zones, or anything between those two scenarios.

Resizing a MIG that has an ANY distribution

When you increase group size, the group picks any zone where capacity is available. If you have matching reservations in one or more zones, the group prioritizes utilization of those reservations.
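
One way to picture this preference is a two-pass placement: first use up unused matching reservations, then fall back to any zone with free capacity. The following Python sketch is only a model under assumptions. The inputs `reserved_free` and `on_demand_free` (free instance slots per zone), the zone names, the function name, and the alphabetical zone order are all hypothetical, and the real group can pick zones differently.

```python
def scale_up_any(
    reserved_free: dict[str, int], on_demand_free: dict[str, int], new_instances: int
) -> tuple[dict[str, int], int]:
    """Return the number of new instances per zone and the number left unplaced."""
    placement: dict[str, int] = {}
    remaining = new_instances
    # Pass 1 uses unused reservations; pass 2 uses any other available capacity.
    for pool in (reserved_free, on_demand_free):
        for zone in sorted(pool):
            take = min(pool[zone], remaining)
            if take > 0:
                placement[zone] = placement.get(zone, 0) + take
                remaining -= take
    return placement, remaining


placement, unplaced = scale_up_any(
    reserved_free={"zone-b": 3},
    on_demand_free={"zone-a": 5, "zone-b": 0, "zone-c": 5},
    new_instances=6,
)
print(placement, unplaced)  # {'zone-b': 3, 'zone-a': 3} 0
```

The three reserved slots in zone-b are used first, and the rest of the request lands in a single zone, which is acceptable for the ANY shape.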

When you decrease group size, the group deletes the VM instances in the following order:

  1. VMs that are not running for any reason
  2. VMs that are not yet updated to the intended version
  3. VMs chosen nondeterministically
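
The deletion order above amounts to a three-tier priority. The following Python sketch models it with hypothetical instance records (dictionaries with `name`, `running`, and `updated` keys) and uses a random shuffle to stand in for the nondeterministic choice in the last tier.

```python
import random


def pick_for_removal_any(instances: list[dict], count: int, rng=None) -> list[str]:
    """Return the names of the instances to delete, in deletion order."""
    rng = rng or random.Random()
    # Tier 1: VMs that are not running for any reason.
    not_running = [i for i in instances if not i["running"]]
    # Tier 2: running VMs that are not yet updated to the intended version.
    outdated = [i for i in instances if i["running"] and not i["updated"]]
    # Tier 3: everything else, in no particular order.
    rest = [i for i in instances if i["running"] and i["updated"]]
    rng.shuffle(rest)
    ordered = not_running + outdated + rest
    return [i["name"] for i in ordered[:count]]


workers = [
    {"name": "w1", "running": True, "updated": True},
    {"name": "w2", "running": False, "updated": True},
    {"name": "w3", "running": True, "updated": False},
    {"name": "w4", "running": True, "updated": True},
]
print(pick_for_removal_any(workers, 2))  # ['w2', 'w3']
```

With a count of 3, the third name would be either `w1` or `w4`, because the last tier is chosen nondeterministically.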

If you need to decrease group size in specific zones or remove specific VM instances, for example workers that finished their job, you can delete individual instances from the group.

Impact of temporarily unavailable resources

With a target distribution shape set to ANY, the group schedules VM instance creation in zones where the requested resources are available and avoids zones with temporarily unavailable resources.

If your request cannot be fulfilled in any zone in the region, the group schedules instance creation in zones with temporarily unavailable resources. The group will keep trying to create the scheduled instances within the zones where their creation was originally scheduled. If the resources become available in other zones sooner than in the original zone where a VM was scheduled, the group won't try creation in those other zones. You can manually schedule new instances in zones with available capacity by deleting the non-running managed instances and resizing the group up to its target size.

If instance creation is unsuccessful, you can list managed instances to review the error message in the corresponding managed instance or list recent errors.

For example, the following diagram shows how a regional group schedules instances when a zone cannot fulfill your request.

Figure: Impact of temporarily unavailable resources on a MIG that has an ANY distribution. With a target distribution shape set to ANY, the group creates VMs in zones where the requested resources are available and avoids zones with temporarily unavailable resources.

Impact of zone-level failure

With its target distribution shape set to ANY, the group might deploy the majority or all of its instances in a single zone. In the event of failure in that zone, most or all of the group's instances could become unavailable for the duration of the failure.

If a zone fails, if resources become temporarily unavailable, or if your VM instances are not running for any other reason, you can delete the individual non-running instances and then resize the group back to its required size to try to get replacement instances in zones with available capacity.

Figure: Deleting and recreating instances in a MIG that has an ANY distribution, in case of temporarily unavailable resources. With a target distribution shape set to ANY, the group creates VMs in zones where the requested resources are available. If resources are not available for any reason, you can decrease the size of the group, then increase the size of the group to try to get the VMs in a different zone.

What's next