About Flex-start VMs

This document provides an overview of Flex-start VMs, detailing their key characteristics, as well as the requirements and limitations that apply when you create them.

Flex-start VMs are virtual machine (VM) instances that you create by using the flex-start provisioning model. This model uses the Dynamic Workload Scheduler (DWS) to provision discounted compute resources from a secure pool of capacity, improving your chances of obtaining high-demand resources like GPUs. After you create Flex-start VMs, Compute Engine attempts to allocate your requested resources within a specific timeframe. If it succeeds, then your Flex-start VMs start running and keep running for a maximum of seven days.

For workloads that require resources for longer than seven days, or that need a higher assurance of obtaining capacity, you can create a future reservation request in calendar mode and still benefit from DWS discounts.

Flex-start VMs use cases

Flex-start VMs are ideal for running workloads that can start at any time, such as the following:

Small model pre-training
Model fine-tuning
High performance computing (HPC) simulation
Batch inference

Flex-start VMs key characteristics

Compared to other types of Compute Engine instances, Flex-start VMs have the following characteristics:

A wait time for allocating resources: you can create Flex-start VMs before Compute Engine can allocate the requested resources. However, VMs only start if resources become available within your specified timeframe. If resources are not available, then the VM creation request fails.

For more information, see Flex-start VM wait time in this document.
A limited run duration: Flex-start VMs can run for up to seven days. After that time, Compute Engine automatically stops or deletes the VMs based on the termination action that is specified in the VM properties.

For more information, see Flex-start VM limited run duration in this document.
The flex-start provisioning model: you create Flex-start VMs by using the flex-start provisioning model. This provisioning model provides improved resource availability and discounted prices compared to VMs that you create by using the standard provisioning model.

For more information about each provisioning model, see Compute Engine instances provisioning models.

Flex-start VM wait time

When you create a Flex-start VM, the VM doesn't immediately start. Compute Engine attempts to allocate your requested resources and start the VM within a specific timeframe. If you have sufficient quota for your requested resources and Compute Engine allocates them by the end of the wait time, then the Flex-start VM starts within two minutes of capacity becoming available. Otherwise, the VM creation request fails.

The wait time varies based on the method that you use to create VMs:

Standalone Flex-start VMs wait time
MIG resize requests wait time

Standalone Flex-start VMs wait time

To create a standalone Flex-start VM, you must specify a wait time by using the requestValidForDuration field. You can set a wait time of either zero seconds, or between 90 seconds and 7,200 seconds (two hours).
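
The allowed values can be checked before you send a request. The following Python sketch is illustrative only: the helper name validate_wait_seconds is hypothetical and isn't part of any Google Cloud library, and it assumes that both ends of the 90 to 7,200 second range are inclusive.

```python
MIN_WAIT_SECONDS = 90
MAX_WAIT_SECONDS = 7_200  # two hours


def validate_wait_seconds(seconds: int) -> int:
    """Return `seconds` if it is an allowed requestValidForDuration value.

    Allowed values, per the text above: exactly 0, or 90 to 7,200 seconds.
    Treating both bounds as inclusive is an assumption of this sketch.
    """
    if seconds == 0 or MIN_WAIT_SECONDS <= seconds <= MAX_WAIT_SECONDS:
        return seconds
    raise ValueError(
        f"wait time must be 0 or between {MIN_WAIT_SECONDS} and "
        f"{MAX_WAIT_SECONDS} seconds, got {seconds}"
    )
```

For example, validate_wait_seconds(45) raises a ValueError, because 45 seconds falls in the gap between zero and the 90-second minimum.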

Based on your workload's zonal requirements, we recommend the following wait times to help increase the chances that your Flex-start VM creation request succeeds:

Strict zonal requirements: if your workload requires you to create a Flex-start VM in a specific zone, then we recommend that you set the requestValidForDuration field to 90 seconds or higher, up to two hours. Longer wait times help increase your chances of obtaining resources. The VM remains in the PENDING state throughout this time.
No zonal requirements: if the Flex-start VM can run in any zone in the region, then we recommend that you set the requestValidForDuration field to zero seconds. This value specifies that Compute Engine only allocates resources if they are immediately available. If your request fails because resources are unavailable, then try creating the Flex-start VM in a different zone.
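
The "no zonal requirements" recommendation amounts to a simple fallback loop: request with a zero-second wait, and move to the next zone if the request fails. The following Python sketch shows one way to structure that loop. The function name and the try_create callable are hypothetical; try_create stands in for whatever code you use to send the creation request and report whether it succeeded.

```python
from typing import Callable, Iterable, Optional


def create_in_first_available_zone(
    zones: Iterable[str],
    try_create: Callable[[str, int], bool],
) -> Optional[str]:
    """Try each zone in order with a zero-second wait time.

    `try_create(zone, wait_seconds)` is a hypothetical callable supplied by
    the caller. It must return True if the Flex-start VM creation request
    succeeded in that zone, and False if resources were unavailable.
    Returns the first zone that succeeded, or None if every zone failed.
    """
    for zone in zones:
        # A wait time of 0 asks for resources only if they are
        # immediately available, so a failure is reported quickly.
        if try_create(zone, 0):
            return zone
    return None
```

Because each attempt uses a zero-second wait, a failed zone costs little time, which is why this pattern suits workloads that can run anywhere in the region.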

To stop a VM creation request while Compute Engine attempts to allocate resources, delete the Flex-start VM.

MIG resize requests wait time

If you add Flex-start VMs all at once to a managed instance group (MIG) by using resize requests, then the wait time to provision all your requested resources is indefinite. After you create a MIG resize request, the request remains in the ACCEPTED state until resources become available. If and when your requested resources become available, the MIG resize request state changes to SUCCEEDED and Compute Engine creates the Flex-start VMs.

To stop a VM creation request while Compute Engine attempts to allocate resources, cancel the MIG resize request. For more information, see About MIG resize requests.

Flex-start VM limited run duration

When you create a Flex-start VM, you must specify the following:

The VM run duration: you must specify how long the VMs can run. The run duration can be between 10 minutes and seven days. If you no longer need the VMs before their run duration ends, then you can stop or delete standalone VMs, or delete VMs that you created by using a MIG resize request.
The VM termination action: you must choose whether Compute Engine automatically stops or deletes the VMs at the end of their run duration.
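
The two settings can be validated together before a request is built. The following Python sketch is a minimal illustration under stated assumptions. The field names maxRunDuration and instanceTerminationAction come from the Limitations section of this document. The helper name build_scheduling, the provisioningModel key, the FLEX_START, STOP, and DELETE values, and the nesting of the duration under a seconds key are assumptions made for this sketch; verify them against the Compute Engine API reference before use.

```python
from datetime import timedelta

MIN_RUN = timedelta(minutes=10)
MAX_RUN = timedelta(days=7)
# Assumed enum values for the termination action; check the API reference.
TERMINATION_ACTIONS = {"STOP", "DELETE"}


def build_scheduling(run_duration: timedelta, termination_action: str) -> dict:
    """Build a scheduling dictionary for a Flex-start VM (hypothetical helper).

    Enforces the 10-minute to seven-day run duration described above and
    requires an explicit termination action.
    """
    if not MIN_RUN <= run_duration <= MAX_RUN:
        raise ValueError("run duration must be between 10 minutes and 7 days")
    if termination_action not in TERMINATION_ACTIONS:
        raise ValueError("termination action must be STOP or DELETE")
    return {
        # Key and value assumed for illustration.
        "provisioningModel": "FLEX_START",
        # Nesting under "seconds" is an assumption of this sketch.
        "maxRunDuration": {"seconds": int(run_duration.total_seconds())},
        "instanceTerminationAction": termination_action,
    }
```

For example, build_scheduling(timedelta(days=7), "DELETE") produces a run duration of 604,800 seconds, the maximum that this document allows.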

Important: After Compute Engine stops a VM and changes its state to TERMINATED, you keep incurring charges for any resources that are attached to the VM, such as disks or external IP addresses. To avoid unnecessary costs, detach and delete any resources that you no longer need, or delete the VM altogether. For more information, see the pricing for a VM's uptime.

Quota

To create or restart a Flex-start VM, you must have sufficient preemptible quota for the requested vCPUs, memory, and any attached GPUs or Local SSD disks.

If you attempt to create or restart a Flex-start VM without sufficient quota, then one of the following occurs:

VM creation requests: your request remains pending until you acquire sufficient quota. If you don't acquire the required quota before the wait time ends, then your request fails.
VM restart requests: your request fails immediately.
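
The difference between the two cases can be summarized as a small decision function. The following Python sketch only models the behavior described above; it doesn't call any Google Cloud API, and the function name and return strings are hypothetical. It also assumes that a creation request proceeds to resource allocation once quota is acquired in time, which this document implies but doesn't state.

```python
def outcome_without_quota(
    request_kind: str,
    quota_acquired_before_wait_ends: bool = False,
) -> str:
    """Describe what happens to a request made without sufficient quota.

    `request_kind` is "create" or "restart". The second argument only
    matters for creation requests, which stay pending during the wait time.
    """
    if request_kind == "restart":
        # Restart requests don't wait for quota.
        return "fails immediately"
    if request_kind == "create":
        if quota_acquired_before_wait_ends:
            # Assumption: the request then continues as a normal request.
            return "proceeds to resource allocation"
        return "fails when the wait time ends"
    raise ValueError("request_kind must be 'create' or 'restart'")
```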

Pricing

For Flex-start VMs, you incur charges as follows:

You pay as you go (PAYG). For more information about a VM's pricing during its lifecycle, see Pricing.
For A4, A3, A2, and H4D machine types, you obtain vCPUs, memory, and any attached GPUs at a discounted price. Other supported accelerator-optimized machine types aren't eligible for discounts. For more information, see DWS pricing.

Limitations

Flex-start VMs have the following limitations:

Flex-start VMs can only use the following machine types:
- Any accelerator-optimized machine type, except A4X and G4
- H4D machine types
You must create Flex-start VMs by using the flex-start provisioning model.
You must specify whether to stop or delete Flex-start VMs at the end of their run duration by using the instanceTerminationAction and maxRunDuration fields.
You must configure Flex-start VMs to stop during host maintenance events.
You can't apply placement policies to Flex-start VMs.
You can't use reservations with Flex-start VMs.

What's next

To learn how to create a standalone Flex-start VM, see Create a Flex-start VM.
To learn more about creating multiple Flex-start VMs at once in a MIG, see About MIG resize requests.
