This document explains how to create a Flex-start virtual machine (VM) instance. Flex-start VMs run for up to seven days and help you acquire high-demand resources like GPUs at a discounted price. These features make Flex-start VMs a cost-effective solution for running short-duration workloads, such as model fine-tuning and batch inference.
To learn more about the key characteristics of Flex-start VMs, including the requirements and limitations that apply when you create them, see About Flex-start VMs.
Before you begin
- Based on the machine type that you want to use, review one of the following configuration requirements:
  - For an accelerator-optimized machine type (except A4X or G4), see Overview of creating an instance with attached GPUs.
  - For an H4D machine type, see Create an instance that uses Cloud RDMA.
- If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
gcloud
- Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- Set a default region and zone.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Required roles
To get the permissions that you need to create Flex-start VMs, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to create Flex-start VMs. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to create Flex-start VMs:
- compute.instances.create on the project
- To use a custom image to create the VM: compute.images.useReadOnly on the image
- To use a snapshot to create the VM: compute.snapshots.useReadOnly on the snapshot
- To use an instance template to create the VM: compute.instanceTemplates.useReadOnly on the instance template
- To specify a subnet for your VM: compute.subnetworks.use on the project or on the chosen subnet
- To specify a static IP address for the VM: compute.addresses.use on the project
- To assign an external IP address to the VM when using a VPC network: compute.subnetworks.useExternalIp on the project or on the chosen subnet
- To assign a legacy network to the VM: compute.networks.use on the project
- To assign an external IP address to the VM when using a legacy network: compute.networks.useExternalIp on the project
- To set VM instance metadata for the VM: compute.instances.setMetadata on the project
- To set tags for the VM: compute.instances.setTags on the VM
- To set labels for the VM: compute.instances.setLabels on the VM
- To set a service account for the VM to use: compute.instances.setServiceAccount on the VM
- To create a new disk for the VM: compute.disks.create on the project
- To attach an existing disk in read-only or read-write mode: compute.disks.use on the disk
- To attach an existing disk in read-only mode: compute.disks.useReadOnly on the disk
You might also be able to get these permissions with custom roles or other predefined roles.
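As a hedged illustration of the role grant described in this section, an administrator could use a command along the following lines. The project ID my-project and the principal user:alex@example.com are hypothetical placeholders, and the Bash sketch only prints the command for review instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration; substitute your own project and user.
PROJECT_ID="my-project"
MEMBER="user:alex@example.com"

# Build the role-granting command as an array so that each argument stays intact.
grant_cmd=(gcloud projects add-iam-policy-binding "$PROJECT_ID"
  --member="$MEMBER"
  --role="roles/compute.instanceAdmin.v1")

# Print the command for review. To run it, use "${grant_cmd[@]}" instead.
echo "${grant_cmd[*]}"
```

Printing first makes it easy to confirm the project and principal before any IAM policy changes.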
Create a Flex-start VM
To create a Flex-start VM, select one of the following options:
gcloud
To create a Flex-start VM, use the gcloud compute instances create command with the following flags:
- The --request-valid-for-duration flag
- The --provisioning-model=FLEX_START flag
- The --instance-termination-action flag
- The --max-run-duration flag
- The --maintenance-policy=TERMINATE flag
- The --reservation-affinity=none flag
To create a Flex-start VM, run the following command:
gcloud compute instances create VM_NAME \
    --machine-type=MACHINE_TYPE \
    --zone=ZONE \
    --request-valid-for-duration=VALID_FOR_DURATION \
    --provisioning-model=FLEX_START \
    --instance-termination-action=TERMINATION_ACTION \
    --max-run-duration=RUN_DURATION \
    --maintenance-policy=TERMINATE \
    --reservation-affinity=none
Replace the following:
- VM_NAME: the name of your new VM.
- MACHINE_TYPE: the machine type to use for the Flex-start VM. If you specify a G2 or N1 machine type, then consider the following:
  - For G2 machine types, you can optionally specify NVIDIA RTX Virtual Workstations (vWS) to use for graphics-intensive workloads. To do so, include the --accelerator flag in the command as follows:
    --accelerator=count=VWS_ACCELERATOR_COUNT,type=nvidia-l4-vws
    Replace VWS_ACCELERATOR_COUNT with the number of NVIDIA RTX vWS that your workload requires.
  - For N1 machine types, you must specify the number and type of GPUs to attach to your VM. Otherwise, creating the VM fails. To attach GPUs to an N1 VM, include the --accelerator flag in the command as follows:
    --accelerator=count=NUMBER_OF_ACCELERATORS,type=ACCELERATOR_TYPE
    Replace the following:
    - NUMBER_OF_ACCELERATORS: the number of GPUs to attach to your N1 VM.
    - ACCELERATOR_TYPE: a supported GPU model for N1 VMs.
- ZONE: the zone where you want to create the VM. To verify that your specified machine type is available in the zone where you want to create the VM, see Available regions and zones.
- VALID_FOR_DURATION: the maximum time to wait for provisioning your requested resources, formatted as a number followed by a unit (s, m, h, or d). For example, a value of 30m defines a time of 30 minutes, and a value of 1h2m3s defines a time of one hour, two minutes, and three seconds. Based on the zonal requirements for your workload, we recommend that you specify one of the following durations to help increase the chances that your VM creation request succeeds:
  - If your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds (90s) and two hours (2h). Longer durations give you higher chances of obtaining resources.
  - If the VM can run in any zone within the region, then specify a duration of zero seconds (0s). This value specifies that Compute Engine only allocates resources if they are immediately available. If the creation request fails because resources are unavailable, then retry the request in a different zone.
- TERMINATION_ACTION: whether to stop or delete the VM at the end of its run duration. Specify one of the following values:
  - To stop the VM: STOP
  - To delete the VM: DELETE
- RUN_DURATION: the maximum time that the VM runs before Compute Engine stops or deletes it, formatted as a number followed by a unit (s, m, h, or d). The value must be between 10 minutes and seven days.
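For illustration, the following Bash sketch fills in the command with sample values. The VM name flex-start-demo, the g2-standard-4 machine type, the us-central1-a zone, and the chosen durations are assumptions for this example only; check that the machine type is available in your zone before using them. The sketch prints the command for review instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical example values; replace them with your own.
VM_NAME="flex-start-demo"
MACHINE_TYPE="g2-standard-4"
ZONE="us-central1-a"

# Wait up to two hours for capacity, then run for at most three days
# and delete the VM when the run duration ends.
create_cmd=(gcloud compute instances create "$VM_NAME"
  --machine-type="$MACHINE_TYPE"
  --zone="$ZONE"
  --request-valid-for-duration=2h
  --provisioning-model=FLEX_START
  --instance-termination-action=DELETE
  --max-run-duration=3d
  --maintenance-policy=TERMINATE
  --reservation-affinity=none)

# Print the command for review. To run it, use "${create_cmd[@]}" instead.
echo "${create_cmd[*]}"
```

Choosing DELETE here suits one-off batch jobs whose disks are not needed afterward; use STOP instead if you want to keep the VM and its disks after the run duration ends.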
REST
To create a Flex-start VM, make a POST request to the instances.insert method.
In the request body, include the following fields:
- The params.requestValidForDuration field.
- The scheduling.provisioningModel field set to FLEX_START.
- The scheduling.instanceTerminationAction field.
- The scheduling.maxRunDuration field.
- The scheduling.onHostMaintenance field set to TERMINATE.
- The reservationAffinity.consumeReservationType field set to NO_RESERVATION.
To create a Flex-start VM, make a POST request as follows:
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "name": "VM_NAME",
  "machineType": "zones/ZONE/machineTypes/MACHINE_TYPE",
  "disks": [
    {
      "initializeParams": {
        "sourceImage": "projects/IMAGE_PROJECT/global/images/IMAGE"
      },
      "boot": true
    }
  ],
  "networkInterfaces": [
    {
      "network": "global/networks/default"
    }
  ],
  "params": {
    "requestValidForDuration": {
      "seconds": VALID_FOR_DURATION
    }
  },
  "scheduling": {
    "provisioningModel": "FLEX_START",
    "instanceTerminationAction": "TERMINATION_ACTION",
    "maxRunDuration": {
      "seconds": RUN_DURATION
    },
    "onHostMaintenance": "TERMINATE"
  },
  "reservationAffinity": {
    "consumeReservationType": "NO_RESERVATION"
  }
}
Replace the following:
- PROJECT_ID: the ID of the project in which to create the VM.
- ZONE: the zone where you want to create the VM. To verify that a machine type is available in the zone where you want to create the VM, see Available regions and zones.
- VM_NAME: the name of your new VM.
- MACHINE_TYPE: the machine type to use for the Flex-start VM. If you specify a G2 or N1 machine type, then consider the following:
  - For G2 machine types, you can optionally specify NVIDIA RTX Virtual Workstations (vWS) to use for graphics-intensive workloads. To do so, include the guestAccelerators field in the request body as follows:
    "guestAccelerators": [ { "acceleratorCount": VWS_ACCELERATOR_COUNT, "acceleratorType": "projects/PROJECT_ID/zones/ZONE/acceleratorTypes/nvidia-l4-vws" } ]
    Replace VWS_ACCELERATOR_COUNT with the number of NVIDIA RTX vWS that your workload requires.
  - For N1 machine types, you must specify the number and type of GPUs to attach to your VM. Otherwise, creating the VM fails. To attach GPUs to an N1 VM, include the guestAccelerators field in the request body as follows:
    "guestAccelerators": [ { "acceleratorCount": ACCELERATOR_COUNT, "acceleratorType": "projects/PROJECT_ID/zones/ZONE/acceleratorTypes/ACCELERATOR_TYPE" } ]
    Replace the following:
    - ACCELERATOR_COUNT: the number of GPUs to attach to your N1 VM.
    - ACCELERATOR_TYPE: a supported GPU model for N1 VMs.
- IMAGE_PROJECT: the image project that contains the image, for example, debian-cloud. For more information about the supported image projects, see Public images.
- IMAGE: specify one of the following:
  - A specific version of the OS image, for example, debian-12-bookworm-v20240617.
  - An image family, which must be formatted as family/IMAGE_FAMILY. This value specifies to use the most recent, non-deprecated OS image. For example, if you specify family/debian-12, the latest version in the Debian 12 image family is used. For more information about using image families, see Image families best practices.
- VALID_FOR_DURATION: the maximum time in seconds to wait for the VM to be provisioned. Based on the zonal requirements for your workload, we recommend that you specify one of the following durations to help increase the chances that your VM creation request succeeds:
  - If your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds (90) and two hours (7200). Longer durations give you higher chances of obtaining resources.
  - If the VM can run in any zone within the region, then specify a duration of zero seconds (0). This value specifies that Compute Engine only allocates resources if they are immediately available. If the creation request fails because resources aren't available, then retry the request in a different zone.
- TERMINATION_ACTION: whether to stop or delete the VM at the end of its run duration. Specify one of the following values:
  - To stop the VM: STOP
  - To delete the VM: DELETE
- RUN_DURATION: the maximum time in seconds that the VM runs before Compute Engine stops or deletes it. This value must be between 600 seconds (10 minutes) and 604,800 seconds (seven days).
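The gcloud flags accept durations such as 1h2m3s, whereas the REST fields take whole seconds. As a rough sketch of that conversion, the following Bash helpers turn a duration string into seconds and check a run duration against the 600 to 604,800 second range. The function names to_seconds and valid_run_seconds and the sample values 2h and 3d are hypothetical and are not part of any Google Cloud tooling.

```shell
#!/usr/bin/env bash
# Convert a duration such as 30m or 1h2m3s into whole seconds.
to_seconds() {
  local input=$1 total=0 number unit
  local pattern='^([0-9]+)([smhd])(.*)$'
  [[ -n $input ]] || { echo "empty duration" >&2; return 1; }
  # Consume one number-unit pair per iteration from the front of the string.
  while [[ $input =~ $pattern ]]; do
    number=${BASH_REMATCH[1]}
    unit=${BASH_REMATCH[2]}
    input=${BASH_REMATCH[3]}
    case $unit in
      s) total=$(( total + 10#$number )) ;;
      m) total=$(( total + 10#$number * 60 )) ;;
      h) total=$(( total + 10#$number * 3600 )) ;;
      d) total=$(( total + 10#$number * 86400 )) ;;
    esac
  done
  # Any leftover text means the input was not a run of number-unit pairs.
  [[ -z $input ]] || { echo "invalid duration: $1" >&2; return 1; }
  echo "$total"
}

# Succeed only if the value is within 600 to 604800 seconds.
valid_run_seconds() {
  (( $1 >= 600 && $1 <= 604800 ))
}

VALID_FOR_SECONDS=$(to_seconds 2h)
RUN_SECONDS=$(to_seconds 3d)
valid_run_seconds "$RUN_SECONDS" || echo "run duration out of range" >&2

# Values to substitute for VALID_FOR_DURATION and RUN_DURATION in the request body.
echo "VALID_FOR_DURATION=$VALID_FOR_SECONDS"
echo "RUN_DURATION=$RUN_SECONDS"
```

Validating the range locally gives faster feedback than waiting for the API to reject an out-of-range value.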
What's next