Create an A3 Ultra or A4 instance

This document describes how to create instances with attached GPUs from the A3 Ultra or A4 machine series. To learn more about creating instances with attached GPUs, see Overview of creating an instance with attached GPUs.

A3 Ultra and A4 instances support Cluster Director. With Cluster Director, you can reserve densely allocated machines that provide topology-aware scheduling, as well as enhanced monitoring and maintenance. To learn more about Cluster Director, see Cluster Director in the AI Hypercomputer documentation.

Before you begin

  • To review limitations and additional prerequisite steps for creating instances with attached GPUs, such as how to select an OS image or check GPU quota, see Overview of creating an instance with attached GPUs.
  • If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:

    Select the tab for how you plan to use the samples on this page:

    Console

    When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

    gcloud

    1. Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:

      gcloud init

      If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

    2. Set a default region and zone.

    REST

    To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

      Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:

      gcloud init

      If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

    For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Required roles

To get the permissions that you need to create instances, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.
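As a minimal sketch of what an administrator might run to grant this role, the following helper prints a gcloud projects add-iam-policy-binding command. The helper name, the example project ID, and the example email address are hypothetical. The helper only prints the command so that it can be reviewed before anyone runs it.

```shell
# Print (do not run) a command that grants the Compute Instance Admin (v1) role.
# print_grant_command, my-project, and alex@example.com are illustrative names.
print_grant_command() {
  # $1: project ID, $2: email address of the user who needs the role
  printf 'gcloud projects add-iam-policy-binding %s --member=user:%s --role=roles/compute.instanceAdmin.v1\n' "$1" "$2"
}

print_grant_command my-project alex@example.com
```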

This predefined role contains the permissions required to create instances. The exact permissions that are required are listed in the following Required permissions section:

Required permissions

The following permissions are required to create instances:

  • compute.instances.create on the project
  • To use a custom image to create the VM: compute.images.useReadOnly on the image
  • To use a snapshot to create the VM: compute.snapshots.useReadOnly on the snapshot
  • To use an instance template to create the VM: compute.instanceTemplates.useReadOnly on the instance template
  • To specify a subnet for your VM: compute.subnetworks.use on the project or on the chosen subnet
  • To specify a static IP address for the VM: compute.addresses.use on the project
  • To assign an external IP address to the VM when using a VPC network: compute.subnetworks.useExternalIp on the project or on the chosen subnet
  • To assign a legacy network to the VM: compute.networks.use on the project
  • To assign an external IP address to the VM when using a legacy network: compute.networks.useExternalIp on the project
  • To set VM instance metadata for the VM: compute.instances.setMetadata on the project
  • To set tags for the VM: compute.instances.setTags on the VM
  • To set labels for the VM: compute.instances.setLabels on the VM
  • To set a service account for the VM to use: compute.instances.setServiceAccount on the VM
  • To create a new disk for the VM: compute.disks.create on the project
  • To attach an existing disk in read-only or read-write mode: compute.disks.use on the disk
  • To attach an existing disk in read-only mode: compute.disks.useReadOnly on the disk

You might also be able to get these permissions with custom roles or other predefined roles.

Determine how to create A3 Ultra or A4 instances

To determine the options that you want to use to create A3 Ultra or A4 instances, complete the following steps:

  1. Choose a consumption option: To learn how to choose a consumption option for an A3 Ultra or A4 instance, see Choose a consumption option in the AI Hypercomputer documentation.

  2. Obtain capacity: To learn how to obtain capacity for A3 Ultra or A4 instances for the consumption option that you chose, see Capacity overview in the AI Hypercomputer documentation.

  3. Select creation instructions: To learn about all the options that you can use to create A3 Ultra or A4 instances, such as managed instance groups (MIGs) or clusters, see Overview of creating VMs and clusters in the AI Hypercomputer documentation.

    If you want to use Cluster Director features or if you don't want to create standalone instances, then select a creation option in the AI Hypercomputer documentation instead.

Create an A3 Ultra or A4 instance

To create an A3 Ultra or A4 instance, complete the following steps:

  1. Create VPC networks

  2. Create the instance

  3. Prepare the instance for use

Create VPC networks

To set up the network for the A4 or A3 Ultra machine types, create three VPC networks for the following network interfaces:

  • Two regular VPC networks for the gVNIC network interfaces (NICs). These networks are used for host-to-host communication.
  • One VPC network with the RoCE network profile for the CX-7 NICs. The RoCE VPC network needs eight subnets, one for each CX-7 NIC. These NICs use RDMA over Converged Ethernet (RoCE), which provides the high-bandwidth, low-latency communication that's essential for GPU-to-GPU communication.

For more information about NIC arrangement, see Review network bandwidth and NIC arrangement.
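To make the layout concrete, the following sketch prints the ten subnets that this arrangement implies (two for the gVNICs and eight for the CX-7 NICs) without creating any resources. The names and the 10.N.0.0/16 ranges mirror the script provided later on this page. The helper name and the example prefixes are hypothetical.

```shell
# Print the planned network, subnet, and IP range for each NIC.
# Nothing is created. print_network_plan and the prefixes are illustrative.
print_network_plan() {
  # $1: gVNIC name prefix, $2: RDMA name prefix
  plan_n=0
  while [ "$plan_n" -le 1 ]; do
    # One regular VPC network and one subnet per gVNIC
    printf '%s %s %s\n' "$1-net-$plan_n" "$1-sub-$plan_n" "10.$plan_n.0.0/16"
    plan_n=$((plan_n + 1))
  done
  plan_n=0
  while [ "$plan_n" -le 7 ]; do
    # One shared RoCE network, one subnet per CX-7 NIC, offset by 2
    printf '%s %s %s\n' "$2-mrdma" "$2-mrdma-sub-$plan_n" "10.$((plan_n + 2)).0.0/16"
    plan_n=$((plan_n + 1))
  done
}

print_network_plan my-gvnic my-rdma
```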

Create the networks either manually by following the instruction guides or automatically by using the provided script.

Instruction guides

To create the networks, you can use the following instructions:

For these VPC networks, we recommend setting the maximum transmission unit (MTU) to a larger value. For the A4 or A3 Ultra machine types, the recommended MTU is 8896 bytes. To review the recommended MTU settings for other GPU machine types, see MTU settings for GPU machine types.
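One possible way to confirm the MTU of an existing network is to read it with gcloud compute networks describe and compare it with the recommended value. The following sketch shows only the comparison. The helper name is hypothetical, and the gcloud call appears in a comment so that the example runs without access to a project.

```shell
# Succeed only if the given MTU equals the recommended 8896 bytes.
# mtu_is_recommended is an illustrative helper name.
mtu_is_recommended() {
  # $1: MTU value, for example the output of
  #   gcloud compute networks describe NETWORK_NAME --format="value(mtu)"
  [ "$1" -eq 8896 ]
}

if mtu_is_recommended 8896; then
  echo "MTU matches the recommendation"
fi
```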

Script

To create the networks, follow these steps.

For these VPC networks, we recommend setting the maximum transmission unit (MTU) to a larger value. For the A4 or A3 Ultra machine types, the recommended MTU is 8896 bytes. To review the recommended MTU settings for other GPU machine types, see MTU settings for GPU machine types.

  1. Use the following script to create VPC networks for the gVNICs and CX-7 NICs.

      
        #!/bin/bash
    
        # Create regular VPC networks and subnets for the gVNICs
        for N in $(seq 0 1); do
          gcloud compute networks create GVNIC_NAME_PREFIX-net-$N \
            --subnet-mode=custom \
            --mtu=8896
    
          gcloud compute networks subnets create GVNIC_NAME_PREFIX-sub-$N \
            --network=GVNIC_NAME_PREFIX-net-$N \
            --region=REGION \
            --range=10.$N.0.0/16
    
          gcloud compute firewall-rules create GVNIC_NAME_PREFIX-internal-$N \
            --network=GVNIC_NAME_PREFIX-net-$N \
            --action=ALLOW \
            --rules=tcp:0-65535,udp:0-65535,icmp \
            --source-ranges=10.0.0.0/8
        done
    
        # Create SSH firewall rules
        gcloud compute firewall-rules create GVNIC_NAME_PREFIX-ssh \
          --network=GVNIC_NAME_PREFIX-net-0 \
          --action=ALLOW \
          --rules=tcp:22 \
          --source-ranges=IP_RANGE
    
        # Assumes that an external IP is only created for vNIC 0
        gcloud compute firewall-rules create GVNIC_NAME_PREFIX-allow-ping-net-0 \
          --network=GVNIC_NAME_PREFIX-net-0 \
          --action=ALLOW \
          --rules=icmp \
          --source-ranges=IP_RANGE
    
      
        # List and make sure network profiles exist in the machine type's zone
        gcloud compute network-profiles list --filter "location.name=ZONE"
    
        # Create network for CX-7
        gcloud compute networks create RDMA_NAME_PREFIX-mrdma \
          --network-profile=ZONE-vpc-roce \
          --subnet-mode custom \
          --mtu=8896
    
        # Create subnets
        for N in $(seq 0 7); do
          gcloud compute networks subnets create RDMA_NAME_PREFIX-mrdma-sub-$N \
            --network=RDMA_NAME_PREFIX-mrdma \
            --region=REGION \
            --range=10.$((N+2)).0.0/16 # offset to avoid overlap with gVNICs
        done
    
      

    Replace the following:

    • GVNIC_NAME_PREFIX: the custom name prefix to use for the regular VPC networks and subnets for the gVNICs.
    • RDMA_NAME_PREFIX: the custom name prefix to use for the RoCE VPC network and subnets for the CX-7 NICs.
    • ZONE: a zone in which the machine type that you want to use is available, such as us-central1-a. For information about regions, see GPU availability by regions and zones.
    • REGION: the region where you want to create the subnets. This region must correspond to the zone specified. For example, if your zone is us-central1-a, then your region is us-central1.
    • IP_RANGE: the IP range to use for the SSH firewall rules.
  2. Optional: To verify that the VPC network resources are created successfully, check the network settings in the Google Cloud console:
    1. In the Google Cloud console, go to the VPC networks page.

    2. Search the list for the networks that you created in the previous step.
    3. To view the subnets, firewall rules, and other network settings, click the name of the network.
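Because REGION must correspond to ZONE, one way to avoid a mismatch is to derive the region from the zone. The following sketch does that by dropping the final zone suffix. The helper name is hypothetical. The commented commands are one possible command-line alternative to the console check, and NETWORK_NAME is a placeholder.

```shell
# Derive the region from a zone by removing the trailing "-<zone letter>".
# zone_to_region is an illustrative helper name.
zone_to_region() {
  # $1: zone, such as us-central1-a
  printf '%s\n' "${1%-*}"
}

zone_to_region us-central1-a   # prints us-central1

# Possible command-line checks of the created resources:
#   gcloud compute networks list
#   gcloud compute networks subnets list --network=NETWORK_NAME
```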

Create the instance

To create an instance, use one of the following options.

Console

  1. In the Google Cloud console, go to the Create an instance page.

    The Create an instance screen appears and displays the Machine configuration pane.

  2. In the Machine configuration pane, complete the following steps:

    1. Specify a Name for your instance. See Resource naming convention.

    2. Select the Region and Zone where you have reserved capacity.

    3. Click the GPUs tab, and then complete the following steps:

      1. In the GPU type list, select your GPU type.

        • For A4 instances, select NVIDIA B200.

        • For A3 Ultra instances, select NVIDIA H200 141GB.

      2. In the Number of GPUs list, select 8.

  3. In the navigation menu, click OS and storage. In the OS and storage pane that appears, complete the following steps:

    1. Click Change. The Boot disk configuration pane appears.

    2. On the Public images tab, select a recommended image. For a list of recommended images, see Operating systems.

    3. To confirm your boot disk options, click Select.

  4. To create a multi-NIC instance, complete the following steps. Otherwise, to create a single-NIC instance, skip these steps.

    • In the navigation menu, click Networking. In the Networking pane that appears, complete the following steps:

      1. In the Network interfaces section, complete the following steps:

        1. Delete the default network interface. To delete the interface, click Delete.

        2. Click Add a network interface. Use this option to add network interfaces that attach to the VPC networks that you created in the previous section. When you add the network interfaces, remember the following:

          • For a network interface that is used for host to host communication, select a regular VPC network and subnet from the Network and Subnetwork lists, and set the Network interface card list to gVNIC.

          • For a network interface that is used for GPU to GPU communication, select the RoCE VPC network and subnet from the Network and Subnetwork lists, and set the Network interface card list to MRDMA for these network interfaces.

  5. In the navigation menu, click Advanced. Then, complete the following steps for the provisioning model that you want to use.

    Flex-start

    1. In the Provisioning model section, in the VM provisioning model list, select Flex-start.
    2. In the Enter number of hours field, enter the maximum amount of time that you want the VM to run. The value must be between 36 seconds (0.01 hours) and seven days (168 hours).

    3. Select Set a wait time for VM creation.

      Based on the zonal requirements for your workload, we recommend that you specify one of the following durations to help increase your chances that your VM creation request succeeds:

      • Workloads with strict zonal requirements: if your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds and 2 hours. Longer durations give you higher chances of obtaining resources.
      • Workloads without strict zonal requirements: if the VM can run in any zone within the region, then specify a duration of 0 seconds or clear the Set a wait time for VM creation checkbox. This action specifies that Compute Engine only allocates resources if they are immediately available. If the VM creation request fails because resources are unavailable, then retry the request in a different zone.

    Reservation-bound

    1. Click Choose a reservation. This action opens a pane with a list of available reservations within your selected zone. From the reservation list, complete the following steps:

      1. Select the reservation that you want to use for the VM. You can also select a specific block within the reservation.
      2. Click Choose.

    Spot

    1. In the Provisioning model section, select Spot from the VM provisioning model list.
    2. Optional: To select the termination action that happens when Compute Engine preempts the VM, complete the following steps:

      1. Expand the VM provisioning model advanced settings section.
      2. In the On VM termination list, select one of the following options:
        • To stop the VM during preemption, select Stop (default).
        • To delete the VM during preemption, select Delete.
  6. To create and start the instance, click Create.

gcloud

To create the VM, use the gcloud compute instances create command.

The parameters that you need to specify depend on the consumption option that you are using for this deployment. Select the tab that corresponds to your consumption option's provisioning model.

Flex-start

gcloud compute instances create VM_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --zone=ZONE \
    --boot-disk-type=hyperdisk-balanced \
    --boot-disk-size=DISK_SIZE \
    --scopes=cloud-platform \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
    --reservation-affinity=none \
    --provisioning-model=FLEX_START \
    --request-valid-for-duration=REQUEST_VALID_FOR_DURATION \
    --max-run-duration=MAX_RUN_DURATION \
    --instance-termination-action=DELETE \
    --maintenance-policy=TERMINATE

Replace the following:

  • VM_NAME: the name of the VM.
  • MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
  • IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
  • IMAGE_PROJECT: the project ID of the OS image.
  • ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
  • DISK_SIZE: the size of the boot disk in GB.
  • GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
  • RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
  • REQUEST_VALID_FOR_DURATION: the duration that the request to create the VM is valid for. You must format the value as the number of days, hours, minutes, or seconds followed by d, h, m, and s respectively. For example, specify 30m for 30 minutes or 1d2h3m4s for one day, two hours, three minutes, and four seconds. If you don't specify a duration, then the default duration is 90 seconds.

    Based on the zonal requirements for your workload, we recommend that you specify one of the following durations to help increase your chances that your VM creation request succeeds:

    • Workloads with strict zonal requirements: if your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds and two hours. Longer durations give you higher chances of obtaining resources.
    • Workloads without strict zonal requirements: if the VM can run in any zone within the region, then specify a duration of zero seconds (0). This action specifies that Compute Engine only allocates resources if they are immediately available. If the VM creation request fails because resources are unavailable, then retry the request in a different zone.
  • MAX_RUN_DURATION: the duration you want the requested VMs to run. You must format the value as the number of days, hours, minutes, or seconds followed by d, h, m, and s respectively. For example, specify 30m for 30 minutes or 1d2h3m4s for one day, two hours, three minutes, and four seconds. The value must be between 10 minutes and seven days.
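To sanity-check a duration before you submit the request, you can convert it to seconds locally. The following sketch parses the d, h, m, and s format described in the preceding list and checks MAX_RUN_DURATION against the range of 10 minutes to seven days. Both helper names are hypothetical. Treating a bare number as seconds is an assumption of this sketch, not documented gcloud behavior.

```shell
# Convert a duration such as 30m or 1d2h3m4s to seconds.
# The body runs in a subshell (parentheses), so its variables stay local
# and "exit" only leaves the function. Helper names are illustrative.
duration_to_seconds() (
  rest=$1
  total=0
  case $rest in
    '' | *[!0-9dhms]*) echo "invalid duration: $1" >&2; exit 1 ;;
    *[dhms]) ;;                 # ends with a unit: parse below
    *[dhms]*) echo "invalid duration: $1" >&2; exit 1 ;;
    *) rest="${rest}s" ;;       # assumption: a bare number means seconds
  esac
  while [ -n "$rest" ]; do
    number=${rest%%[!0-9]*}     # leading digits
    [ -n "$number" ] || { echo "invalid duration: $1" >&2; exit 1; }
    rest=${rest#"$number"}
    unit=${rest%"${rest#?}"}    # first remaining character
    rest=${rest#?}
    # Drop leading zeros so that the shell does not read the number as octal.
    while [ "${#number}" -gt 1 ] && [ "${number#0}" != "$number" ]; do
      number=${number#0}
    done
    case $unit in
      d) factor=86400 ;;
      h) factor=3600 ;;
      m) factor=60 ;;
      s) factor=1 ;;
    esac
    total=$((total + number * factor))
  done
  echo "$total"
)

# Succeed only if the duration is between 10 minutes and seven days.
max_run_duration_is_valid() (
  seconds=$(duration_to_seconds "$1") || exit 1
  [ "$seconds" -ge 600 ] && [ "$seconds" -le 604800 ]
)

duration_to_seconds 1d2h3m4s   # prints 93784
```

For example, max_run_duration_is_valid 5m fails because five minutes is below the 10-minute minimum, while max_run_duration_is_valid 7d succeeds.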

Reservation-bound

gcloud compute instances create VM_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --zone=ZONE \
    --boot-disk-type=hyperdisk-balanced \
    --boot-disk-size=DISK_SIZE \
    --scopes=cloud-platform \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
    --reservation-affinity=specific \
    --reservation=RESERVATION \
    --provisioning-model=RESERVATION_BOUND \
    --instance-termination-action=TERMINATION_ACTION \
    --maintenance-policy=TERMINATE

Replace the following:

  • VM_NAME: the name of the VM.
  • MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
  • IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
  • IMAGE_PROJECT: the project ID of the OS image.
  • ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
  • DISK_SIZE: the size of the boot disk in GB.
  • GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
  • RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
  • RESERVATION: either the reservation name or a specific block within a reservation. To get the reservation name or the available blocks, see View reserved capacity. Based on your requirement for instance placement, choose one of the following:
    • To create the instance on any block:

      projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME

      Additionally, to create multiple instances in the same block, apply the same compact placement policy that specifies a block collocation (maxDistance=2) when creating each instance. Compute Engine then applies the policy to the reservation and creates instances on the same block.

    • To create the instance on a specific block:

      projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME
  • TERMINATION_ACTION: whether Compute Engine stops (STOP) or deletes (DELETE) the VM at the end of the reservation period.
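The two RESERVATION formats differ only in the optional reservationBlocks suffix. As a small sketch, the following helper builds either form from its parts. The helper name and the example values are hypothetical.

```shell
# Build the value for --reservation.
# reservation_path and the example values are illustrative.
reservation_path() {
  # $1: reservation owner project ID
  # $2: reservation name
  # $3: optional reservation block name
  if [ -n "${3:-}" ]; then
    # Place the instance on a specific block
    printf 'projects/%s/reservations/%s/reservationBlocks/%s\n' "$1" "$2" "$3"
  else
    # Place the instance on any block in the reservation
    printf 'projects/%s/reservations/%s\n' "$1" "$2"
  fi
}

reservation_path my-project my-reservation
reservation_path my-project my-reservation my-block
```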

Spot

gcloud compute instances create VM_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --zone=ZONE \
    --boot-disk-type=hyperdisk-balanced \
    --boot-disk-size=DISK_SIZE \
    --scopes=cloud-platform \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-0,subnet=GVNIC_NAME_PREFIX-sub-0 \
    --network-interface=nic-type=GVNIC,network=GVNIC_NAME_PREFIX-net-1,subnet=GVNIC_NAME_PREFIX-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-0,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-1,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-2,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-3,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-4,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-5,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-6,no-address \
    --network-interface=nic-type=MRDMA,network=RDMA_NAME_PREFIX-mrdma,subnet=RDMA_NAME_PREFIX-mrdma-sub-7,no-address \
    --provisioning-model=SPOT \
    --instance-termination-action=TERMINATION_ACTION

Replace the following:

  • VM_NAME: the name of the VM.
  • MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
  • IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
  • IMAGE_PROJECT: the project ID of the OS image.
  • ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
  • DISK_SIZE: the size of the boot disk in GB.
  • GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
  • RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
  • TERMINATION_ACTION: the action to take when Compute Engine preempts the instance, either STOP (default) or DELETE.
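The ten --network-interface flags in the preceding commands follow one pattern: two gVNIC interfaces, then eight MRDMA interfaces that differ only in the subnet index. If you script instance creation, a loop such as the following sketch can generate them. The helper name and the example prefixes are hypothetical, and the flags it prints match the ones shown in the commands.

```shell
# Print the ten --network-interface flags, one per line.
# network_interface_flags and the prefixes are illustrative.
network_interface_flags() {
  # $1: gVNIC name prefix, $2: RDMA name prefix
  # Only the first gVNIC omits no-address, so only it can get an external IP.
  printf '%s\n' "--network-interface=nic-type=GVNIC,network=$1-net-0,subnet=$1-sub-0"
  printf '%s\n' "--network-interface=nic-type=GVNIC,network=$1-net-1,subnet=$1-sub-1,no-address"
  nic_n=0
  while [ "$nic_n" -le 7 ]; do
    # One MRDMA interface per CX-7 NIC, each on its own subnet
    printf '%s\n' "--network-interface=nic-type=MRDMA,network=$2-mrdma,subnet=$2-mrdma-sub-$nic_n,no-address"
    nic_n=$((nic_n + 1))
  done
}

network_interface_flags my-gvnic my-rdma
```

Because the flags contain no spaces, you could pass the unquoted output to the command, for example by adding $(network_interface_flags my-gvnic my-rdma) in place of the ten flags.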

REST

To create the VM, make a POST request to the instances.insert method.

The parameters that you need to specify depend on the consumption option that you are using for this deployment. Select the tab that corresponds to your consumption option's provisioning model.

Flex-start

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "machineType": "projects/PROJECT_ID/zones/ZONE/machineTypes/MACHINE_TYPE",
  "name": "VM_NAME",
  "disks":[
    {
      "boot":true,
      "initializeParams":{
        "diskSizeGb": "DISK_SIZE",
        "diskType": "hyperdisk-balanced",
        "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
      },
      "mode": "READ_WRITE",
      "type": "PERSISTENT"
    }
  ],
  "serviceAccounts": [
    {
      "email": "default",
      "scopes": [
        "https://www.googleapis.com/auth/cloud-platform"
      ]
    }
  ],
  "networkInterfaces": [
    {
      "accessConfigs": [
        {
          "name": "external-nat",
          "type": "ONE_TO_ONE_NAT"
        }
      ],
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
    }
  ],
  "reservationAffinity":{
    "consumeReservationType": "NO_RESERVATION"
  },
  "scheduling":{
    "provisioningModel": "FLEX_START",
    "requestValidForDuration": {
      "seconds": REQUEST_VALID_FOR_DURATION
    },
    "maxRunDuration": {
      "seconds": MAX_RUN_DURATION
    },
    "instanceTerminationAction": "DELETE",
    "onHostMaintenance": "TERMINATE"
  }
}

Replace the following:

  • PROJECT_ID: the project ID of the project where you want to create the VM.
  • ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
  • MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
  • VM_NAME: the name of the VM.
  • DISK_SIZE: the size of the boot disk in GB.
  • IMAGE_PROJECT: the project ID of the OS image.
  • IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
  • NETWORK_PROJECT_ID: the project ID of the network.
  • GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
  • REGION: the region of the subnetwork.
  • RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
  • REQUEST_VALID_FOR_DURATION: the duration that the request to create the VM is valid for. You must format the value as the number of seconds. For example, specify 1800 for 30 minutes. If you don't specify a duration, then the default duration is 90 seconds.

    Based on the zonal requirements for your workload, we recommend that you specify one of the following durations to help increase your chances that your VM creation request succeeds:

    • Workloads with strict zonal requirements: if your workload requires you to create the VM in a specific zone, then specify a duration between 90 seconds and two hours. Longer durations give you higher chances of obtaining resources.
    • Workloads without strict zonal requirements: if the VM can run in any zone within the region, then specify a duration of zero seconds (0). This action specifies that Compute Engine only allocates resources if they are immediately available. If the VM creation request fails because resources are unavailable, then retry the request in a different zone.
  • MAX_RUN_DURATION: the duration you want the requested VMs to run. You must format the value as the number of seconds. For example, specify 86400 for one day. The value must be between 10 minutes and seven days.
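One way to send this request from a shell is with curl and an access token from the gcloud CLI. The following sketch builds the request URL and shows a possible curl call in a comment, so that nothing is sent by accident. The helper name, the example project and zone, and the request.json file name (a file that would hold the JSON body shown earlier) are hypothetical.

```shell
# Build the instances.insert URL for a project and zone.
# instances_insert_url and the example values are illustrative.
instances_insert_url() {
  # $1: project ID, $2: zone
  printf 'https://compute.googleapis.com/compute/v1/projects/%s/zones/%s/instances\n' "$1" "$2"
}

instances_insert_url my-project us-central1-a

# Possible call, assuming that the request body is saved in request.json:
#   curl -X POST \
#     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#     -H "Content-Type: application/json" \
#     -d @request.json \
#     "$(instances_insert_url my-project us-central1-a)"
```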

Reservation-bound

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "machineType": "projects/PROJECT_ID/zones/ZONE/machineTypes/MACHINE_TYPE",
  "name": "VM_NAME",
  "disks":[
    {
      "boot":true,
      "initializeParams":{
        "diskSizeGb": "DISK_SIZE",
        "diskType": "hyperdisk-balanced",
        "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
      },
      "mode": "READ_WRITE",
      "type": "PERSISTENT"
    }
  ],
  "serviceAccounts": [
    {
      "email": "default",
      "scopes": [
        "https://www.googleapis.com/auth/cloud-platform"
      ]
    }
  ],
  "networkInterfaces": [
    {
      "accessConfigs": [
        {
          "name": "external-nat",
          "type": "ONE_TO_ONE_NAT"
        }
      ],
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/region/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/region/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
    }
  ],
  "reservationAffinity":{
    "consumeReservationType": "SPECIFIC_RESERVATION",
    "key": "compute.googleapis.com/reservation-name",
    "values":[
      "RESERVATION"
    ]
  },
  "scheduling":{
    "provisioningModel": "RESERVATION_BOUND",
    "instanceTerminationAction": "TERMINATION_ACTION",
    "onHostMaintenance": "TERMINATE",
    "automaticRestart": true
  }
}

Replace the following:

  • PROJECT_ID: the project ID of the project where you want to create the VM.
  • ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
  • MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
  • VM_NAME: the name of the VM.
  • DISK_SIZE: the size of the boot disk in GB.
  • IMAGE_PROJECT: the project ID of the OS image.
  • IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
  • NETWORK_PROJECT_ID: the project ID of the network.
  • GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
  • REGION: the region of the subnetwork.
  • RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
  • RESERVATION: either the reservation name or a specific block within a reservation. To get the reservation name or the available blocks, see View reserved capacity. Based on your requirement for instance placement, choose one of the following:
    • To create the instance on any block:

      projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME

      Additionally, to create multiple instances in the same block, apply the same compact placement policy, one that specifies block collocation (maxDistance=2), when you create each instance. Compute Engine then applies the policy to the reservation and creates the instances on the same block.

    • To create the instance on a specific block:

      projects/RESERVATION_OWNER_PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/RESERVATION_BLOCK_NAME
  • TERMINATION_ACTION: whether Compute Engine stops (STOP) or deletes (DELETE) the VM at the end of the reservation period.
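
The two RESERVATION formats described above differ only in the optional block suffix. As a minimal sketch, assuming you assemble the request body in Python, the following hypothetical helpers (not part of any Google Cloud client library) build the RESERVATION value and the matching reservationAffinity object from the request body shown earlier.

```python
from typing import Optional


def reservation_value(
    owner_project: str, reservation: str, block: Optional[str] = None
) -> str:
    """Build the RESERVATION value.

    Omit `block` to let Compute Engine place the instance on any block;
    pass a block name to target a specific block.
    """
    path = f"projects/{owner_project}/reservations/{reservation}"
    if block is not None:
        path += f"/reservationBlocks/{block}"
    return path


def reservation_affinity(value: str) -> dict:
    """Return the reservationAffinity object used in the request body."""
    return {
        "consumeReservationType": "SPECIFIC_RESERVATION",
        "key": "compute.googleapis.com/reservation-name",
        "values": [value],
    }


# Example with made-up names: target a specific block.
affinity = reservation_affinity(
    reservation_value("example-owner", "example-res", "example-block")
)
print(affinity["values"][0])
```

The project, reservation, and block names in the example are placeholders. Use the values shown for your own reserved capacity.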

Spot

POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/zones/ZONE/instances
{
  "machineType": "projects/PROJECT_ID/zones/ZONE/machineTypes/MACHINE_TYPE",
  "name": "VM_NAME",
  "disks":[
    {
      "boot":true,
      "initializeParams":{
        "diskSizeGb": "DISK_SIZE",
        "diskType": "hyperdisk-balanced",
        "sourceImage": "projects/IMAGE_PROJECT/global/images/family/IMAGE_FAMILY"
      },
      "mode": "READ_WRITE",
      "type": "PERSISTENT"
    }
  ],
  "serviceAccounts": [
    {
      "email": "default",
      "scopes": [
        "https://www.googleapis.com/auth/cloud-platform"
      ]
    }
  ],
  "networkInterfaces": [
    {
      "accessConfigs": [
        {
          "name": "external-nat",
          "type": "ONE_TO_ONE_NAT"
        }
      ],
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-0",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-0"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/GVNIC_NAME_PREFIX-net-1",
      "nicType": "GVNIC",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/GVNIC_NAME_PREFIX-sub-1"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-0"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-1"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-2"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-3"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-4"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-5"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-6"
    },
    {
      "network": "projects/NETWORK_PROJECT_ID/global/networks/RDMA_NAME_PREFIX-mrdma",
      "nicType": "MRDMA",
      "subnetwork": "projects/NETWORK_PROJECT_ID/regions/REGION/subnetworks/RDMA_NAME_PREFIX-mrdma-sub-7"
    }
  ],
  "scheduling":{
    "provisioningModel": "SPOT",
    "instanceTerminationAction": "TERMINATION_ACTION"
  }
}

Replace the following:

  • PROJECT_ID: the project ID of the project where you want to create the VM.
  • ZONE: the zone in which the machine type that you want to use is available. For information about regions, see GPU availability by regions and zones.
  • MACHINE_TYPE: the machine type to use for the VM. For more information, see GPU machine types.
  • VM_NAME: the name of the VM.
  • DISK_SIZE: the size of the boot disk in GB.
  • IMAGE_PROJECT: the project ID of the OS image.
  • IMAGE_FAMILY: the image family of the OS image that you want to use. For a list of supported operating systems, see Operating system details.
  • NETWORK_PROJECT_ID: the project ID of the network.
  • GVNIC_NAME_PREFIX: the name prefix that you specified when creating the standard VPC networks and subnets that use gVNICs.
  • REGION: the region of the subnetwork.
  • RDMA_NAME_PREFIX: the name prefix that you specified when creating the VPC networks and subnets that use RDMA NICs.
  • TERMINATION_ACTION: the action to take when Compute Engine preempts the instance, either STOP (default) or DELETE.
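
The networkInterfaces array in the request bodies above follows a regular pattern: two gVNIC interfaces, the first with an external NAT access config, followed by eight MRDMA interfaces that share one network and use subnets numbered 0 through 7. The following Python sketch generates that array instead of writing it by hand. The function name and its parameters are hypothetical, and the sketch assumes the standard `regions/REGION` form of the subnetwork resource path.

```python
import json


def network_interfaces(
    network_project: str,
    region: str,
    gvnic_prefix: str,
    rdma_prefix: str,
    gvnic_count: int = 2,
    rdma_count: int = 8,
) -> list:
    """Build the networkInterfaces list used in the request bodies above."""
    base = f"projects/{network_project}"
    subnets = f"{base}/regions/{region}/subnetworks"
    nics = []

    # Standard gVNIC interfaces: one network and one subnet per interface.
    for i in range(gvnic_count):
        nic = {}
        if i == 0:
            # Only the first interface gets an external IP address.
            nic["accessConfigs"] = [
                {"name": "external-nat", "type": "ONE_TO_ONE_NAT"}
            ]
        nic["network"] = f"{base}/global/networks/{gvnic_prefix}-net-{i}"
        nic["nicType"] = "GVNIC"
        nic["subnetwork"] = f"{subnets}/{gvnic_prefix}-sub-{i}"
        nics.append(nic)

    # RDMA interfaces: a single shared network with one subnet per interface.
    for i in range(rdma_count):
        nics.append(
            {
                "network": f"{base}/global/networks/{rdma_prefix}-mrdma",
                "nicType": "MRDMA",
                "subnetwork": f"{subnets}/{rdma_prefix}-mrdma-sub-{i}",
            }
        )
    return nics


# Example with made-up names; print the array as JSON for the request body.
print(json.dumps(network_interfaces("net-proj", "us-central1", "gv", "rd"), indent=2))
```

Generating the array keeps the ten entries consistent, which helps when the name prefixes or region change between environments.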

Prepare the instance for use

To prepare an instance that has GPUs attached for use, complete the following steps:

  1. An A4 or A3 Ultra instance can use its attached GPUs only if GPU drivers are installed. Unless the instance's image already includes the required GPU drivers, install GPU drivers.

  2. If you created a Spot VM in the previous section, then complete the following steps:

What's next