Consuming reserved zonal resources


You can reserve Compute Engine instances in a specific zone to ensure that resources are available for your workloads when needed. For more details on how to manage reservations, see Reserving Compute Engine zonal resources.

After creating reservations, you can consume the reserved resources in GKE. GKE supports the same consumption modes as Compute Engine:

  • Consuming resources from any reservations: Standard only
  • Consuming resources from a specific reservation: Standard and Autopilot
  • Creating nodes without consuming any reservations: Standard and Autopilot

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.

Consume capacity reservations in Autopilot clusters

Autopilot clusters support consuming resources from specific Compute Engine capacity reservations in the same project or in a shared project. Unless explicitly specified, Autopilot clusters don't consume reservations. These reservations qualify for Autopilot Committed Use Discounts. You must use the Accelerator compute class or the Performance compute class to consume capacity reservations.

  • Before you begin, create an Autopilot cluster that runs a GKE version matching the compute class you plan to use:

    • To consume reserved accelerators using the Accelerator compute class: 1.28.6-gke.1095000 or later
    • To consume reservations using the Performance compute class: 1.28.6-gke.1369000 or later, or 1.29.1-gke.1575000 or later

Create capacity reservations for Autopilot

Autopilot Pods can consume specific reservations in the same project as the cluster or in a shared reservation from a different project. You can consume the reserved hardware by explicitly referencing that reservation in your manifest. You can consume reservations in Autopilot for the following types of hardware:

  • Any of the following types of GPUs:
    • nvidia-h100-80gb: NVIDIA H100 (80GB) (only available with Accelerator compute class)
    • nvidia-a100-80gb: NVIDIA A100 (80GB)
    • nvidia-tesla-a100: NVIDIA A100 (40GB)
    • nvidia-l4: NVIDIA L4
    • nvidia-tesla-t4: NVIDIA T4

To create a capacity reservation, see the following resources. Ensure that the machine types, accelerator types, and accelerator quantities match what your workloads will consume.
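Because the accelerator type in the manifest must match the reservation, it can help to check the value before you write the manifest. The following is a minimal sketch that checks a name against the GPU list above only; is_supported_accelerator is a hypothetical helper, not part of gcloud or kubectl, and the list may change in later GKE versions.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for the GPU types listed on this page
# as consumable from reservations in Autopilot.
is_supported_accelerator() {
  case "$1" in
    nvidia-h100-80gb|nvidia-a100-80gb|nvidia-tesla-a100|nvidia-l4|nvidia-tesla-t4)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example: warn before using a value that is not in the list.
if ! is_supported_accelerator "nvidia-l4"; then
  echo "accelerator type is not in the supported list" >&2
fi
```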

Consume a specific reservation in the same project in Autopilot

This section shows you how to consume a specific capacity reservation that's in the same project as your cluster.

  1. Save the following manifest as specific-autopilot.yaml. This manifest has node selectors that consume a specific reservation.

    VM instances

    apiVersion: v1
    kind: Pod
    metadata:
      name: specific-same-project-pod
    spec:
      nodeSelector:
        cloud.google.com/compute-class: Performance
        cloud.google.com/machine-family: MACHINE_SERIES
        cloud.google.com/reservation-name: RESERVATION_NAME
        cloud.google.com/reservation-affinity: "specific"
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"
        resources:
          requests:
            cpu: 12
            memory: "50Gi"
            ephemeral-storage: "200Gi"
    

    Replace the following:

    • MACHINE_SERIES: a machine series that contains the machine type of the VMs in your specific capacity reservation. For example, if your reservation is for c3-standard-4 machine types, specify C3 in the MACHINE_SERIES field.
    • RESERVATION_NAME: the name of the Compute Engine capacity reservation.

    Accelerators

    apiVersion: v1
    kind: Pod
    metadata:
      name: specific-same-project-pod
    spec:
      nodeSelector:
        cloud.google.com/compute-class: "Accelerator"
        cloud.google.com/gke-accelerator: ACCELERATOR
        cloud.google.com/reservation-name: RESERVATION_NAME
        cloud.google.com/reservation-affinity: "specific"
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"
        resources:
          requests:
            cpu: 12
            memory: "50Gi"
            ephemeral-storage: "200Gi"
          limits:
            nvidia.com/gpu: QUANTITY
    

    Replace the following:

    • ACCELERATOR: the accelerator that you reserved in the Compute Engine capacity reservation. Must be one of the following values:
      • nvidia-h100-80gb: NVIDIA H100 (80GB) (only available with Accelerator compute class)
      • nvidia-a100-80gb: NVIDIA A100 (80GB)
      • nvidia-tesla-a100: NVIDIA A100 (40GB)
      • nvidia-l4: NVIDIA L4
      • nvidia-tesla-t4: NVIDIA T4
    • RESERVATION_NAME: the name of the Compute Engine capacity reservation.
    • QUANTITY: the number of GPUs to attach to the container. Must be a supported quantity for the specified GPU, as described in Supported GPU quantities.
  2. Deploy the Pod:

    kubectl apply -f specific-autopilot.yaml
    

Autopilot uses the reserved capacity in the specified reservation to provision a new node to place the Pod.
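If you deploy the same Pod against several reservations, you might prefer to generate the manifest rather than edit the placeholders by hand. The following sketch prints the VM instances manifest from this section with MACHINE_SERIES and RESERVATION_NAME filled in. render_manifest is a hypothetical helper, and the example values (C3, my-reservation) are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the VM-instance manifest with placeholders filled in.
#   $1: machine series, for example C3
#   $2: name of the Compute Engine capacity reservation
render_manifest() {
  machine_series="$1"
  reservation_name="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: specific-same-project-pod
spec:
  nodeSelector:
    cloud.google.com/compute-class: Performance
    cloud.google.com/machine-family: ${machine_series}
    cloud.google.com/reservation-name: ${reservation_name}
    cloud.google.com/reservation-affinity: "specific"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 12
        memory: "50Gi"
        ephemeral-storage: "200Gi"
EOF
}

# Example: write the manifest to the file name used in step 1.
render_manifest C3 my-reservation > specific-autopilot.yaml
```

You can then deploy the generated file with the kubectl apply command from step 2.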

Consume a specific shared reservation in Autopilot

This section uses the following terms:

  • Owner project: the project that owns the reservation and shares it with other projects.
  • Consumer project: the project that runs the workloads that consume the shared reservation.

To consume a shared reservation, you must grant the GKE service agent access to the reservation in the project that owns the reservation. Do the following:

  1. Create a custom IAM role that contains the compute.reservations.list permission in the owner project:

    gcloud iam roles create ROLE_NAME \
        --project=OWNER_PROJECT_ID \
        --permissions='compute.reservations.list'
    

    Replace the following:

    • ROLE_NAME: a name for your new role.
    • OWNER_PROJECT_ID: the project ID of the project that owns the capacity reservation.
  2. Give the GKE service agent in the consumer project access to the shared reservation in the owner project:

    gcloud compute reservations add-iam-policy-binding RESERVATION_NAME \
        --project=OWNER_PROJECT_ID \
        --zone=ZONE \
        --member=serviceAccount:service-CONSUMER_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com \
        --role='projects/OWNER_PROJECT_ID/roles/ROLE_NAME'
    

    Replace CONSUMER_PROJECT_NUMBER with the numerical project number of your consumer project. To find this number, see Identifying projects in the Resource Manager documentation.

  3. Save the following manifest as shared-autopilot.yaml. This manifest has nodeSelectors that tell GKE to consume a specific shared reservation.

    VM instances

    apiVersion: v1
    kind: Pod
    metadata:
      name: performance-pod
    spec:
      nodeSelector:
        cloud.google.com/compute-class: Performance
        cloud.google.com/machine-family: MACHINE_SERIES
        cloud.google.com/reservation-name: RESERVATION_NAME
        cloud.google.com/reservation-project: OWNER_PROJECT_ID
        cloud.google.com/reservation-affinity: "specific"
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"
        resources:
          requests:
            cpu: 12
            memory: "50Gi"
            ephemeral-storage: "200Gi"
    

    Replace the following:

    • MACHINE_SERIES: a machine series that contains the machine type of the VMs in your specific capacity reservation. For example, if your reservation is for c3-standard-4 machine types, specify C3 in the MACHINE_SERIES field.
    • RESERVATION_NAME: the name of the Compute Engine capacity reservation.
    • OWNER_PROJECT_ID: the project ID of the project that owns the capacity reservation.

    Accelerators

    apiVersion: v1
    kind: Pod
    metadata:
      name: specific-same-project-pod
    spec:
      nodeSelector:
        cloud.google.com/compute-class: "Accelerator"
        cloud.google.com/gke-accelerator: ACCELERATOR
        cloud.google.com/reservation-name: RESERVATION_NAME
        cloud.google.com/reservation-project: OWNER_PROJECT_ID
        cloud.google.com/reservation-affinity: "specific"
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"
        resources:
          requests:
            cpu: 12
            memory: "50Gi"
            ephemeral-storage: "200Gi"
          limits:
            nvidia.com/gpu: QUANTITY
    

    Replace the following:

    • ACCELERATOR: the accelerator that you reserved in the Compute Engine capacity reservation. Must be one of the following values:
      • nvidia-h100-80gb: NVIDIA H100 (80GB) (only available with Accelerator compute class)
      • nvidia-a100-80gb: NVIDIA A100 (80GB)
      • nvidia-tesla-a100: NVIDIA A100 (40GB)
      • nvidia-l4: NVIDIA L4
      • nvidia-tesla-t4: NVIDIA T4
    • RESERVATION_NAME: the name of the Compute Engine capacity reservation.
    • OWNER_PROJECT_ID: the project ID of the project that owns the capacity reservation.
    • QUANTITY: the number of GPUs to attach to the container. Must be a supported quantity for the specified GPU, as described in Supported GPU quantities.
  4. Deploy the Pod:

    kubectl apply -f shared-autopilot.yaml
    

Autopilot uses the reserved capacity in the specified reservation to provision a new node to place the Pod.
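The IAM binding in step 2 needs the consumer project's number and the GKE service agent's identifier. IAM member identifiers for service accounts take the serviceAccount: prefix. The following sketch builds that identifier from a project number; service_agent_member is a hypothetical helper, and the project number shown is a made-up example.

```shell
#!/bin/sh
# Hypothetical helper: build the IAM member string for the GKE service agent
# of the consumer project.
#   $1: numerical project number of the consumer project
service_agent_member() {
  printf 'serviceAccount:service-%s@container-engine-robot.iam.gserviceaccount.com\n' "$1"
}

# Example with a made-up project number.
service_agent_member 123456789012

# To look up the real project number (assumes the gcloud CLI is installed
# and authenticated), you could run:
#   gcloud projects describe CONSUMER_PROJECT_ID --format='value(projectNumber)'
```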

Consuming reserved instances in GKE Standard

When you create a cluster or node pool, you can indicate the reservation consumption mode by specifying the --reservation-affinity flag.

Consuming any matching reservations

To consume from any matching reservations automatically, set the reservation affinity flag to --reservation-affinity=any.

In the any reservation consumption mode, nodes first take capacity from all single-project reservations before any shared reservations, because the shared reservations are more available to other projects. For more information about how instances are automatically consumed, see Consumption order.

To create a reservation and instances to consume any reservation, perform the following steps:

  1. Create a reservation of three VM instances:

    gcloud compute reservations create RESERVATION_NAME \
        --machine-type=MACHINE_TYPE --vm-count=3
    

    Replace the following:

    • RESERVATION_NAME: the name of the reservation to create.
    • MACHINE_TYPE: the type of machine (name only) to use for the reservation. For example, n1-standard-2.
  2. Verify the reservation was created successfully:

    gcloud compute reservations describe RESERVATION_NAME
    

    Replace RESERVATION_NAME with the name of the reservation you just created.

  3. Create a cluster with one node to consume any matching reservation:

    gcloud container clusters create CLUSTER_NAME \
        --machine-type=MACHINE_TYPE --num-nodes=1 \
        --reservation-affinity=any
    

    Replace the following:

    • CLUSTER_NAME: the name of the cluster to create.
    • MACHINE_TYPE: the type of machine (name only) to use for the cluster. For example, n1-standard-2.
  4. Create a node pool with three nodes to consume any matching reservation:

    gcloud container node-pools create NODEPOOL_NAME \
        --cluster CLUSTER_NAME --num-nodes=3 \
        --machine-type=MACHINE_TYPE --reservation-affinity=any
    

    Replace the following:

    • NODEPOOL_NAME: the name of the node pool to create.
    • CLUSTER_NAME: the name of the cluster you created earlier.
    • MACHINE_TYPE: the type of machine (name only) to use for the node pool. For example, n1-standard-2.

The total number of nodes is four, which exceeds the capacity of the reservation. Three of the nodes consume the reservation while the last node takes capacity from the general Compute Engine resource pool.
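The arithmetic behind this result can be sketched as follows. The sketch assumes that every node matches the reservation's properties and that nothing else consumes the reservation; reservation_split is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: split a node count between reserved capacity and the
# general Compute Engine resource pool.
#   $1: number of VMs in the reservation
#   $2: total number of nodes requested
reservation_split() {
  reserved="$1"
  requested="$2"
  if [ "$requested" -le "$reserved" ]; then
    from_reservation="$requested"
  else
    from_reservation="$reserved"
  fi
  on_demand=$((requested - from_reservation))
  echo "from_reservation=${from_reservation} on_demand=${on_demand}"
}

# One cluster node plus three node pool nodes against a reservation of three VMs.
reservation_split 3 4   # prints: from_reservation=3 on_demand=1
```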

Consuming a specific single-project reservation

To consume a specific reservation, set the reservation affinity flag to --reservation-affinity=specific and provide the specific reservation name. In this mode, instances must take capacity from the specified reservation in the zone. The request fails if the reservation does not have sufficient capacity.

To create a reservation and instances to consume a specific reservation, perform the following steps:

  1. Create a specific reservation of three VM instances:

    gcloud compute reservations create RESERVATION_NAME \
        --machine-type=MACHINE_TYPE --vm-count=3 \
        --require-specific-reservation
    

    Replace the following:

    • RESERVATION_NAME: the name of the reservation to create.
    • MACHINE_TYPE: the type of machine (name only) to use for the reservation. For example, n1-standard-2.
  2. Create a node pool with a single node to consume a specific single-project reservation:

    gcloud container node-pools create NODEPOOL_NAME \
        --cluster CLUSTER_NAME \
        --machine-type=MACHINE_TYPE --num-nodes=1 \
        --reservation-affinity=specific --reservation=RESERVATION_NAME
    

    Replace the following:

    • NODEPOOL_NAME: the name of the node pool to create.
    • CLUSTER_NAME: the name of the cluster that you created.
    • MACHINE_TYPE: the type of machine (name only) to use for the node pool. For example, n1-standard-2.
    • RESERVATION_NAME: the name of the reservation to consume.

Consuming a specific shared reservation

To create a specific shared reservation and consume the shared reservation, perform the following steps:

  1. Follow the steps in Allowing and restricting projects from creating and modifying shared reservations.

  2. Create a specific shared reservation:

    gcloud compute reservations create RESERVATION_NAME \
        --machine-type=MACHINE_TYPE --vm-count=3 \
        --zone=ZONE \
        --require-specific-reservation \
        --project=OWNER_PROJECT_ID \
        --share-setting=projects \
        --share-with=CONSUMER_PROJECT_IDS
    

    Replace the following:

    • RESERVATION_NAME: the name of the reservation to create.
    • MACHINE_TYPE: the name of the type of machine to use for the reservation. For example, n1-standard-2.
    • OWNER_PROJECT_ID: the project ID of the project in which you want to create this shared reservation. If you omit the --project flag, the gcloud CLI uses the current project as the owner project by default.
    • CONSUMER_PROJECT_IDS: a comma-separated list of the project IDs of projects that you want to share this reservation with. For example, project-1,project-2. You can include 1 to 100 consumer projects. These projects must be in the same organization as the owner project. Don't include the OWNER_PROJECT_ID, because it can consume this reservation by default.
  3. Consume the shared reservation:

    gcloud container node-pools create NODEPOOL_NAME \
        --cluster CLUSTER_NAME \
        --machine-type=MACHINE_TYPE --num-nodes=1 \
        --reservation-affinity=specific \
        --reservation=projects/OWNER_PROJECT_ID/reservations/RESERVATION_NAME
    

    Replace the following:

    • NODEPOOL_NAME: the name of the node pool to create.
    • CLUSTER_NAME: the name of the cluster that you created.
    • MACHINE_TYPE: the name of the type of machine to use for the node pool. For example, n1-standard-2.
    • OWNER_PROJECT_ID: the project ID where the shared reservation is created.
    • RESERVATION_NAME: the name of the specific shared reservation to consume.
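The --reservation value in step 3 is the full path of the shared reservation rather than its short name. The following sketch assembles that path from its two parts; shared_reservation_path is a hypothetical helper, and the example IDs are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: build the value for the --reservation flag when the
# reservation lives in a different (owner) project.
#   $1: owner project ID
#   $2: reservation name
shared_reservation_path() {
  printf 'projects/%s/reservations/%s\n' "$1" "$2"
}

# Example with made-up names.
shared_reservation_path owner-project my-reservation
```

You could then pass the result to the node pool command, for example as --reservation="$(shared_reservation_path OWNER_PROJECT_ID RESERVATION_NAME)".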

Additional considerations for consuming from a specific reservation

When a node pool is created with specific reservation affinity, including default node pools during cluster creation, its size is limited to the capacity of the specific reservation over the node pool's entire lifetime. This affects the following GKE features:

  • Cluster with multiple zones: In regional or multi-zonal clusters, nodes of a node pool can span multiple zones. Because reservations are single-zonal, multiple reservations are needed. To create a node pool that consumes a specific reservation in these clusters, you must create a specific reservation with exactly the same name and machine properties in each zone of the node pool.
  • Cluster autoscaling and node pool upgrades: If you don't have extra capacity in the specific reservation, node pool upgrades or autoscaling of the node pool might fail because both operations require creating extra instances. To resolve this, increase the size of the reservation or free up some of the resources bound to it.
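For the multi-zone case, the per-zone reservations differ only in the --zone flag. The following sketch prints one reservation command per zone so that you can review the commands before running them. It only prints text and does not call gcloud; print_zonal_reservation_commands and the zone names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print one "reservations create" command per zone, all
# using the same reservation name and machine properties.
#   $1: reservation name   $2: machine type   $3: VM count   $4...: zones
print_zonal_reservation_commands() {
  name="$1"
  machine_type="$2"
  vm_count="$3"
  shift 3
  for zone in "$@"; do
    printf '%s\n' "gcloud compute reservations create ${name} --machine-type=${machine_type} --vm-count=${vm_count} --zone=${zone} --require-specific-reservation"
  done
}

# Example: three zones of a regional cluster (zone names are assumptions).
print_zonal_reservation_commands my-reservation n1-standard-2 3 \
  us-central1-a us-central1-b us-central1-c
```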

Creating nodes without consuming reservations

To explicitly avoid consuming resources from any reservations, set the affinity to --reservation-affinity=none.

  1. Create a cluster that won't consume any reservation:

    gcloud container clusters create CLUSTER_NAME --reservation-affinity=none
    

    Replace CLUSTER_NAME with the name of the cluster to create.

  2. Create a node pool that won't consume any reservation:

    gcloud container node-pools create NODEPOOL_NAME \
        --cluster CLUSTER_NAME \
        --reservation-affinity=none
    

    Replace the following:

    • NODEPOOL_NAME: the name of the node pool to create.
    • CLUSTER_NAME: the name of the cluster you created earlier.

Following available reservations between zones

When you use node pools that run in multiple zones and the reservations are not equal between zones, you can use the --location-policy=ANY flag. This flag ensures that when new nodes are added to the cluster, they are created in a zone that still has unused reservations.
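As a rough illustration, the following sketch prints a node pool command that combines reservation consumption with the ANY location policy. The gcloud CLI spells the flag --location-policy, and the policy applies when the node pool scales up, so the sketch assumes that autoscaling is enabled. The helper name and the node limits are assumptions, and the sketch only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: print a node pool command that uses the ANY location policy.
#   $1: node pool name   $2: cluster name   $3: machine type
#   $4: minimum nodes per zone   $5: maximum nodes per zone
print_any_location_node_pool_command() {
  printf '%s\n' "gcloud container node-pools create $1 --cluster $2 --machine-type=$3 --reservation-affinity=any --enable-autoscaling --min-nodes=$4 --max-nodes=$5 --location-policy=ANY"
}

# Example with made-up names and limits.
print_any_location_node_pool_command my-pool my-cluster n1-standard-2 0 3
```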

TPU reservation

TPU reservations differ from other machine types. The following are TPU-specific aspects you should consider when creating TPU reservations:

  • When using TPUs in GKE, SPECIFIC is the only supported value for the --reservation-affinity flag of gcloud container node-pools create.
  • TPU reservations cannot be shared across projects.

For more information, see TPU reservations.

Cleaning up

To avoid incurring charges to your Cloud Billing account for the resources used on this page:

  1. Delete the clusters you created by running the following command for each of the clusters:

    gcloud container clusters delete CLUSTER_NAME
    
  2. Delete the reservations you created by running the following command for each of the reservations:

    gcloud compute reservations delete RESERVATION_NAME
    

What's next