Run fault-tolerant workloads at lower costs in Spot Pods


This page shows you how to run fault-tolerant workloads at lower costs by using Spot Pods in your Google Kubernetes Engine (GKE) Autopilot clusters.

Overview

In GKE Autopilot clusters, Spot Pods are Pods that run on nodes backed by Compute Engine Spot VMs. Spot Pods are priced lower than standard Autopilot Pods, but can be evicted by GKE whenever compute resources are required to run standard Pods.

Spot Pods are ideal for running stateless, batch, or fault-tolerant workloads at lower costs compared to running those workloads as standard Pods. To use Spot Pods in Autopilot clusters, modify the manifest with your Pod specification to request Spot Pods.

You can run Spot Pods on the default general-purpose Autopilot compute class as well as on specialized compute classes that meet specific hardware requirements. For information about these compute classes, refer to Compute classes in Autopilot.

To learn more about the pricing for Spot Pods in Autopilot clusters, see Google Kubernetes Engine pricing.

Spot Pods are excluded from the Autopilot Service Level Agreement.

Benefits

Using Spot Pods in your Autopilot clusters provides you with the following benefits:

  • Lower pricing than running the same workloads on standard Autopilot Pods.
  • GKE automatically manages autoscaling and scheduling.
  • GKE automatically taints nodes that run Spot Pods to ensure that standard Pods, like your critical workloads, aren't scheduled on those nodes. Your deployments that do use Spot Pods are automatically updated with a corresponding toleration.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update, as shown in the example after this list.
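
For example:

# Initialize the gcloud CLI (first-time setup).
gcloud init

# Get the latest version of an existing installation.
gcloud components update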

Request Spot Pods in your Autopilot workloads

To request that your Pods run as Spot Pods, use the cloud.google.com/gke-spot=true label in a nodeSelector or node affinity in your Pod specification. GKE automatically provisions nodes that can run Spot Pods.

Spot Pods can be evicted and terminated at any time, for example when the compute resources are required elsewhere in Google Cloud. When a termination occurs, Spot Pods on the terminating node can request a grace period of up to 15 seconds before termination by specifying the terminationGracePeriodSeconds field. This grace period is granted on a best-effort basis.

The maximum grace period given to Spot Pods during preemption is 15 seconds. Requesting more than 15 seconds in terminationGracePeriodSeconds doesn't grant a longer grace period during preemption. On eviction, your Pod is sent the SIGTERM signal and should take steps to shut down during the grace period.
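
For example, a container entrypoint can trap SIGTERM to checkpoint work before exiting. The following shell script is an illustrative sketch, not part of the GKE API; the do_work function is a hypothetical placeholder for your workload's main loop:

#!/bin/sh
# Illustrative entrypoint for a fault-tolerant workload.

# Hypothetical unit of work; replace with your workload's logic.
do_work() {
  sleep 1
}

# Handle the SIGTERM that GKE sends when the Spot node is preempted.
cleanup() {
  echo "SIGTERM received; checkpointing and shutting down..."
  # Flush buffers, checkpoint progress, and close connections here.
  exit 0
}
trap cleanup TERM

while true; do
  do_work
done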

For Autopilot, GKE also automatically taints the nodes created to run Spot Pods and modifies those workloads with the corresponding toleration. The taint prevents standard Pods from being scheduled on nodes that run Spot Pods.
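
You don't need to add the taint or toleration yourself. For illustration, the toleration that GKE adds to Spot workloads looks similar to the following:

tolerations:
- key: cloud.google.com/gke-spot
  operator: Equal
  value: "true"
  effect: NoSchedule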

Use a nodeSelector to require Spot Pods

You can use a nodeSelector to require Spot Pods in a workload. Add the cloud.google.com/gke-spot=true label in a nodeSelector in your workload manifest, such as in the following example Job:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    metadata:
      labels:
        app: pi
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      terminationGracePeriodSeconds: 15
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
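
After you save the manifest, you can apply it to your cluster with kubectl; the filename here is illustrative:

kubectl apply -f spot-job.yaml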

Use node affinity to request Spot Pods

Alternatively, you can use node affinity to request Spot Pods. Node affinity provides you with a more extensible way to select nodes to run your workloads. For example, you can combine several selection criteria to get finer control over where your Pods run. When you use node affinity to request Spot Pods, you can specify the type of node affinity to use, as follows:

  • requiredDuringSchedulingIgnoredDuringExecution: Must use Spot Pods.
  • preferredDuringSchedulingIgnoredDuringExecution: Use Spot Pods on a best-effort basis.

To use node affinity to require Spot Pods, add the following nodeAffinity rule to your workload manifest, such as in the following example Job:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    metadata:
      labels:
        app: pi
    spec:
      terminationGracePeriodSeconds: 15
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/gke-spot
                operator: In
                values:
                - "true"
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

Request Spot Pods on a best-effort basis

To use node affinity to request Spot Pods on a best-effort basis, use preferredDuringSchedulingIgnoredDuringExecution. When you request Spot Pods on a preferred basis, GKE schedules your Pods based on the following order:

  1. Existing nodes that can run Spot Pods and that have available allocatable capacity.
  2. Existing standard nodes that have available allocatable capacity.
  3. New nodes that can run Spot Pods, if the compute resources are available.
  4. New standard nodes.

Because GKE prefers existing standard nodes that have allocatable capacity over creating new nodes for Spot Pods, you might notice more Pods running as standard Pods than as Spot Pods, which can prevent you from taking full advantage of the lower pricing of Spot Pods.
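
To request Spot Pods on a best-effort basis, you can replace the required rule shown earlier with a preferred rule similar to the following sketch; the weight value is illustrative:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: cloud.google.com/gke-spot
          operator: In
          values:
          - "true"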

Requests for preemptible Pods

Autopilot clusters support requests for preemptible Pods using the cloud.google.com/gke-preemptible selector. Pods that use this selector are automatically migrated to Spot Pods, and the selector is changed to cloud.google.com/gke-spot.
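
For example, a Pod specification that still includes the following nodeSelector is treated as if it requested cloud.google.com/gke-spot instead:

nodeSelector:
  cloud.google.com/gke-preemptible: "true"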

Find and delete terminated Pods

During graceful Pod termination, the kubelet assigns a Failed status and a Shutdown reason to the terminated Pods. When the number of terminated Pods reaches a threshold of 1,000, garbage collection cleans up the Pods. You can also delete terminated Pods manually by using the following command:

kubectl get pods --all-namespaces | grep -i shutdown | awk '{print $1, $2}' | xargs -n2 kubectl delete pod -n

What's next