This page shows you how to run fault-tolerant, stateless, or batch workloads at lower costs by using Spot VMs in your Google Kubernetes Engine (GKE) clusters and node pools.
Overview
Spot VMs are Compute Engine virtual machines (VMs) that are priced lower than the default standard VMs and provide no guarantee of availability. Spot VMs offer the same machine types and options as standard Compute Engine VMs. Compute Engine can reclaim Spot VMs at any time due to system events, such as when the resources are needed for standard VMs.
To learn more about Spot VMs in GKE, see Spot VMs.
Spot VMs replace the need to use preemptible VMs to run stateless, batch, or fault-tolerant workloads. In contrast to preemptible VMs, which expire after 24 hours, Spot VMs have no expiration time. Spot VMs are terminated when Compute Engine requires the resources to run standard VMs.
Spot VMs are also supported on GKE Autopilot clusters through Spot Pods. With Spot Pods, Autopilot automatically schedules and manages workloads on Spot VMs.
Limitations
- The kubelet graceful node shutdown feature is only enabled on clusters running GKE version 1.20 and later. For GKE versions prior to 1.20, you can use the Kubernetes on GCP Node Termination Event Handler to gracefully terminate your Pods when Spot VMs are preempted.
- Spot VMs do not support Windows Server node pools.
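Graceful termination only helps if the workload reacts to the termination signal. The following is a minimal sketch, not taken from this page, of a container entrypoint written in shell that stops picking up new work after it receives SIGTERM, the signal the kubelet sends to containers when it terminates a Pod. The names on_term, shutting_down, and process_items are hypothetical.

```shell
#!/bin/sh
# Hypothetical entrypoint sketch: stop taking new work once SIGTERM arrives.
shutting_down=0

# Signal handler: record that shutdown was requested.
on_term() {
  shutting_down=1
  echo "SIGTERM received, finishing up" >&2
}
trap on_term TERM

# Treat each argument as one unit of work and check the flag between items.
process_items() {
  for item in "$@"; do
    if [ "$shutting_down" -eq 1 ]; then
      echo "skipping $item: shutdown in progress"
      continue
    fi
    echo "processing $item"
  done
}

process_items job-1 job-2
```

A real workload would replace the echo calls with its own logic and would need to finish within the Pod's termination grace period.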
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Create a cluster with Spot VMs
You can create a new cluster using Spot VMs with the Google Cloud CLI or the Google Cloud console.
gcloud
Create a new cluster which uses Spot VMs in the default node pool instead of standard VMs:
gcloud container clusters create CLUSTER_NAME \
--spot
Replace CLUSTER_NAME with the name of your new cluster.
Console
To create a new cluster with a node pool using Spot VMs, perform the following steps:
Go to the Google Kubernetes Engine page in the Google Cloud console.
Click Create.
On the Create cluster dialog, next to GKE Standard, click Configure.
From the navigation menu, in the Node pools section, click the name of the node pool you want to configure, and then click Nodes.
Select the Enable Spot VMs checkbox.
Configure the cluster as needed, and then click Create.
Create a node pool with Spot VMs
You can create new node pools using Spot VMs with the gcloud CLI or Google Cloud console. You can only enable Spot VMs on new node pools. You cannot enable or disable Spot VMs on existing node pools.
gcloud
Create a new node pool using Spot VMs:
gcloud container node-pools create POOL_NAME \
--cluster=CLUSTER_NAME \
--spot
Replace POOL_NAME with the name of your new node pool and CLUSTER_NAME with the name of the existing cluster.
Console
To create a new node pool using Spot VMs, perform the following steps:
Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
Click Add node pool.
From the navigation menu, click Nodes.
Select the Enable Spot VMs checkbox.
Configure the node pool as needed, and then click Create.
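After you create the node pool with either method, you might want to confirm that it uses Spot VMs. The following is a minimal sketch of one way to check. It assumes that the gcloud CLI is installed and authenticated, that a default project and location are configured, and that the node pool's configuration exposes a spot field under config, as in the GKE API's node configuration. The helper name node_pool_spot_setting is hypothetical.

```shell
# Hypothetical helper: print the "spot" setting of a node pool.
# Assumes gcloud is installed, authenticated, and has a default
# project and location configured.
node_pool_spot_setting() {
  pool_name="$1"
  cluster_name="$2"
  gcloud container node-pools describe "$pool_name" \
    --cluster="$cluster_name" \
    --format="value(config.spot)"
}

# Usage (replace the placeholders with your own names):
#   node_pool_spot_setting POOL_NAME CLUSTER_NAME
```

A true value indicates that the node pool uses Spot VMs. An empty result suggests that the field is unset, as it would be for a node pool of standard VMs.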
Schedule workloads on Spot VMs
GKE adds the cloud.google.com/gke-spot=true and cloud.google.com/gke-provisioning=spot (for nodes running GKE version 1.25.5-gke.2500 or later) labels to nodes that use Spot VMs. You can filter for these labels by using either the nodeSelector field in your Pod spec or node affinity.
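The walkthrough in this section uses nodeSelector. For comparison, the following sketch writes a Pod manifest that expresses the same constraint with node affinity. The file name spot-affinity-pod.yaml, the Pod name, and the nginx image are illustrative assumptions, not values from this page.

```shell
# Write a hypothetical Pod manifest that requires a node carrying the
# cloud.google.com/gke-spot=true label, using node affinity.
cat > spot-affinity-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: spot-affinity-example
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: cloud.google.com/gke-spot
            operator: In
            values:
            - "true"
  containers:
  - name: app
    image: nginx
EOF

# Apply it to a cluster that has a Spot VM node pool:
#   kubectl apply -f spot-affinity-pod.yaml
```

Node affinity is more verbose than nodeSelector, but it also supports operators other than equality and a preferred (soft) form of the rule.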
In the following example, you create a cluster with two node pools, one of which uses Spot VMs. Then, you deploy a batch Job onto the Spot VMs, using a nodeSelector to control where GKE places the Pods.
Create a new cluster with the default node pool using standard VMs:
gcloud container clusters create CLUSTER_NAME
Replace CLUSTER_NAME with the name of your new cluster.
Get credentials for the cluster:
gcloud container clusters get-credentials CLUSTER_NAME
Create a node pool using Spot VMs:
gcloud container node-pools create POOL_NAME \
--cluster=CLUSTER_NAME \
--num-nodes=1 \
--spot
Replace POOL_NAME with the name of your new node pool.
Save the following manifest as a file named pi-app.yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    metadata:
      labels:
        app: pi
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      terminationGracePeriodSeconds: 25
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
In this manifest, the nodeSelector field tells GKE to only schedule Pods on nodes that use Spot VMs.
Apply the manifest to your cluster:
kubectl apply -f pi-app.yaml
Describe the Pod:
kubectl describe pod pi
The output is similar to the following:
Name:         pi-kjbr9
Namespace:    default
Priority:     0
Node:         gke-cluster-2-spot-pool-fb434072-44ct
...
Labels:       app=pi
              job-name=pi
Status:       Succeeded
...
Controlled By:  Job/pi
Containers:
...
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
...
Node-Selectors:  cloud.google.com/gke-spot=true
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  4m3s   default-scheduler  Successfully assigned default/pi-kjbr9 to gke-cluster-2-spot-pool-fb434072-44ct
  Normal  Pulling    4m2s   kubelet            Pulling image "perl:5.34.0"
  Normal  Pulled     3m43s  kubelet            Successfully pulled image "perl:5.34.0" in 18.481761978s
  Normal  Created    3m43s  kubelet            Created container pi
  Normal  Started    3m43s  kubelet            Started container pi
The Node field shows that GKE scheduled the Pod on a node that uses Spot VMs.
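To cross-check the placement, you can list the nodes that carry the Spot label and compare them with the node reported for each Pod. The following sketch assumes that kubectl is installed and already points at the cluster. The helper names list_spot_nodes and show_pod_nodes are hypothetical.

```shell
# Hypothetical helpers; both assume kubectl is configured for the cluster.

# List only the nodes that GKE labeled as Spot VMs.
list_spot_nodes() {
  kubectl get nodes -l cloud.google.com/gke-spot=true
}

# Show the Pods that match a label selector, including the node each runs on.
show_pod_nodes() {
  selector="$1"
  kubectl get pods -l "$selector" -o wide
}

# Usage:
#   list_spot_nodes
#   show_pod_nodes app=pi
```

Each node name in the NODE column of the second command should also appear in the output of the first.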
Use taints and tolerations for Spot VMs
As a best practice, create clusters with at least one node pool without Spot VMs where you can place system workloads like DNS. You can use node taints and the corresponding tolerations to tell GKE to avoid placing certain workloads on Spot VMs.
To create a node pool with nodes that use Spot VMs and have node taints, use the --node-taints flag when creating the node pool:
gcloud container node-pools create POOL_NAME \
--cluster=CLUSTER_NAME \
--node-taints=cloud.google.com/gke-spot="true":NoSchedule \
--spot
To add the corresponding toleration to the Pods that you want to schedule to Spot VMs, modify your deployments and add the following to your Pod specification:
tolerations:
- key: cloud.google.com/gke-spot
  operator: Equal
  value: "true"
  effect: NoSchedule
Only Pods with this toleration can be scheduled onto the Spot VM nodes that have the node taint.
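A toleration permits scheduling onto tainted nodes but does not require it, so a Pod that must run on Spot VMs typically also needs a nodeSelector or node affinity. The following sketch writes a Deployment manifest that combines both. The file name spot-deployment.yaml, the Deployment name spot-web, and the nginx image are illustrative assumptions.

```shell
# Write a hypothetical Deployment that tolerates the Spot taint and also
# selects Spot nodes, so its Pods land only on the tainted Spot VM nodes.
cat > spot-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spot-web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spot-web
  template:
    metadata:
      labels:
        app: spot-web
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      tolerations:
      - key: cloud.google.com/gke-spot
        operator: Equal
        value: "true"
        effect: NoSchedule
      containers:
      - name: web
        image: nginx
EOF

# Apply it to a cluster that has a tainted Spot VM node pool:
#   kubectl apply -f spot-deployment.yaml
```

Workloads without the toleration, such as system Pods, stay on the node pools that use standard VMs.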
What's next
- Learn how to run a GKE application on Spot VMs with on-demand nodes as fallback.
- Learn more about Spot VMs in GKE.
- Take a tutorial about deploying a batch workload using Spot VMs in GKE.