This page shows you how to run fault-tolerant workloads at lower costs by using Spot Pods in your Google Kubernetes Engine (GKE) Autopilot clusters.
Overview
In GKE Autopilot clusters, Spot Pods are Pods that run on nodes backed by Compute Engine Spot VMs. Spot Pods are priced lower than standard Autopilot Pods, but can be evicted by GKE whenever compute resources are required to run standard Pods.
Spot Pods are ideal for running stateless, batch, or fault-tolerant workloads at lower costs compared to running those workloads as standard Pods. To use Spot Pods in Autopilot clusters, modify the manifest with your Pod specification to request Spot Pods.
You can run Spot Pods on the default general-purpose Autopilot compute class as well as on specialized compute classes that meet specific hardware requirements. For information about these compute classes, refer to Compute classes in Autopilot.
To learn more about the pricing for Spot Pods in Autopilot clusters, see Google Kubernetes Engine pricing.
Spot Pods are excluded from the Autopilot Service Level Agreement.
Benefits
Using Spot Pods in your Autopilot clusters provides you with the following benefits:
- Lower pricing than running the same workloads on standard Autopilot Pods.
- GKE automatically manages autoscaling and scheduling.
- GKE automatically taints nodes that run Spot Pods to ensure that standard Pods, like your critical workloads, aren't scheduled on those nodes. Your deployments that do use Spot Pods are automatically updated with a corresponding toleration.
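The toleration that GKE adds for Spot Pods matches the taint that it places on Spot nodes. Based on the tolerations shown in the manifests later on this page, it looks like the following:

tolerations:
- effect: NoSchedule
  key: cloud.google.com/gke-spot
  operator: Equal
  value: "true"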
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Request Spot Pods in your Autopilot workloads
To request that your Pods run as Spot Pods, use the cloud.google.com/gke-spot=true label in a nodeSelector or node affinity in your Pod specification. GKE automatically provisions nodes that can run Spot Pods.
Spot Pods can be evicted and terminated at any time, for example if the compute resources are required elsewhere in Google Cloud. When a termination occurs, Spot Pods on the terminating node can request a grace period of up to 15 seconds before termination, which is granted on a best-effort basis, by specifying the terminationGracePeriodSeconds field.
The maximum grace period given to Spot Pods during preemption is 15 seconds. Requesting more than 15 seconds in terminationGracePeriodSeconds doesn't grant more than 15 seconds during preemption. On eviction, your Pod is sent the SIGTERM signal and should take steps to shut down during the grace period.
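For example, the following minimal Pod sketch (the name and workload are hypothetical) traps SIGTERM so that cleanup runs inside the 15-second window:

apiVersion: v1
kind: Pod
metadata:
  name: graceful-spot-worker  # hypothetical name
spec:
  nodeSelector:
    cloud.google.com/gke-spot: "true"
  terminationGracePeriodSeconds: 15  # maximum honored during Spot preemption
  containers:
  - name: worker
    image: busybox
    # Trap SIGTERM and run cleanup before exiting; the handler must finish
    # within the grace period, or the container is killed with SIGKILL.
    command: ["sh", "-c", "trap 'echo cleaning up; exit 0' TERM; while true; do sleep 1; done"]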
For Autopilot, GKE also automatically taints the nodes created to run Spot Pods and modifies those workloads with the corresponding toleration. The taint prevents standard Pods from being scheduled on nodes that run Spot Pods.
Use a nodeSelector to require Spot Pods
You can use a nodeSelector to require Spot Pods in a workload. Add the cloud.google.com/gke-spot=true label to a nodeSelector in your manifest, as in the following example Job:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    metadata:
      labels:
        app: pi
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      terminationGracePeriodSeconds: 15
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
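After you apply the manifest (the file name in the following commands is hypothetical), you can confirm that GKE provisioned Spot capacity by listing the nodes that carry the same label that the nodeSelector requests:

kubectl apply -f pi-job.yaml
kubectl get nodes -l cloud.google.com/gke-spot=true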
Use node affinity to request Spot Pods
Alternatively, you can use node affinity to request Spot Pods. Node affinity provides you with a more extensible way to select nodes to run your workloads. For example, you can combine several selection criteria to get finer control over where your Pods run. When you use node affinity to request Spot Pods, you can specify the type of node affinity to use, as follows:
- requiredDuringSchedulingIgnoredDuringExecution: Must use Spot Pods.
- preferredDuringSchedulingIgnoredDuringExecution: Use Spot Pods on a best-effort basis.
To use node affinity to require Spot Pods, add a nodeAffinity rule to your workload manifest, as in the following example Job:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    metadata:
      labels:
        app: pi
    spec:
      terminationGracePeriodSeconds: 15
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/gke-spot
                operator: In
                values:
                - "true"
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
Request Spot Pods on a best-effort basis
To use node affinity to request Spot Pods on a best-effort basis, use preferredDuringSchedulingIgnoredDuringExecution.
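For example, the following sketch (not shown elsewhere on this page) rewrites the required rule from the previous manifest as a preferred one. The weight field, from 1 to 100, ranks this preference against any other preferred rules:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      preference:
        matchExpressions:
        - key: cloud.google.com/gke-spot
          operator: In
          values:
          - "true"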
When you request Spot Pods on a preferred basis, GKE
schedules your Pods based on the following order:
- Existing nodes that can run Spot Pods that have available allocatable capacity.
- Existing standard nodes that have available allocatable capacity.
- New nodes that can run Spot Pods, if the compute resources are available.
- New standard nodes.
Because GKE prefers existing standard nodes that have allocatable capacity over creating new nodes for Spot Pods, you might notice more Pods running as standard Pods than as Spot Pods, which prevents you from taking full advantage of the lower pricing of Spot Pods.
Requests for preemptible Pods
Autopilot clusters support requests for preemptible Pods using the cloud.google.com/gke-preemptible selector. Pods that use this selector are automatically migrated to Spot Pods, and the selector is changed to cloud.google.com/gke-spot.
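For example, a manifest that still uses the legacy selector, as in the following sketch, is admitted as if it requested Spot Pods:

# Legacy selector; GKE rewrites it to cloud.google.com/gke-spot at admission
nodeSelector:
  cloud.google.com/gke-preemptible: "true"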
Find and delete terminated Pods
During graceful Pod termination, the kubelet assigns a Failed status and a Shutdown reason to the terminated Pods. When the number of terminated Pods reaches a threshold of 1000, garbage collection cleans up the Pods. You can also delete shutdown Pods manually using the following command:
kubectl get pods --all-namespaces | grep -i shutdown | awk '{print $1, $2}' | xargs -n2 kubectl delete pod -n
Stop workloads from using Spot Pods
If you have existing Spot Pods that you want to update to run as standard Pods, you can use one of the following methods:
- Recreate the workload: Delete the Deployment, remove the lines in the manifest that select Spot Pods, and then apply the updated Deployment manifest to the cluster.
- Edit the workload: Edit the Deployment specification while the Pods are running in the cluster.
With both of these methods, you might experience minor workload disruptions.
Recreate the workload
The following steps show you how to delete the existing Deployment and apply an updated manifest to the cluster. You can also use these steps for other types of Kubernetes workloads, like Jobs.
To ensure that GKE places the updated Pods on the correct type of node, you must export the existing state of the workload from the Kubernetes API server to a file and edit that file.
1. Write the workload specification to a YAML file:

kubectl get deployment DEPLOYMENT_NAME -o yaml > DEPLOYMENT_NAME-on-demand.yaml

Replace DEPLOYMENT_NAME with the name of your Deployment. For other types of workloads, like Jobs or Pods, use the corresponding resource name in your kubectl get command, like kubectl get pod.

2. Open the YAML file in a text editor:

vi DEPLOYMENT_NAME-on-demand.yaml

3. Remove the nodeSelector for Spot Pods and the toleration that GKE added for Spot Pods from the file:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    # lines omitted for clarity
spec:
  progressDeadlineSeconds: 600
  replicas: 6
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      pod: nginx-pod
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      # lines omitted for clarity
    spec:
      containers:
      - image: nginx
        imagePullPolicy: Always
        name: web-server
        resources:
          limits:
            ephemeral-storage: 1Gi
          requests:
            cpu: 500m
            ephemeral-storage: 1Gi
            memory: 2Gi
        securityContext:
          capabilities:
            drop:
            - NET_RAW
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      terminationGracePeriodSeconds: 15
      tolerations:
      - effect: NoSchedule
        key: kubernetes.io/arch
        operator: Equal
        value: amd64
      - effect: NoSchedule
        key: cloud.google.com/gke-spot
        operator: Equal
        value: "true"
status:
  # lines omitted for clarity
You must remove both the toleration and the nodeSelector to indicate to GKE that the Pods must run on on-demand nodes instead of on Spot nodes.
4. Save the updated manifest.
5. Delete and re-apply the Deployment manifest to the cluster:
kubectl replace -f DEPLOYMENT_NAME-on-demand.yaml
The duration of this operation depends on the number of Pods that GKE needs to terminate and clean up.
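Optionally, you can watch the replacement Pods become available with the standard rollout status command:

kubectl rollout status deployment/DEPLOYMENT_NAME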
Edit the workload in place
The following steps show you how to edit a running Deployment in place to indicate to GKE that the Pods must run on on-demand nodes. You can also use these steps for other types of Kubernetes workloads, like Jobs.
You must edit the workload object in the Kubernetes API because GKE automatically adds a toleration for Spot Pods to the workload specification during workload admission.
1. Open your workload manifest for editing in a text editor:

kubectl edit deployment/DEPLOYMENT_NAME

Replace DEPLOYMENT_NAME with the name of the Deployment. For other types of workloads, like Jobs or Pods, use the corresponding resource name in your kubectl edit command, like kubectl edit pod/POD_NAME.

2. In your text editor, delete the node selector or node affinity rule for Spot Pods and the toleration that GKE added to the manifest, like in the following example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      type: dev
  template:
    metadata:
      labels:
        type: dev
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      tolerations:
      - effect: NoSchedule
        key: cloud.google.com/gke-spot
        operator: Equal
        value: "true"
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
3. Save the updated manifest and close the text editor. The updated object configuration indicates to GKE that the Pods must run on on-demand nodes. GKE recreates the Pods to place them on new on-demand nodes.
Verify that workloads run on on-demand nodes
To verify that your updated workloads no longer run on Spot Pods, inspect the workload and look for the toleration for Spot Pods:

kubectl describe deployment DEPLOYMENT_NAME

The output doesn't display an entry for cloud.google.com/gke-spot in the Pod template's tolerations (the spec.template.spec.tolerations field).
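As an optional machine-readable check (one way among several), print the Pod template's tolerations directly and confirm that the Spot entry is gone:

kubectl get deployment DEPLOYMENT_NAME -o jsonpath='{.spec.template.spec.tolerations}'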
What's next
- Learn more about Autopilot cluster architecture.
- Learn about the lifecycle of Pods.
- Read about Spot VMs in GKE Standard clusters.