This page explains how to configure vertical Pod autoscaling in a Google Kubernetes Engine cluster. Vertical Pod autoscaling involves adjusting a Pod's CPU and memory requests.
Overview
You can use the VerticalPodAutoscaler custom resource to analyze and adjust your containers' CPU requests and memory requests. You can configure a VerticalPodAutoscaler to make recommendations for CPU and memory requests, or you can configure it to make automatic changes to your CPU and memory requests.
Before you begin
To prepare for this task, perform the following steps:
- Ensure that you have enabled the Google Kubernetes Engine API.
- Ensure that you have installed the Cloud SDK.
- Set your default project ID:
gcloud config set project [PROJECT_ID]
- If you are working with zonal clusters, set your default compute zone:
gcloud config set compute/zone [COMPUTE_ZONE]
- If you are working with regional clusters, set your default compute region:
gcloud config set compute/region [COMPUTE_REGION]
- Update gcloud to the latest version:
gcloud components update
Note on API versions
This guide assumes that the v1 version of the Vertical Pod Autoscaler API is installed in your Google Kubernetes Engine cluster. The v1 API is available in GKE versions 1.14.7-gke.10 and higher and in 1.15.4-gke.15 and higher.
We strongly recommend using this API. For instructions on migrating from older API versions, see the migration guide.
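To check which version your cluster's control plane is running, you can describe the cluster. As a sketch, this reads the currentMasterVersion field exposed by the GKE API:
gcloud container clusters describe [CLUSTER_NAME] --format="value(currentMasterVersion)"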
Enabling vertical Pod autoscaling for a cluster
To create a new cluster with vertical Pod autoscaling enabled, enter this command:
gcloud container clusters create [CLUSTER_NAME] --enable-vertical-pod-autoscaling --cluster-version=1.14.7
where [CLUSTER_NAME] is a name that you choose for your cluster.
To enable vertical Pod autoscaling for an existing cluster, enter this command:
gcloud container clusters update [CLUSTER_NAME] --enable-vertical-pod-autoscaling
where [CLUSTER_NAME] is the name of the cluster.
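To verify that vertical Pod autoscaling is enabled, you can describe the cluster again. This sketch assumes the verticalPodAutoscaling.enabled field that the GKE API exposes for this setting:
gcloud container clusters describe [CLUSTER_NAME] --format="value(verticalPodAutoscaling.enabled)"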
Getting resource recommendations
In this exercise, you create a VerticalPodAutoscaler that has an updateMode of "Off". Then you create a Deployment that has two Pods, each of which has one container. When the Pods are created, the VerticalPodAutoscaler analyzes the CPU and memory needs of the containers and records those recommendations in its status field. The VerticalPodAutoscaler does not take any action to update the resource requests for the running containers.
Here is a manifest for the VerticalPodAutoscaler:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-rec-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-rec-deployment
  updatePolicy:
    updateMode: "Off"
Save the manifest in a file named my-rec-vpa.yaml, and create the VerticalPodAutoscaler:
kubectl create -f my-rec-vpa.yaml
Here is a manifest for the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-rec-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-rec-deployment
  template:
    metadata:
      labels:
        app: my-rec-deployment
    spec:
      containers:
      - name: my-rec-container
        image: nginx
In the manifest, you can see that there are no CPU or memory requests. You can also see that the Pods in the Deployment belong to the VerticalPodAutoscaler, because its targetRef points to a target of kind: Deployment and name: my-rec-deployment.
Copy the manifest to a file named my-rec-deployment.yaml, and create the Deployment:
kubectl create -f my-rec-deployment.yaml
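Optionally, you can confirm that the Deployment's two Pods are running before looking for recommendations. The label selector here matches the app: my-rec-deployment label from the manifest:
kubectl get pods -l app=my-rec-deployment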
Wait a minute, and then view the VerticalPodAutoscaler:
kubectl get vpa my-rec-vpa --output yaml
The output shows recommendations for CPU and memory requests:
...
  recommendation:
    containerRecommendations:
    - containerName: my-rec-container
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 7931m
        memory: 8291500k
...
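If you want only the target values rather than the full object, a JSONPath query is one option. This sketch assumes the status layout shown above:
kubectl get vpa my-rec-vpa --output jsonpath='{.status.recommendation.containerRecommendations[0].target}'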
Now that you have the recommended CPU and memory requests, you might choose to delete your Deployment, add CPU and memory requests to your Deployment manifest, and start your Deployment again.
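For example, based on the target recommendation above, the container spec in my-rec-deployment.yaml could be updated to something like the following. The values are illustrative; substitute the recommendations from your own cluster:
      containers:
      - name: my-rec-container
        image: nginx
        resources:
          requests:
            cpu: 25m
            memory: 262144k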
Updating resource requests automatically
In this exercise, you create a Deployment that has two Pods. Each Pod has one container that requests 100 milliCPU and 50 mebibytes of memory. Then you create a VerticalPodAutoscaler that automatically adjusts the CPU and memory requests.
Here is a manifest for the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-auto-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-auto-deployment
  template:
    metadata:
      labels:
        app: my-auto-deployment
    spec:
      containers:
      - name: my-container
        image: k8s.gcr.io/ubuntu-slim:0.1
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        command: ["/bin/sh"]
        args: ["-c", "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"]
Copy the manifest to a file named my-auto-deployment.yaml, and create the Deployment:
kubectl create -f my-auto-deployment.yaml
List the running Pods:
kubectl get pods
The output shows the names of the Pods in my-auto-deployment:
NAME                                 READY   STATUS    RESTARTS   AGE
my-auto-deployment-cbcdd49fb-d6bf9   1/1     Running   0          8s
my-auto-deployment-cbcdd49fb-th288   1/1     Running   0          8s
Make a note of your Pod names for later.
The CPU and memory requests for the Deployment are very small, so it's likely that the Deployment would benefit from an increase in resources.
Here is a manifest for a VerticalPodAutoscaler:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-auto-deployment
  updatePolicy:
    updateMode: "Auto"
In the manifest, the targetRef field says that any Pod controlled by the Deployment named my-auto-deployment belongs to this VerticalPodAutoscaler.
The updateMode field has a value of Auto, which means that the VerticalPodAutoscaler can update CPU and memory requests during the life of a Pod. That is, the VerticalPodAutoscaler can delete a Pod, adjust the CPU and memory requests, and then start a new Pod.
Copy the manifest to a file named my-vpa.yaml, and create the VerticalPodAutoscaler:
kubectl create -f my-vpa.yaml
Wait a few minutes, and view the running Pods again:
kubectl get pods
Notice that the Pod names have changed. If the Pod names have not yet changed, wait a bit longer, and list the running Pods again.
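If you prefer to observe the replacement as it happens, you can watch the Pods instead of polling; evicted Pods disappear from the list and replacements with new names appear:
kubectl get pods --watch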
Get detailed information about one of your running Pods:
kubectl get pod [POD_NAME] --output yaml
where [POD_NAME] is the name of one of your Pods.
In the output, you can see that the VerticalPodAutoscaler has increased the memory and CPU requests. You can also see an annotation that documents the update:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    vpaUpdates: 'Pod resources updated by my-vpa: container 0: cpu capped to node capacity, memory capped to node capacity, cpu request, memory request'
...
spec:
  containers:
  ...
    resources:
      requests:
        cpu: 510m
        memory: 262144k
...
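To pull out just the updated requests without scanning the full manifest, you can use a JSONPath query. This sketch assumes a single container per Pod, as in this exercise:
kubectl get pod [POD_NAME] --output jsonpath='{.spec.containers[0].resources.requests}'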
Get detailed information about the VerticalPodAutoscaler:
kubectl get vpa my-vpa --output yaml
The output shows three sets of recommendations for CPU and memory requests: lower bound, target, and upper bound:
...
  recommendation:
    containerRecommendations:
    - containerName: my-container
      lowerBound:
        cpu: 536m
        memory: 262144k
      target:
        cpu: 587m
        memory: 262144k
      upperBound:
        cpu: 27854m
        memory: "545693548"
...
The target recommendation says that the container will run optimally if it requests 587 milliCPU and 262144 kilobytes of memory.
The VerticalPodAutoscaler uses the lowerBound and upperBound recommendations to decide whether to delete a Pod and replace it with a new Pod. If a Pod has requests less than the lower bound or greater than the upper bound, the VerticalPodAutoscaler deletes the Pod and replaces it with a Pod that has the target recommendation.
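To check how close a Pod's current requests are to these thresholds, you can extract the bounds and compare them with the requests shown by the earlier kubectl get pod command. This sketch assumes the status layout shown above:
kubectl get vpa my-vpa --output jsonpath='{.status.recommendation.containerRecommendations[0].lowerBound}'
kubectl get vpa my-vpa --output jsonpath='{.status.recommendation.containerRecommendations[0].upperBound}'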