This page explains how to configure vertical Pod autoscaling in a Google Kubernetes Engine cluster. Vertical Pod autoscaling involves adjusting a Pod's CPU and memory requests.
Overview
You can use the VerticalPodAutoscaler custom resource to analyze and adjust your containers' CPU requests and memory requests. You can configure a VerticalPodAutoscaler to make recommendations for CPU and memory requests, or you can configure it to make automatic changes to your CPU and memory requests.
Before you begin
Before you start, make sure you have performed the following tasks:
- Ensure that you have enabled the Google Kubernetes Engine API.
- Ensure that you have installed the Cloud SDK.

Set up default gcloud settings using one of the following methods:
- Using gcloud init, if you want to be walked through setting defaults.
- Using gcloud config, to individually set your project ID, zone, and region.
Using gcloud init
If you receive the error One of [--zone, --region] must be supplied: Please specify location, complete this section.
- Run gcloud init and follow the directions:
gcloud init
If you are using SSH on a remote server, use the --console-only flag to prevent the command from launching a browser:
gcloud init --console-only
- Follow the instructions to authorize gcloud to use your Google Cloud account.
- Create a new configuration or select an existing one.
- Choose a Google Cloud project.
- Choose a default Compute Engine zone.
Using gcloud config
- Set your default project ID:
gcloud config set project PROJECT_ID
- If you are working with zonal clusters, set your default compute zone:
gcloud config set compute/zone COMPUTE_ZONE
- If you are working with regional clusters, set your default compute region:
gcloud config set compute/region COMPUTE_REGION
- Update gcloud to the latest version:
gcloud components update
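To confirm the defaults you just set, you can list your active configuration; gcloud config list is a standard Cloud SDK command:
gcloud config list
The output includes your default project, and your default zone or region if you set one.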
Note on API versions
This guide assumes that you have the v1 version of the VerticalPodAutoscaler API installed in your Google Kubernetes Engine cluster. The v1 API is available in GKE versions 1.14.7-gke.10 or higher and 1.15.4-gke.15 or higher.
We strongly recommend using this API. For instructions on migrating from older API versions, see the migration guide.
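One way to check which versions of the VerticalPodAutoscaler API your cluster serves is to list the served API groups and filter for autoscaling.k8s.io; this is a generic kubectl check, offered here as a sketch:
kubectl api-versions | grep autoscaling.k8s.io
If the output includes autoscaling.k8s.io/v1, the v1 API is available in your cluster.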
Enabling vertical Pod autoscaling for a cluster
To create a new cluster with vertical Pod autoscaling enabled, enter this command:
gcloud container clusters create cluster-name \
--enable-vertical-pod-autoscaling --cluster-version=1.14.7
where cluster-name is a name that you choose for your cluster.
To enable vertical Pod autoscaling for an existing cluster, enter this command:
gcloud container clusters update cluster-name --enable-vertical-pod-autoscaling
where cluster-name is the name of the cluster.
Enabling or disabling vertical Pod autoscaling causes a control plane restart.
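To verify that vertical Pod autoscaling is enabled, you can inspect the cluster's configuration. The field path below follows the GKE cluster API; treat this as a sketch that might need adjusting for your gcloud version:
gcloud container clusters describe cluster-name --format="value(verticalPodAutoscaling.enabled)"
A value of True indicates that vertical Pod autoscaling is enabled.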
Getting resource recommendations
In this exercise, you create a VerticalPodAutoscaler
object that has an updateMode
of "Off". Then you create a Deployment that has two Pods, each of which has
one container. When the Pods are created, the VerticalPodAutoscaler
analyzes the CPU and memory needs of the containers and records those
recommendations in its status
field. The VerticalPodAutoscaler
does not take any
action to update the resource requests for the running containers.
Here is a manifest for the VerticalPodAutoscaler:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: my-rec-vpa
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment
name: my-rec-deployment
updatePolicy:
updateMode: "Off"
Save the manifest in a file named my-rec-vpa.yaml, and create the VerticalPodAutoscaler:
kubectl create -f my-rec-vpa.yaml
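You can optionally confirm that the object was created before continuing; vpa is the short name for the VerticalPodAutoscaler resource, as used elsewhere in this guide:
kubectl get vpa my-rec-vpa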
Here is a manifest for the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-rec-deployment
spec:
replicas: 2
selector:
matchLabels:
app: my-rec-deployment
template:
metadata:
labels:
app: my-rec-deployment
spec:
containers:
- name: my-rec-container
image: nginx
In the manifest, you can see that there are no CPU or memory requests. You can also see that the Pods in the Deployment belong to the VerticalPodAutoscaler, because its targetRef points to kind: Deployment and name: my-rec-deployment.
Copy the manifest to a file named my-rec-deployment.yaml, and create the Deployment:
kubectl create -f my-rec-deployment.yaml
Wait a minute, and then view the VerticalPodAutoscaler:
kubectl get vpa my-rec-vpa --output yaml
The output shows recommendations for CPU and memory requests:
...
recommendation:
containerRecommendations:
- containerName: my-rec-container
lowerBound:
cpu: 25m
memory: 262144k
target:
cpu: 25m
memory: 262144k
upperBound:
cpu: 7931m
memory: 8291500k
...
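If you want just the target values rather than the full object, you can extract them with a JSONPath expression; the field path mirrors the status structure shown above:
kubectl get vpa my-rec-vpa --output jsonpath='{.status.recommendation.containerRecommendations[0].target}'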
Now that you have the recommended CPU and memory requests, you might choose to delete your Deployment, add CPU and memory requests to your Deployment manifest, and start your Deployment again.
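For example, based on the target recommendation above, the container section of my-rec-deployment.yaml might be updated as follows. The values here are illustrative; use the recommendations reported by your own cluster:
containers:
- name: my-rec-container
  image: nginx
  resources:
    requests:
      cpu: 25m
      memory: 262144k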
Opting out specific containers
In this exercise, you create a VerticalPodAutoscaler that has a specific container opted out. Then you create a Deployment that has one Pod with two containers. When the Pod is created, the VerticalPodAutoscaler creates and applies a recommendation for only one container, ignoring the container that was opted out.
Here is a manifest for the VerticalPodAutoscaler:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: my-opt-vpa
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment
name: my-opt-deployment
updatePolicy:
updateMode: "Auto"
resourcePolicy:
containerPolicies:
- containerName: my-opt-sidecar
mode: "Off"
Note that the VerticalPodAutoscaler has additional information in the resourcePolicy section. Setting mode to "Off" turns off recommendations for the container with the specified name, in this case my-opt-sidecar.
Save the manifest in a file named my-opt-vpa.yaml, and create the VerticalPodAutoscaler:
kubectl create -f my-opt-vpa.yaml
Here is a manifest for the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-opt-deployment
spec:
replicas: 1
selector:
matchLabels:
app: my-opt-deployment
template:
metadata:
labels:
app: my-opt-deployment
spec:
containers:
- name: my-opt-container
image: nginx
- name: my-opt-sidecar
image: busybox
command: ["sh","-c","while true; do echo Doing sidecar stuff!; sleep 60; done"]
Copy the manifest to a file named my-opt-deployment.yaml, and create the Deployment:
kubectl create -f my-opt-deployment.yaml
Wait a minute, and then view the VerticalPodAutoscaler:
kubectl get vpa my-opt-vpa --output yaml
The output shows recommendations for CPU and memory requests:
...
recommendation:
containerRecommendations:
- containerName: my-opt-container
...
Note that there are recommendations for only one container. There are no recommendations for my-opt-sidecar, because that container is opted out. The VerticalPodAutoscaler never updates resources on opted-out containers. If you wait a few minutes, the Pod is recreated, but only one container has updated resource requests.
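To confirm which containers received recommendations, you can list the container names in the recommendation; the field path mirrors the status structure shown earlier:
kubectl get vpa my-opt-vpa --output jsonpath='{.status.recommendation.containerRecommendations[*].containerName}'
The output should show only my-opt-container.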
Updating resource requests automatically
In this exercise, you create a Deployment that has two Pods. Each Pod has one
container that requests 100 milliCPU and 50 mebibytes of memory. Then you
create a VerticalPodAutoscaler
that automatically adjusts the CPU and memory
requests.
Here is a manifest for the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-auto-deployment
spec:
replicas: 2
selector:
matchLabels:
app: my-auto-deployment
template:
metadata:
labels:
app: my-auto-deployment
spec:
containers:
- name: my-container
image: k8s.gcr.io/ubuntu-slim:0.1
resources:
requests:
cpu: 100m
memory: 50Mi
command: ["/bin/sh"]
args: ["-c", "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"]
Copy the manifest to a file named my-auto-deployment.yaml, and create the Deployment:
kubectl create -f my-auto-deployment.yaml
List the running Pods:
kubectl get pods
The output shows the names of the Pods in my-auto-deployment:
NAME READY STATUS RESTARTS AGE
my-auto-deployment-cbcdd49fb-d6bf9 1/1 Running 0 8s
my-auto-deployment-cbcdd49fb-th288 1/1 Running 0 8s
Make a note of your Pod names for later.
The CPU and memory requests for the Deployment are very small, so it's likely that the Deployment would benefit from an increase in resources.
Here is a manifest for a VerticalPodAutoscaler:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: my-vpa
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment
name: my-auto-deployment
updatePolicy:
updateMode: "Auto"
In the manifest, the targetRef field says that any Pod controlled by the Deployment named my-auto-deployment belongs to this VerticalPodAutoscaler.
The updateMode
field has a value of Auto
, which means that the
VerticalPodAutoscaler
can update CPU and memory requests during the life of a
Pod. That is, the VerticalPodAutoscaler
can delete a Pod, adjust the CPU
and memory requests, and then start a new Pod.
Copy the manifest to a file named my-vpa.yaml, and create the VerticalPodAutoscaler:
kubectl create -f my-vpa.yaml
Wait a few minutes, and view the running Pods again:
kubectl get pods
Notice that the Pod names have changed. If the Pod names have not yet changed, wait a bit longer, and list the running Pods again.
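If you prefer to observe the replacement as it happens, you can watch the Pod list instead of polling; --watch is a standard kubectl flag:
kubectl get pods --watch
Press Ctrl+C to stop watching.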
Get detailed information about one of your running Pods:
kubectl get pod pod-name --output yaml
where pod-name is the name of one of your Pods.
In the output, you can see that the VerticalPodAutoscaler
has increased the
memory and CPU requests. You can also see an annotation that documents the
update:
apiVersion: v1
kind: Pod
metadata:
annotations:
vpaUpdates: 'Pod resources updated by my-vpa: container 0: cpu capped to node
capacity, memory capped to node capacity, cpu request, memory request'
...
spec:
containers:
...
resources:
requests:
cpu: 510m
memory: 262144k
...
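To read just the updated requests from a Pod, you can use a JSONPath expression; the field path follows the Pod spec shown above:
kubectl get pod pod-name --output jsonpath='{.spec.containers[0].resources.requests}'
where pod-name is the name of one of your Pods.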
Get detailed information about the VerticalPodAutoscaler:
kubectl get vpa my-vpa --output yaml
The output shows three sets of recommendations for CPU and memory requests: lower bound, target, and upper bound:
...
recommendation:
containerRecommendations:
- containerName: my-container
lowerBound:
cpu: 536m
memory: 262144k
target:
cpu: 587m
memory: 262144k
upperBound:
cpu: 27854m
memory: "545693548"
The target recommendation says that the container will run optimally if it requests 587 milliCPU and 262144 kilobytes of memory.
The VerticalPodAutoscaler uses the lowerBound and upperBound recommendations to decide whether to delete a Pod and replace it with a new Pod. If a Pod has requests less than the lower bound or greater than the upper bound, the VerticalPodAutoscaler deletes the Pod and replaces it with a Pod that has the target recommendation. For example, given the recommendations above, a Pod requesting only 100 milliCPU falls below the 536m lower bound, so it would be deleted and replaced with a Pod requesting the 587m target.
Cleaning up
Disable vertical Pod autoscaling:
gcloud container clusters update cluster-name --no-enable-vertical-pod-autoscaling
where cluster-name is the name of the cluster.
Optionally, delete the cluster.
What's next
- Learn more about Vertical Pod autoscaling.
- Learn how to Assign CPU Resources to containers and Pods.
- Learn how to Assign memory resources to containers and Pods.
- Learn about Scaling an application.
- Learn about Autoscaling deployments with custom metrics.
- Learn about Cluster autoscaler.
- Learn about Multidimensional Pod Autoscaler.