Configuring vertical pod autoscaling

This page explains how to configure vertical pod autoscaling in a Google Kubernetes Engine cluster. Vertical pod autoscaling involves adjusting a Pod's CPU and memory requests.

Overview

You can use the VerticalPodAutoscaler custom resource to analyze and adjust your containers' CPU requests and memory requests. You can configure a VerticalPodAutoscaler to make recommendations for CPU and memory requests, or you can configure it to make automatic changes to your CPU and memory requests.

Before you begin

To prepare for this task, perform the following steps:

  • Ensure that you have enabled the Google Kubernetes Engine API.
  • Ensure that you have installed the Cloud SDK.
  • Set your default project ID:
    gcloud config set project [PROJECT_ID]
  • If you are working with zonal clusters, set your default compute zone:
    gcloud config set compute/zone [COMPUTE_ZONE]
  • If you are working with regional clusters, set your default compute region:
    gcloud config set compute/region [COMPUTE_REGION]
  • Update gcloud to the latest version:
    gcloud components update

Note on API versions

This guide assumes you have the v1beta2 version of the Vertical Pod Autoscaler API installed in your Google Kubernetes Engine cluster. The API is available in 1.11 clusters starting with version 1.11.8, and in all clusters running version 1.12.6 or later.

We strongly recommend using this API. For instructions on migrating from older API versions, see the migration guide.
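
To verify that your cluster serves this version of the API, you can list the API groups that kubectl reports (this assumes kubectl is already configured to talk to your cluster):

kubectl api-versions | grep autoscaling.k8s.io

If the v1beta2 API is available, the output includes autoscaling.k8s.io/v1beta2.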

Enabling vertical pod autoscaling for a cluster

To create a new cluster with vertical pod autoscaling enabled, enter this command:

gcloud beta container clusters create [CLUSTER_NAME] --enable-vertical-pod-autoscaling --cluster-version=1.11.8

where [CLUSTER_NAME] is a name that you choose for your cluster.

To enable vertical pod autoscaling for an existing cluster, enter this command:

gcloud beta container clusters update [CLUSTER_NAME] --enable-vertical-pod-autoscaling

where [CLUSTER_NAME] is the name of the cluster.
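
To confirm that vertical pod autoscaling is enabled, you can describe the cluster and look for the verticalPodAutoscaling field; this is a quick check, and the exact output format can vary between gcloud releases:

gcloud beta container clusters describe [CLUSTER_NAME] | grep -A 1 verticalPodAutoscaling

When the feature is enabled, the output includes enabled: true.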

Getting resource recommendations

In this exercise, you create a VerticalPodAutoscaler that has an updateMode of "Off". Then you create a Deployment that has two Pods, each of which has one container. When the Pods are created, the VerticalPodAutoscaler analyzes the CPU and memory needs of the containers and records those recommendations in its status field. The VerticalPodAutoscaler does not take any action to update the resource requests for the running containers.

Here is a manifest for the VerticalPodAutoscaler:

apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: my-rec-vpa
spec:
  targetRef:
    apiVersion: "extensions/v1beta1"
    kind:       Deployment
    name:       my-rec-deployment
  updatePolicy:
    updateMode: "Off"

Save the manifest in a file named my-rec-vpa.yaml, and create the VerticalPodAutoscaler:

kubectl create -f my-rec-vpa.yaml
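
You can verify that the object was created by listing the VerticalPodAutoscaler resources in the cluster (the columns shown depend on the installed version of the custom resource definition):

kubectl get vpa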

Here is a manifest for the Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-rec-deployment
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: my-rec-deployment
    spec:
      containers:
      - name: my-rec-container
        image: nginx

In the manifest, you can see that the containers have no CPU or memory requests. You can also see that the Pods in the Deployment belong to the VerticalPodAutoscaler, because its targetRef specifies kind: Deployment and name: my-rec-deployment.

Copy the manifest to a file named my-rec-deployment.yaml, and create the Deployment:

kubectl create -f my-rec-deployment.yaml
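
To check that both Pods are running, you can list them by the label that the Deployment applies:

kubectl get pods -l app=my-rec-deployment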

Wait a minute, and then view the VerticalPodAutoscaler:

kubectl get vpa my-rec-vpa --output yaml

The output shows recommendations for CPU and memory requests:

...
  recommendation:
    containerRecommendations:
    - containerName: my-rec-container
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 7931m
        memory: 8291500k
...

Now that you have the recommended CPU and memory requests, you might choose to delete your Deployment, add CPU and memory requests to your Deployment manifest, and start your Deployment again.
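
For example, using the target recommendation shown above, the container section of the Deployment manifest might look like the following sketch; the values are illustrative, so substitute the target values that your VerticalPodAutoscaler actually reports:

      containers:
      - name: my-rec-container
        image: nginx
        resources:
          requests:
            cpu: 25m
            memory: 262144k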

Updating resource requests automatically

In this exercise, you create a Deployment that has two Pods. Each Pod has one container that requests 100 milliCPU and 50 mebibytes of memory. Then you create a VerticalPodAutoscaler that automatically adjusts the CPU and memory requests.

Here is a manifest for the Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: my-deployment
    spec:
      containers:
      - name: my-container
        image: k8s.gcr.io/ubuntu-slim:0.1
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        command: ["/bin/sh"]
        args: ["-c", "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"]

Copy the manifest to a file named my-deployment.yaml, and create the Deployment:

kubectl create -f my-deployment.yaml

List the running Pods:

kubectl get pods

The output shows the names of the Pods in my-deployment:

NAME                            READY     STATUS             RESTARTS   AGE
my-deployment-cbcdd49fb-d6bf9   1/1       Running            0          8s
my-deployment-cbcdd49fb-th288   1/1       Running            0          8s

Make a note of your Pod names for later.

The CPU and memory requests for the Deployment are very small, so it's likely that the Deployment would benefit from an increase in resources.
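
To compare these requests with what the containers actually consume, you can check live usage; kubectl top relies on the cluster's metrics add-on, which Google Kubernetes Engine clusters provide by default:

kubectl top pods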

Here is a manifest for a VerticalPodAutoscaler:

apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: my-vpa
spec:
  targetRef:
    apiVersion: "extensions/v1beta1"
    kind:       Deployment
    name:       my-deployment
  updatePolicy:
    updateMode: "Auto"

In the manifest, the targetRef field specifies that any Pod controlled by a Deployment named my-deployment belongs to this VerticalPodAutoscaler.

The updateMode field has a value of Auto, which means that the VerticalPodAutoscaler can update CPU and memory requests during the life of a Pod. That is, the VerticalPodAutoscaler can delete a Pod, adjust the CPU and memory requests, and then start a new Pod.

Copy the manifest to a file named my-vpa.yaml, and create the VerticalPodAutoscaler:

kubectl create -f my-vpa.yaml

Wait a few minutes, and view the running Pods again:

kubectl get pods

Notice that the Pod names have changed. If the Pod names have not yet changed, wait a bit longer, and list the running Pods again.
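
Instead of polling, you can watch the Pod list and see the original Pods terminate as their replacements start:

kubectl get pods --watch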

Get detailed information about one of your running Pods:

kubectl get pod [POD_NAME] --output yaml

where [POD_NAME] is the name of one of your Pods.

In the output, you can see that the VerticalPodAutoscaler has increased the memory and CPU requests. You can also see an annotation that documents the update:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    vpaUpdates: 'Pod resources updated by my-vpa: container 0: cpu capped to node
      capacity, memory capped to node capacity, cpu request, memory request'
...
spec:
  containers:
  ...
    resources:
      requests:
        cpu: 510m
        memory: 262144k
    ...
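
If you only want the request values, you can query them directly with a JSONPath expression instead of reading the full manifest; this sketch assumes the Pod runs a single container:

kubectl get pod [POD_NAME] --output jsonpath='{.spec.containers[0].resources.requests}'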

Get detailed information about the VerticalPodAutoscaler:

kubectl get vpa my-vpa --output yaml

The output shows three sets of recommendations for CPU and memory requests: lower bound, target, and upper bound:

...
  recommendation:
    containerRecommendations:
    - containerName: my-container
      lowerBound:
        cpu: 536m
        memory: 262144k
      target:
        cpu: 587m
        memory: 262144k
      upperBound:
        cpu: 27854m
        memory: "545693548"

The target recommendation says that the container will run optimally if it requests 587 milliCPU and 262144 kilobytes of memory.

The VerticalPodAutoscaler uses the lowerBound and upperBound recommendations to decide whether to delete a Pod and replace it with a new Pod. If a Pod has requests less than the lower bound or greater than the upper bound, the VerticalPodAutoscaler deletes the Pod and replaces it with a Pod that has the target recommendation.
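
To retrieve just the target recommendation, you can use a JSONPath query like the following; adjust the container index if your Pods run more than one container:

kubectl get vpa my-vpa --output jsonpath='{.status.recommendation.containerRecommendations[0].target}'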
