This page explains how you can analyze and adjust the CPU requests and memory requests of a container in a Google Kubernetes Engine (GKE) cluster using vertical Pod autoscaling.
You can scale container resources manually through the Google Cloud console, analyze resources using a VerticalPodAutoscaler object, or configure automatic scaling using vertical Pod autoscaling.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Analyze resource requests
The Vertical Pod Autoscaler automatically analyzes your containers and provides suggested resource requests. You can view these resource requests using the Google Cloud console, Cloud Monitoring, or Google Cloud CLI.
Console
To view suggested resource requests in the Google Cloud console, you must have an existing workload deployed that is at least 24 hours old. Some suggestions might not be available or relevant for certain workloads, such as those created within the last 24 hours, standalone Pods, and apps written in Java.
Go to the Workloads page in the Google Cloud console.
In the workloads list, click the name of the workload you want to scale.
Click Actions > Scale > Edit resource requests.
The Analyze resource utilization data section shows historic usage data that the Vertical Pod Autoscaler controller analyzed to create the suggested resource requests in the Adjust resource requests and limits section.
Cloud Monitoring
To view suggested resource requests in Cloud Monitoring, you must have an existing workload deployed.
Go to the Metrics Explorer page in the Google Cloud console.
Click Configuration.
Expand the Select a Metric menu.
In the Resource menu, select Kubernetes Scale.
In the Metric category menu, select Autoscaler.
In the Metric menu, select Recommended per replica request bytes and Recommended per replica request cores.
Click Apply.
gcloud CLI
To view suggested resource requests, you must create a VerticalPodAutoscaler object and a Deployment.
For Standard clusters, enable vertical Pod autoscaling for your cluster. For Autopilot clusters, vertical Pod autoscaling is enabled by default.
gcloud container clusters update CLUSTER_NAME --enable-vertical-pod-autoscaling
Replace CLUSTER_NAME with the name of your cluster.
Save the following manifest as my-rec-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-rec-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-rec-deployment
  template:
    metadata:
      labels:
        app: my-rec-deployment
    spec:
      containers:
      - name: my-rec-container
        image: nginx
This manifest describes a Deployment that does not have CPU or memory requests. The metadata.name value of my-rec-deployment is the name that the VerticalPodAutoscaler references in its targetRef field, which places all Pods in the Deployment under that VerticalPodAutoscaler.
Apply the manifest to the cluster:
kubectl create -f my-rec-deployment.yaml
Save the following manifest as my-rec-vpa.yaml:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-rec-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-rec-deployment
  updatePolicy:
    updateMode: "Off"
This manifest describes a VerticalPodAutoscaler. The updateMode value of Off means that when Pods are created, the Vertical Pod Autoscaler controller analyzes the CPU and memory needs of the containers and records those recommendations in the status field of the resource. The Vertical Pod Autoscaler controller does not automatically update the resource requests for running containers.
Apply the manifest to the cluster:
kubectl create -f my-rec-vpa.yaml
After some time, view the VerticalPodAutoscaler:
kubectl get vpa my-rec-vpa --output yaml
The output is similar to the following:
...
  recommendation:
    containerRecommendations:
    - containerName: my-rec-container
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 7931m
        memory: 8291500k
...
This output shows recommendations for CPU and memory requests.
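If you only need the target values rather than the full YAML, a JSONPath query can print them in one line per container. The following is a minimal sketch; the helper name vpa_targets is hypothetical, and it assumes kubectl is already configured for your cluster and that the VerticalPodAutoscaler has recorded a recommendation.

```shell
# Hypothetical helper: print container name, target CPU, and target memory
# for every container recommendation recorded on a VerticalPodAutoscaler.
vpa_targets() {
  kubectl get vpa "$1" --output jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.target.cpu}{"\t"}{.target.memory}{"\n"}{end}'
}
# Usage: vpa_targets my-rec-vpa
```

If the recommendation has not been computed yet, the command prints nothing, so wait and run it again.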
Set Pod resource requests manually
You can set Pod resource requests manually using the Google Cloud CLI or the Google Cloud console.
Console
Go to the Workloads page in the Google Cloud console.
In the workloads list, click the name of the workload you want to scale.
Click Actions > Scale > Edit resource requests.
- The Adjust resource requests and limits section shows the current CPU and memory requests for each container as well as suggested CPU and memory requests.
Click Apply Latest Suggestions to view suggested requests for each container.
Click Save Changes.
Click Confirm.
gcloud
To set resource requests for a Pod, set the requests.cpu and requests.memory values in your Deployment manifest. In this example, you manually modify the Deployment created in Analyze resource requests with suggested resource requests.
Save the following example manifest as my-adjusted-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-rec-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-rec-deployment
  template:
    metadata:
      labels:
        app: my-rec-deployment
    spec:
      containers:
      - name: my-rec-container
        image: nginx
        resources:
          requests:
            cpu: 25m
            memory: 256Mi
This manifest describes a Deployment that has two Pods. Each Pod has one container that requests 25 milliCPU and 256 MiB of memory.
Apply the manifest to the cluster:
kubectl apply -f my-adjusted-deployment.yaml
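The recommendation shown in Analyze resource requests reports memory as 262144k, while this manifest requests 256Mi. The two suffixes are not the same unit: in Kubernetes quantities, k means 1000 bytes and Mi means 1048576 bytes. The following sketch works through the conversion; the helper name k_to_mib is hypothetical and it only handles the k suffix.

```shell
# Hypothetical helper: convert a quantity with the decimal "k" suffix
# (1k = 1000 bytes) to whole MiB (1 MiB = 1048576 bytes).
k_to_mib() {
  kilo="${1%k}"                       # strip the trailing "k"
  echo $(( kilo * 1000 / 1048576 ))   # integer division, truncates
}

k_to_mib 262144k   # prints 250
```

So the recommended 262144k corresponds to 250 MiB, and the 256Mi request in the manifest sits slightly above that target.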
You can also apply changes manually by performing the following steps:
Go to the Workloads page in the Google Cloud console.
In the workloads list, click the name of the workload you want to scale.
Click Actions > Scale > Edit resource requests.
Configure your container requests.
Click Get Equivalent YAML.
Click Download Workload or copy and paste the manifest into a file named resource-adjusted.yaml.
Apply the manifest to your cluster:
kubectl apply -f resource-adjusted.yaml
Set Pod resource requests automatically
Vertical Pod autoscaling uses the VerticalPodAutoscaler object to automatically set resource requests on Pods when the updateMode is Auto. You can configure a VerticalPodAutoscaler using the gcloud CLI or the Google Cloud console.
Console
To set resource requests automatically, you must have a cluster with the vertical Pod autoscaling feature enabled. Autopilot clusters have the vertical Pod autoscaling feature enabled by default.
Enable Vertical Pod Autoscaling
Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
In the Automation section, click Edit for the Vertical Pod Autoscaling option.
Select the Enable Vertical Pod Autoscaling checkbox.
Click Save changes.
Configure Vertical Pod Autoscaling
Go to the Workloads page in the Google Cloud console.
In the workloads list, click the name of the Deployment you want to configure vertical Pod autoscaling for.
Click Actions > Autoscale > Vertical pod autoscaling.
Choose an autoscaling mode:
- Auto mode: Vertical Pod autoscaling updates CPU and memory requests during the life of a Pod.
- Initial mode: Vertical Pod autoscaling assigns resource requests only at Pod creation and never changes them later.
(Optional) Set container policies. This option lets you ensure that the recommendation is never set above or below a specified resource request.
- Click Add Policy.
- Select Auto for Edit container mode.
- In Controlled resources, select which resources you want to autoscale the container on.
- Click Add Rule to set one or more minimum or maximum ranges for the container's resource requests:
- Min. allowed Memory: the minimum amount of memory that the container should always have, in MiB.
- Min. allowed CPU: the minimum amount of CPU that the container should always have, in mCPU.
- Max allowed Memory: the maximum amount of memory that the container can have, in MiB.
- Max allowed CPU: the maximum amount of CPU that the container can have, in mCPU.
Click Done.
Click Save.
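The container policy options in the console correspond to fields under resourcePolicy.containerPolicies in a VerticalPodAutoscaler manifest. The following sketch writes such a manifest to a file. The file name, the object name my-policy-vpa, and the minimum and maximum values are all illustrative assumptions; the target Deployment and container names are the ones used in the gcloud steps on this page.

```shell
# Write a hypothetical VerticalPodAutoscaler manifest with a container policy.
cat > my-policy-vpa.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-policy-vpa              # hypothetical name
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-auto-deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: my-container
      mode: "Auto"
      controlledResources: ["cpu", "memory"]
      minAllowed:                  # illustrative lower limits
        cpu: 50m
        memory: 64Mi
      maxAllowed:                  # illustrative upper limits
        cpu: "1"
        memory: 512Mi
EOF
# Apply it with: kubectl apply -f my-policy-vpa.yaml
```

Use a manifest like this in place of another VerticalPodAutoscaler for the same Deployment rather than alongside it, so that only one object manages the workload.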
gcloud
To set resource requests automatically, you must use a cluster that has the vertical Pod autoscaling feature enabled. Autopilot clusters have the feature enabled by default.
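Before changing the cluster, you might want to check whether the feature is already enabled. The following is a minimal sketch that reads the verticalPodAutoscaling.enabled field from the cluster description; the helper name vpa_enabled is hypothetical, and LOCATION is a placeholder for your cluster's zone or region.

```shell
# Hypothetical helper: print whether vertical Pod autoscaling is enabled.
# Any extra arguments (for example a location flag) are passed to gcloud.
vpa_enabled() {
  cluster="$1"
  shift
  gcloud container clusters describe "$cluster" "$@" \
    --format="value(verticalPodAutoscaling.enabled)"
}
# Usage: vpa_enabled CLUSTER_NAME --location=LOCATION
```

An empty result most likely means that the field is not set on the cluster, in which case the enable step that follows still applies.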
For Standard clusters, enable vertical Pod autoscaling for your cluster:
gcloud container clusters update CLUSTER_NAME --enable-vertical-pod-autoscaling
Replace CLUSTER_NAME with the name of your cluster.
Save the following manifest as my-auto-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-auto-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-auto-deployment
  template:
    metadata:
      labels:
        app: my-auto-deployment
    spec:
      containers:
      - name: my-container
        image: registry.k8s.io/ubuntu-slim:0.1
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        command: ["/bin/sh"]
        args: ["-c", "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"]
This manifest describes a Deployment that has two Pods. Each Pod has one container that requests 100 milliCPU and 50 MiB of memory.
Apply the manifest to the cluster:
kubectl create -f my-auto-deployment.yaml
List the running Pods:
kubectl get pods
The output shows the names of the Pods in my-auto-deployment:
NAME                                 READY   STATUS    RESTARTS   AGE
my-auto-deployment-cbcdd49fb-d6bf9   1/1     Running   0          8s
my-auto-deployment-cbcdd49fb-th288   1/1     Running   0          8s
Save the following manifest as my-vpa.yaml:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-auto-deployment
  updatePolicy:
    updateMode: "Auto"
This manifest describes a VerticalPodAutoscaler with the following properties:
- targetRef.name: specifies that any Pod that is controlled by a Deployment named my-auto-deployment belongs to this VerticalPodAutoscaler.
- updateMode: Auto: specifies that the Vertical Pod Autoscaler controller can delete a Pod, adjust the CPU and memory requests, and then start a new Pod.
You can also configure vertical Pod autoscaling to assign resource requests only at Pod creation time, using updateMode: "Initial".
Apply the manifest to the cluster:
kubectl create -f my-vpa.yaml
Wait a few minutes, and view the running Pods again:
kubectl get pods
The output shows that the Pod names have changed:
NAME                                 READY   STATUS    RESTARTS   AGE
my-auto-deployment-89dc45f48-5bzqp   1/1     Running   0          8s
my-auto-deployment-89dc45f48-scm66   1/1     Running   0          8s
If the Pod names have not changed, wait a bit longer, and then view the running Pods again.
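Besides watching for new Pod names, you can list the requests that each Pod currently carries. The following is a minimal sketch; the helper name pod_requests is hypothetical, and it assumes each Pod has its autoscaled container first in the container list, as in the Deployment above.

```shell
# Hypothetical helper: for Pods matching a label selector, print the Pod name
# and the CPU and memory requests of the first container.
pod_requests() {
  kubectl get pods --selector "$1" --output jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources.requests.cpu}{"\t"}{.spec.containers[0].resources.requests.memory}{"\n"}{end}'
}
# Usage: pod_requests app=my-auto-deployment
```

If the values still match the 100m and 50Mi from the manifest, the Pods have probably not been recreated yet.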
View information about a Vertical Pod Autoscaler
To view details about a Vertical Pod Autoscaler, do the following:
Get detailed information about one of your running Pods:
kubectl get pod POD_NAME --output yaml
Replace POD_NAME with the name of one of your Pods that you retrieved in the previous step.
The output is similar to the following:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    vpaUpdates: 'Pod resources updated by my-vpa: container 0: cpu capped to node capacity, memory capped to node capacity, cpu request, memory request'
...
spec:
  containers:
  ...
    resources:
      requests:
        cpu: 510m
        memory: 262144k
...
This output shows that the Vertical Pod Autoscaler controller set a memory request of 262144k and a CPU request of 510 milliCPU on the container.
Get detailed information about the VerticalPodAutoscaler:
kubectl get vpa my-vpa --output yaml
The output is similar to the following:
...
  recommendation:
    containerRecommendations:
    - containerName: my-container
      lowerBound:
        cpu: 536m
        memory: 262144k
      target:
        cpu: 587m
        memory: 262144k
      upperBound:
        cpu: 27854m
        memory: "545693548"
This output shows recommendations for CPU and memory requests and includes the following properties:
- target: specifies that for the container to run optimally, it should request 587 milliCPU and 262,144 kilobytes of memory.
- lowerBound and upperBound: vertical Pod autoscaling uses these properties to decide whether to delete a Pod and replace it with a new Pod. If a Pod has requests less than the lower bound or greater than the upper bound, the Vertical Pod Autoscaler deletes the Pod and replaces it with a Pod that uses the target values.
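The bounds rule described above can be sketched as a small shell check. This only illustrates the stated rule with the sample CPU values from the outputs in this section; the helper names are hypothetical, the check handles milliCPU values only, and the actual Vertical Pod Autoscaler may weigh additional conditions before replacing a Pod.

```shell
# Hypothetical helper: strip the "m" suffix from a milliCPU quantity.
millicpu() { echo "${1%m}"; }

# Hypothetical helper: print "replace" if the request is below the lower
# bound or above the upper bound, otherwise print "keep".
outside_bounds() {
  req=$(millicpu "$1")
  lo=$(millicpu "$2")
  hi=$(millicpu "$3")
  if [ "$req" -lt "$lo" ] || [ "$req" -gt "$hi" ]; then
    echo replace
  else
    echo keep
  fi
}

outside_bounds 510m 536m 27854m   # prints replace (510 is below 536)
outside_bounds 587m 536m 27854m   # prints keep
```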
Opt out specific containers
You can opt out specific containers from vertical Pod autoscaling using the gcloud CLI or the Google Cloud console.
Console
To opt out specific containers from vertical Pod autoscaling, you must have a cluster with the vertical Pod autoscaling feature enabled. Autopilot clusters have the vertical Pod autoscaling feature enabled by default.
Enable Vertical Pod Autoscaling
Go to the Google Kubernetes Engine page in the Google Cloud console.
In the cluster list, click the name of the cluster you want to modify.
In the Automation section, click Edit for the Vertical Pod Autoscaling option.
Select the Enable Vertical Pod Autoscaling checkbox.
Click Save changes.
Configure Vertical Pod Autoscaling
Go to the Workloads page in the Google Cloud console.
In the workloads list, click the name of the Deployment you want to configure vertical Pod autoscaling for.
Click Actions > Autoscale > Vertical pod autoscaling.
Choose an autoscaling mode:
- Auto mode: Vertical Pod autoscaling updates CPU and memory requests during the life of a Pod.
- Initial mode: Vertical Pod autoscaling assigns resource requests only at Pod creation and never changes them later.
Click Add Policy.
Select the container you want to opt out.
For Edit container mode, select Off.
Click Done.
Click Save.
gcloud
To opt out specific containers from vertical Pod autoscaling, perform the following steps:
Save the following manifest as my-opt-vpa.yaml:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-opt-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-opt-deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: my-opt-sidecar
      mode: "Off"
This manifest describes a VerticalPodAutoscaler. The mode: "Off" value turns off recommendations for the container my-opt-sidecar.
Apply the manifest to the cluster:
kubectl apply -f my-opt-vpa.yaml
Save the following manifest as my-opt-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-opt-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-opt-deployment
  template:
    metadata:
      labels:
        app: my-opt-deployment
    spec:
      containers:
      - name: my-opt-container
        image: nginx
      - name: my-opt-sidecar
        image: busybox
        command: ["sh","-c","while true; do echo Doing sidecar stuff!; sleep 60; done"]
Apply the manifest to the cluster:
kubectl apply -f my-opt-deployment.yaml
After some time, view the Vertical Pod Autoscaler:
kubectl get vpa my-opt-vpa --output yaml
The output shows recommendations for CPU and memory requests:
...
  recommendation:
    containerRecommendations:
    - containerName: my-opt-container
...
In this output, there are only recommendations for one container. There are no recommendations for my-opt-sidecar.
The Vertical Pod Autoscaler never updates resources on opted-out containers. If you wait a few minutes, the Pod is recreated, but only one container has updated resource requests.
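To confirm the opt-out from a script rather than by reading the YAML, you can test whether a container name appears among the recommendations. The following is a minimal sketch; the helper name has_recommendation is hypothetical, and it assumes that kubectl prints the matching container names separated by spaces.

```shell
# Hypothetical helper: succeed if the named container has a recommendation
# recorded on the given VerticalPodAutoscaler, fail otherwise.
has_recommendation() {
  names=$(kubectl get vpa "$1" --output jsonpath='{.status.recommendation.containerRecommendations[*].containerName}')
  case " $names " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}
# Usage:
#   has_recommendation my-opt-vpa my-opt-sidecar || echo "my-opt-sidecar is opted out"
```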
What's next
- Learn more about Vertical Pod autoscaling.
- Learn best practices for running cost-optimized Kubernetes applications on GKE.
- Learn how to Assign CPU Resources to containers and Pods.
- Learn how to Assign memory resources to containers and Pods.