This page provides an overview of vertical Pod autoscaling and explains how you can use it to adjust the CPU and memory requests and limits for containers. It also provides reference material for the VerticalPodAutoscaler custom resource and related types.
Overview
Vertical pod autoscaling (VPA) frees you from having to think about what values to specify for a container's CPU and memory requests. The autoscaler can recommend values for CPU and memory requests and limits, or it can automatically update the values.
Vertical pod autoscaling provides these benefits:
Cluster nodes are used efficiently, because Pods request only the resources they need.
Pods are scheduled onto nodes that have the appropriate resources available.
You don't have to run time-consuming benchmarking tasks to determine the correct values for CPU and memory requests.
Maintenance time is reduced, because the autoscaler can adjust CPU and memory requests over time without any action on your part.
Limitations for Vertical Pod Autoscaling:
Vertical Pod Autoscaling supports a maximum of 500 VPA objects per cluster.
Vertical Pod Autoscaling is supported on regional clusters starting from version 1.12.6.
Do not use Vertical Pod Autoscaling with Horizontal Pod Autoscaling (HPA) on CPU or memory. However, you can use VPA with HPA on custom and external metrics.
Vertical Pod Autoscaler is not yet ready for use with Java workloads due to limited visibility into actual memory usage of the workload.
Vertical Pod Autoscaling in Auto mode
Due to Kubernetes limitations, the only way to modify the resource requests of a running Pod is to recreate the Pod. If you create a VerticalPodAutoscaler with an updateMode of "Auto", the VerticalPodAutoscaler evicts a Pod if it needs to change the Pod's resource requests.
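For example, a minimal VerticalPodAutoscaler manifest running in "Auto" mode might look like the following sketch; the names my-app-vpa and my-app are placeholders for your own workload:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa          # hypothetical name for illustration
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # placeholder: the Deployment whose Pods are autoscaled
  updatePolicy:
    updateMode: "Auto"      # the autoscaler may evict Pods to apply new requests
```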
To limit the number of Pod restarts, use a Pod Disruption Budget, as shown in the sketch that follows.
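For instance, a PodDisruptionBudget such as the following (the name my-app-pdb and the app: my-app label are placeholders that must match your workload) limits how many replicas can be evicted at one time:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb          # hypothetical name for illustration
spec:
  maxUnavailable: 1         # at most one Pod may be disrupted at a time
  selector:
    matchLabels:
      app: my-app           # placeholder: must match your workload's Pod labels
```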
To make sure that your cluster can handle the new sizes of your workloads, use Cluster Autoscaler and Node Auto-Provisioning. Vertical Pod Autoscaler notifies Cluster Autoscaler before the update, so the resources needed for the resized workload are provisioned before the workload is recreated, minimizing disruption time.
What's next
- How to Configure Vertical Pod Autoscaling
- Assign CPU Resources to Containers and Pods
- Assign Memory Resources to Containers and Pods
- Scaling an Application
- Autoscaling Deployments with Custom Metrics
- Cluster Autoscaler
Reference
VerticalPodAutoscaler v1 autoscaling.k8s.io
Field | Description
---|---
apiVersion, kind | API group, version, and kind.
metadata | Standard object metadata.
spec | The desired behavior of the VerticalPodAutoscaler.
status | The most recently observed status of the VerticalPodAutoscaler.
VerticalPodAutoscalerSpec v1 autoscaling.k8s.io
Field | Description
---|---
targetRef | Reference to the controller that manages the set of Pods for the autoscaler to control, for example, a Deployment or a StatefulSet. You can point a VerticalPodAutoscaler at any controller that has a Scale subresource. Typically, the VerticalPodAutoscaler retrieves the Pod set from the controller's ScaleStatus. For some well-known controllers, for example DaemonSet, the VerticalPodAutoscaler retrieves the Pod set from the controller's spec.
updatePolicy | Specifies whether recommended updates are applied when a Pod is started and whether recommended updates are applied during the life of a Pod.
resourcePolicy | Specifies policies for how CPU and memory requests are adjusted for individual containers.
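As an illustrative sketch of these fields, the following spec (the StatefulSet name my-db is a placeholder) targets a StatefulSet and runs in recommendation-only mode:

```yaml
# Sketch of a VerticalPodAutoscalerSpec; my-db is a placeholder name.
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: my-db
  updatePolicy:
    updateMode: "Off"   # only compute recommendations; never evict Pods
```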
VerticalPodAutoscalerList v1 autoscaling.k8s.io
Field | Description
---|---
apiVersion, kind | API group, version, and kind.
metadata | Standard object metadata.
items | A list of VerticalPodAutoscaler objects.
PodUpdatePolicy v1 autoscaling.k8s.io
Field | Description
---|---
updateMode | Specifies whether recommended updates are applied when a Pod is started and whether recommended updates are applied during the life of a Pod. Possible values are "Off", "Initial", "Recreate", and "Auto".
PodResourcePolicy v1 autoscaling.k8s.io
Field | Description
---|---
containerPolicies | An array of resource policies for individual containers.
ContainerResourcePolicy v1 autoscaling.k8s.io
Field | Description
---|---
containerName | The name of the container that the policy applies to. If not specified, the policy serves as the default policy.
mode | Specifies whether recommended updates are applied to the container when it is started and whether recommended updates are applied during the life of the container. Possible values are "Off" and "Auto".
minAllowed | Specifies the minimum CPU request and memory request allowed for the container.
maxAllowed | Specifies the maximum CPU request and memory request allowed for the container.
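For example, a resourcePolicy sketch like the following (the container names and resource values are placeholders) bounds the recommendations for one container and disables updates for a sidecar:

```yaml
# Sketch of a PodResourcePolicy with two ContainerResourcePolicy entries.
resourcePolicy:
  containerPolicies:
  - containerName: my-container   # placeholder container name
    mode: "Auto"
    minAllowed:
      cpu: 250m                   # illustrative lower bound
      memory: 256Mi
    maxAllowed:
      cpu: "2"                    # illustrative upper bound
      memory: 2Gi
  - containerName: my-sidecar     # placeholder: leave this container untouched
    mode: "Off"
```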
VerticalPodAutoscalerStatus v1 autoscaling.k8s.io
Field | Description
---|---
recommendation | The most recently recommended CPU and memory requests.
conditions | Describes the current state of the VerticalPodAutoscaler.
RecommendedPodResources v1 autoscaling.k8s.io
Field | Description
---|---
containerRecommendations | An array of resource recommendations for individual containers.
RecommendedContainerResources v1 autoscaling.k8s.io
Field | Description
---|---
containerName | The name of the container that the recommendation applies to.
target | The recommended CPU request and memory request for the container.
lowerBound | The minimum recommended CPU request and memory request for the container. This amount is not guaranteed to be sufficient for the application to be stable. Running with smaller CPU and memory requests is likely to have a significant impact on performance or availability.
upperBound | The maximum recommended CPU request and memory request for the container. CPU and memory requests higher than these values are likely to be wasted.
uncappedTarget | The most recent resource recommendation computed by the autoscaler, based on actual resource usage, not taking into account the ContainerResourcePolicy. If actual resource usage causes the target to violate the ContainerResourcePolicy, this might differ from the bounded recommendation. This field does not affect actual resource assignment. It is used only as a status indication.
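To inspect these recommendations, you can run kubectl get vpa my-app-vpa --output yaml (using the placeholder name from the earlier example) and read the status. The excerpt below is a sketch of that structure; the resource values are illustrative, not real output:

```yaml
# Illustrative status excerpt; the values shown are examples only.
status:
  recommendation:
    containerRecommendations:
    - containerName: my-container
      lowerBound:
        cpu: 100m
        memory: 256Mi
      target:
        cpu: 250m
        memory: 512Mi
      upperBound:
        cpu: 500m
        memory: 1Gi
      uncappedTarget:
        cpu: 250m
        memory: 512Mi
```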
VerticalPodAutoscalerCondition v1 autoscaling.k8s.io
Field | Description
---|---
type | The type of condition being described. Possible values are "RecommendationProvided", "LowConfidence", "NoPodsMatched", and "FetchingHistory".
status | The status of the condition. Possible values are True, False, and Unknown.
lastTransitionTime | The last time the condition made a transition from one status to another.
reason | The reason for the last transition from one status to another.
message | A human-readable string that gives details about the last transition from one status to another.