This page provides information about visibility events emitted by cluster autoscaler in Google Kubernetes Engine (GKE). By analyzing these events, you can gain insights into how cluster autoscaler manages your cluster's scaling and understand the reasons behind its decisions.
The GKE cluster autoscaler emits visibility events, which are available as log entries in Cloud Logging. The events described in this guide are separate from the Kubernetes events produced by the cluster autoscaler.
Requirements
To see autoscaler events, you must enable Cloud Logging in your cluster. The events won't be produced if Logging is disabled.
Viewing events
The visibility events for the cluster autoscaler are stored in a Cloud Logging log in the same project as your GKE cluster. You can also view these events from the notifications on the Google Kubernetes Engine page in the Google Cloud console.
Viewing visibility event logs
To view the logs, perform the following:
1. In the Google Cloud console, go to the Kubernetes Clusters page.
2. Select the name of your cluster to view its Cluster Details page.
3. On the Cluster Details page, click the Logs tab.
4. On the Logs tab, click the Autoscaler Logs tab to view the logs.
5. (Optional) To apply more advanced filters to narrow the results, click the button with the arrow on the right side of the page to view the logs in Logs Explorer.
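If you prefer to start in Logs Explorer directly, a filter similar to the following sketch returns the visibility events for a cluster. It reuses the cluster-autoscaler-visibility log ID from the queries shown later on this page; CLUSTER_NAME and COMPUTE_REGION are placeholders for your cluster's name and location.
resource.type="k8s_cluster" \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
log_id("container.googleapis.com/cluster-autoscaler-visibility")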
Viewing visibility event notifications
To view the visibility event notifications on the Google Kubernetes Engine page, perform the following:
1. Go to the Google Kubernetes Engine page in the Google Cloud console.
2. Check the Notifications column for specific clusters to find notifications related to scaling.
3. Click a notification to see detailed information, recommended actions, and the logs for the event.
Types of events
All logged events are in the JSON format and can be found in the jsonPayload field of a log entry. All timestamps in the events are UNIX second timestamps.
Here's a summary of the types of events emitted by the cluster autoscaler:
Event type | Description |
---|---|
status | Occurs periodically and describes the size of all autoscaled node pools and the target size of all autoscaled node pools as observed by the cluster autoscaler. |
scaleUp | Occurs when cluster autoscaler scales the cluster up. |
scaleDown | Occurs when cluster autoscaler scales the cluster down. |
eventResult | Occurs when a scaleUp or a scaleDown event completes successfully or unsuccessfully. |
nodePoolCreated | Occurs when cluster autoscaler with node auto-provisioning enabled creates a new node pool. |
nodePoolDeleted | Occurs when cluster autoscaler with node auto-provisioning enabled deletes a node pool. |
noScaleUp | Occurs when there are unschedulable Pods in the cluster, and cluster autoscaler cannot scale the cluster up to accommodate the Pods. |
noScaleDown | Occurs when there are nodes that are blocked from being deleted by cluster autoscaler. |
Status event
A status
event is emitted periodically, and describes the actual size of all
autoscaled node pools and the target size of all autoscaled node pools as
observed by cluster autoscaler.
Example
The following log sample shows a status
event:
{
"status": {
"autoscaledNodesCount": 4,
"autoscaledNodesTarget": 4,
"measureTime": "1582898536"
}
}
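To list only status events in Logs Explorer, one option is to test for the presence of the status field from the payload above. This is a sketch that assumes the jsonPayload.status field path and the same placeholders as the other queries on this page:
resource.type="k8s_cluster" \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
log_id("container.googleapis.com/cluster-autoscaler-visibility") \
jsonPayload.status:*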
ScaleUp event
A scaleUp
event is emitted when the cluster autoscaler scales the cluster up.
The autoscaler increases the size of the cluster's node pools by scaling up the
underlying Managed Instance Groups (MIGs) for
the node pools. To learn more about how scale up works, see
How does scale up work?
in the Kubernetes Cluster Autoscaler FAQ.
The event contains information on which MIGs were scaled up, by how many nodes, and which unschedulable Pods triggered the event.
The list of triggering Pods is truncated to 50 arbitrary entries. The
actual number of triggering Pods can be found in the triggeringPodsTotalCount
field.
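To surface only scaleUp events and their triggering Pods, you can filter on the decision.scaleUp field shown in the example payload that follows. This is a sketch rather than an exhaustive filter; adjust the placeholders and field path as needed:
resource.type="k8s_cluster" \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
log_id("container.googleapis.com/cluster-autoscaler-visibility") \
jsonPayload.decision.scaleUp:*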
Example
The following log sample shows a scaleUp
event:
{
"decision": {
"decideTime": "1582124907",
"eventId": "ed5cb16d-b06f-457c-a46d-f75dcca1f1ee",
"scaleUp": {
"increasedMigs": [
{
"mig": {
"name": "test-cluster-default-pool-a0c72690-grp",
"nodepool": "default-pool",
"zone": "us-central1-c"
},
"requestedNodes": 1
}
],
"triggeringPods": [
{
"controller": {
"apiVersion": "apps/v1",
"kind": "ReplicaSet",
"name": "test-85958b848b"
},
"name": "test-85958b848b-ptc7n",
"namespace": "default"
}
],
"triggeringPodsTotalCount": 1
}
}
}
ScaleDown event
A scaleDown
event is emitted when cluster autoscaler scales the cluster down.
To learn more about how scale down works, see
How does scale down work?
in the Kubernetes Cluster Autoscaler FAQ.
The cpuRatio
and memRatio
fields describe the CPU and memory utilization of
the node, as a percentage. This utilization is a sum of Pod requests divided by
node allocatable, not real utilization.
The list of evicted Pods is truncated to 50 arbitrary entries. The actual
number of evicted Pods can be found in the evictedPodsTotalCount
field.
Use the following query to verify whether the cluster autoscaler scaled down nodes:
resource.type="k8s_cluster" \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
log_id("container.googleapis.com/cluster-autoscaler-visibility") \
( "decision" NOT "noDecisionStatus" )
Replace the following:
- CLUSTER_NAME: the name of the cluster.
- COMPUTE_REGION: the cluster's Compute Engine region, such as us-central1.
Example
The following log sample shows a scaleDown
event:
{
"decision": {
"decideTime": "1580594665",
"eventId": "340dac18-8152-46ff-b79a-747f70854c81",
"scaleDown": {
"nodesToBeRemoved": [
{
"evictedPods": [
{
"controller": {
"apiVersion": "apps/v1",
"kind": "ReplicaSet",
"name": "kube-dns-5c44c7b6b6"
},
"name": "kube-dns-5c44c7b6b6-xvpbk"
}
],
"evictedPodsTotalCount": 1,
"node": {
"cpuRatio": 23,
"memRatio": 5,
"mig": {
"name": "test-cluster-default-pool-c47ef39f-grp",
"nodepool": "default-pool",
"zone": "us-central1-f"
},
"name": "test-cluster-default-pool-c47ef39f-p395"
}
}
]
}
}
}
You can also view the scale-down event for nodes with no workloads running (typically only system Pods created by DaemonSets).
Use the following query to see the event logs:
resource.type="k8s_cluster" \
resource.labels.project_id=PROJECT_ID \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
severity>=DEFAULT \
logName="projects/PROJECT_ID/logs/events" \
("Scale-down: removing empty node")
Replace the following:
- PROJECT_ID: your project ID.
- CLUSTER_NAME: the name of the cluster.
- COMPUTE_REGION: the cluster's Compute Engine region, such as us-central1.
EventResult event
An eventResult
event is emitted when a scaleUp or a scaleDown event
completes successfully or unsuccessfully. This event contains a list of event IDs
(from the eventId
field in scaleUp or scaleDown events), along with error
messages. An empty error message indicates that the event completed successfully. Results for multiple events are aggregated in the results field.
To diagnose errors, consult the ScaleUp errors and ScaleDown errors sections.
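To find only the eventResult entries that report an error, one approach is to filter on the presence of an errorMsg inside the results list, as in the example payload that follows. The nested field path here is an assumption based on that payload (Logging filters match repeated fields element-wise), so adjust it if it doesn't match your entries:
resource.type="k8s_cluster" \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
log_id("container.googleapis.com/cluster-autoscaler-visibility") \
jsonPayload.resultInfo.results.errorMsg.messageId:*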
Example
The following log sample shows an eventResult
event:
{
"resultInfo": {
"measureTime": "1582878896",
"results": [
{
"eventId": "2fca91cd-7345-47fc-9770-838e05e28b17"
},
{
"errorMsg": {
"messageId": "scale.down.error.failed.to.delete.node.min.size.reached",
"parameters": [
"test-cluster-default-pool-5c90f485-nk80"
]
},
"eventId": "ea2e964c-49b8-4cd7-8fa9-fefb0827f9a6"
}
]
}
}
NodePoolCreated event
A nodePoolCreated
event is emitted when cluster autoscaler with node auto-provisioning
enabled creates a new node pool. This event contains the name of the created
node pool and a list of its underlying MIGs. If the node pool was created because of a
scaleUp event, the eventId
of the corresponding scaleUp event is included in
the triggeringScaleUpId
field.
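To correlate a nodePoolCreated event with the scaleUp event that triggered it, you can search for the scaleUp event's eventId in the triggeringScaleUpId field shown in the example payload below. This is a sketch; SCALE_UP_EVENT_ID is a placeholder for the eventId value copied from the scaleUp event:
resource.type="k8s_cluster" \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
log_id("container.googleapis.com/cluster-autoscaler-visibility") \
jsonPayload.decision.nodePoolCreated.triggeringScaleUpId="SCALE_UP_EVENT_ID"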
Example
The following log sample shows a nodePoolCreated
event:
{
"decision": {
"decideTime": "1585838544",
"eventId": "822d272c-f4f3-44cf-9326-9cad79c58718",
"nodePoolCreated": {
"nodePools": [
{
"migs": [
{
"name": "test-cluster-nap-n1-standard--b4fcc348-grp",
"nodepool": "nap-n1-standard-1-1kwag2qv",
"zone": "us-central1-f"
},
{
"name": "test-cluster-nap-n1-standard--jfla8215-grp",
"nodepool": "nap-n1-standard-1-1kwag2qv",
"zone": "us-central1-c"
}
],
"name": "nap-n1-standard-1-1kwag2qv"
}
],
"triggeringScaleUpId": "d25e0e6e-25e3-4755-98eb-49b38e54a728"
}
}
}
NodePoolDeleted event
A nodePoolDeleted
event is emitted when cluster autoscaler with
node auto-provisioning
enabled deletes a node pool.
Example
The following log sample shows a nodePoolDeleted
event:
{
"decision": {
"decideTime": "1585830461",
"eventId": "68b0d1c7-b684-4542-bc19-f030922fb820",
"nodePoolDeleted": {
"nodePoolNames": [
"nap-n1-highcpu-8-ydj4ewil"
]
}
}
}
NoScaleUp event
A noScaleUp
event is periodically emitted when there are unschedulable Pods in
the cluster and cluster autoscaler cannot scale the cluster up to accommodate
the Pods.
- noScaleUp events are best-effort, that is, these events don't cover all possible reasons for why cluster autoscaler cannot scale up.
- noScaleUp events are throttled to limit the produced log volume. Each persisting reason is only emitted every couple of minutes.
- All the reasons can be arbitrarily split across multiple events. For example, there is no guarantee that all rejected MIG reasons for a single Pod group will appear in the same event.
- The list of unhandled Pod groups is truncated to 50 arbitrary entries. The
actual number of unhandled Pod groups can be found in the
unhandledPodGroupsTotalCount
field.
Reason fields
The following fields help to explain why scaling up did not occur:
- reason: Provides a global reason for why cluster autoscaler is prevented from scaling up. Refer to the NoScaleUp top-level reasons section for details.
- napFailureReason: Provides a global reason preventing cluster autoscaler from provisioning additional node pools (for example, node auto-provisioning is disabled). Refer to the NoScaleUp top-level node auto-provisioning reasons section for details.
- skippedMigs[].reason: Provides information about why a particular MIG was skipped. Cluster autoscaler skips some MIGs from consideration for any Pod during a scaling up attempt (for example, because adding another node would exceed cluster-wide resource limits). Refer to the NoScaleUp MIG-level reasons section for details.
- unhandledPodGroups: Contains information about why a particular group of unschedulable Pods does not trigger scaling up. The Pods are grouped by their immediate controller. Pods without a controller are in groups by themselves. Each Pod group contains an arbitrary example Pod and the number of Pods in the group, as well as the following reasons:
  - napFailureReasons: Reasons why cluster autoscaler cannot provision a new node pool to accommodate this Pod group (for example, Pods have affinity constraints). Refer to the NoScaleUp Pod-group-level node auto-provisioning reasons section for details.
  - rejectedMigs[].reason: Per-MIG reasons why cluster autoscaler cannot increase the size of a particular MIG to accommodate this Pod group (for example, the MIG's node is too small for the Pods). Refer to the NoScaleUp MIG-level reasons section for details.
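To review only noScaleUp events for a cluster, a filter along the following lines works in Logs Explorer. It assumes the noDecisionStatus.noScaleUp field path from the example payload that follows and the placeholders used by the other queries on this page:
resource.type="k8s_cluster" \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
log_id("container.googleapis.com/cluster-autoscaler-visibility") \
jsonPayload.noDecisionStatus.noScaleUp:*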
Example
The following log sample shows a noScaleUp
event:
{
"noDecisionStatus": {
"measureTime": "1582523362",
"noScaleUp": {
"skippedMigs": [
{
"mig": {
"name": "test-cluster-nap-n1-highmem-4-fbdca585-grp",
"nodepool": "nap-n1-highmem-4-1cywzhvf",
"zone": "us-central1-f"
},
"reason": {
"messageId": "no.scale.up.mig.skipped",
"parameters": [
"max cluster cpu limit reached"
]
}
}
],
"unhandledPodGroups": [
{
"napFailureReasons": [
{
"messageId": "no.scale.up.nap.pod.zonal.resources.exceeded",
"parameters": [
"us-central1-f"
]
}
],
"podGroup": {
"samplePod": {
"controller": {
"apiVersion": "v1",
"kind": "ReplicationController",
"name": "memory-reservation2"
},
"name": "memory-reservation2-6zg8m",
"namespace": "autoscaling-1661"
},
"totalPodCount": 1
},
"rejectedMigs": [
{
"mig": {
"name": "test-cluster-default-pool-b1808ff9-grp",
"nodepool": "default-pool",
"zone": "us-central1-f"
},
"reason": {
"messageId": "no.scale.up.mig.failing.predicate",
"parameters": [
"NodeResourcesFit",
"Insufficient memory"
]
}
}
]
}
],
"unhandledPodGroupsTotalCount": 1
}
}
}
NoScaleDown event
A noScaleDown
event is periodically emitted when there are nodes which are
blocked from being deleted by cluster autoscaler.
- Nodes that cannot be removed because their utilization is high are not included in noScaleDown events.
- NoScaleDown events are best effort, that is, these events don't cover all possible reasons for why cluster autoscaler cannot scale down.
- NoScaleDown events are throttled to limit the produced log volume. Each persisting reason will only be emitted every couple of minutes.
- The list of nodes is truncated to 50 arbitrary entries. The actual number of
nodes can be found in the
nodesTotalCount
field.
Reason fields
The following fields help to explain why scaling down did not occur:
- reason: Provides a global reason for why cluster autoscaler is prevented from scaling down (for example, a backoff period after recently scaling up). Refer to the NoScaleDown top-level reasons section for details.
- nodes[].reason: Provides per-node reasons for why cluster autoscaler is prevented from deleting a particular node (for example, there's no place to move the node's Pods to). Refer to the NoScaleDown node-level reasons section for details.
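Similarly, to review only noScaleDown events, you can filter on the noDecisionStatus.noScaleDown field from the example payload that follows. This sketch uses the same placeholders as the other queries on this page:
resource.type="k8s_cluster" \
resource.labels.location=COMPUTE_REGION \
resource.labels.cluster_name=CLUSTER_NAME \
log_id("container.googleapis.com/cluster-autoscaler-visibility") \
jsonPayload.noDecisionStatus.noScaleDown:*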
Example
The following log sample shows a noScaleDown
event:
{
"noDecisionStatus": {
"measureTime": "1582858723",
"noScaleDown": {
"nodes": [
{
"node": {
"cpuRatio": 42,
"mig": {
"name": "test-cluster-default-pool-f74c1617-grp",
"nodepool": "default-pool",
"zone": "us-central1-c"
},
"name": "test-cluster-default-pool-f74c1617-fbhk"
},
"reason": {
"messageId": "no.scale.down.node.no.place.to.move.pods"
}
}
],
"nodesTotalCount": 1,
"reason": {
"messageId": "no.scale.down.in.backoff"
}
}
}
}
Messages
The events emitted by the cluster autoscaler use parameterized messages to provide explanations for the event. The parameters field is provided along with the messageId field, as in the earlier example log for a noScaleUp event.
This section describes the various messageId values and their corresponding parameters. It does not cover all possible messages and may be extended at any time.
ScaleUp errors
You can find event error messages for scaleUp
events in the corresponding
eventResult
event, in the resultInfo.results[].errorMsg
field.
Message | Details | Parameters | Mitigation |
---|---|---|---|
"scale.up.error.out.of.resources" | Resource errors occur when you try to request new resources in a zone that cannot accommodate your request due to the current unavailability of a Compute Engine resource, such as GPUs or CPUs. | Failing MIG IDs. | Follow the resource availability troubleshooting steps in the Compute Engine documentation. |
"scale.up.error.quota.exceeded" | The scaleUp event failed because some of the MIGs couldn't be increased due to exceeded Compute Engine quota. | Failing MIG IDs. | Check the Errors tab of the MIG in the Google Cloud console to see what quota is being exceeded. After you know which quota is being exceeded, follow the instructions to request a quota increase. |
"scale.up.error.waiting.for.instances.timeout" | Scale up of a managed instance group failed due to a timeout. | Failing MIG IDs. | This message should be transient. If it persists, contact Cloud Customer Care for further investigation. |
"scale.up.error.ip.space.exhausted" | Can't scale up because instances in some of the managed instance groups ran out of IPs. This means that the cluster doesn't have enough unallocated IP address space to use to add new nodes or Pods. | Failing MIG IDs. | Follow the troubleshooting steps in Not enough free IP address space for Pods. |
"scale.up.error.service.account.deleted" | Can't scale up because the service account was deleted. | Failing MIG IDs. | Try to undelete the service account. If that procedure is unsuccessful, contact Cloud Customer Care for further investigation. |
Reasons for a noScaleUp event
A noScaleUp
event is periodically emitted when there are unschedulable Pods
in the cluster and cluster autoscaler cannot scale the cluster up to schedule
the Pods. noScaleUp
events are best-effort, and don't cover all possible cases.
NoScaleUp top-level reasons
Top-level reason messages for noScaleUp
events appear in the
noDecisionStatus.noScaleUp.reason
field. The message contains a top-level
reason for why cluster autoscaler cannot scale the cluster up.
Message | Details | Mitigation |
---|---|---|
"no.scale.up.in.backoff" | No scale up because scaling up is in a backoff period (temporarily blocked). This message can occur during scale up events with a large number of Pods. | This message should be transient. Check this error after a few minutes. If this message persists, contact Cloud Customer Care for further investigation. |
NoScaleUp top-level node auto-provisioning reasons
Top-level node auto-provisioning reason messages for noScaleUp
events appear
in the noDecisionStatus.noScaleUp.napFailureReason
field. The message contains
a top-level reason for why cluster autoscaler cannot provision new node pools.
Message | Details | Mitigation |
---|---|---|
"no.scale.up.nap.disabled" | Node auto-provisioning couldn't scale up because node auto-provisioning is not enabled at the cluster level. If node auto-provisioning is disabled, new nodes won't be automatically provisioned if the pending Pod has requirements that can't be satisfied by any existing node pools. | Review the cluster configuration and consider enabling node auto-provisioning. |
NoScaleUp MIG-level reasons
MIG-level reason messages for noScaleUp
events appear in the
noDecisionStatus.noScaleUp.skippedMigs[].reason
and
noDecisionStatus.noScaleUp.unhandledPodGroups[].rejectedMigs[].reason
fields.
The message contains a reason why cluster autoscaler can't increase the size of
a particular MIG.
Message | Details | Parameters | Mitigation |
---|---|---|---|
"no.scale.up.mig.skipped" | Cannot scale up a MIG because it was skipped during the simulation. | Reasons why the MIG was skipped (for example, missing a Pod requirement). | Review the parameters included in the error message and address why the MIG was skipped. |
"no.scale.up.mig.failing.predicate" | Can't scale up a node pool because of a failing scheduling predicate for the pending Pods. | Name of the failing predicate and reasons why it failed. | Review the Pod requirements, such as affinity rules, taints or tolerations, and resource requirements. |
NoScaleUp Pod-group-level node auto-provisioning reasons
Pod-group-level node auto-provisioning reason messages for noScaleUp
events
appear in the
noDecisionStatus.noScaleUp.unhandledPodGroups[].napFailureReasons[]
field. The
message contains a reason why cluster autoscaler cannot provision a new node
pool to schedule a particular Pod group.
Message | Details | Parameters | Mitigation |
---|---|---|---|
"no.scale.up.nap.pod.gpu.no.limit.defined" | Node auto-provisioning couldn't provision any node group because a pending Pod has a GPU request, but GPU resource limits are not defined at the cluster level. | Requested GPU type. | Review the pending Pod's GPU request, and update the cluster-level node auto-provisioning configuration for GPU limits. |
"no.scale.up.nap.pod.gpu.type.not.supported" | Node auto-provisioning did not provision any node group for the Pod because it has requests for an unknown GPU type. | Requested GPU type. | Check the pending Pod's configuration for the GPU type to ensure that it matches a supported GPU type. |
"no.scale.up.nap.pod.zonal.resources.exceeded" | Node auto-provisioning did not provision any node group for the Pod in this zone because doing so would either violate the cluster-wide maximum resource limits or exceed the available resources in the zone, or because no machine type could fit the request. | Name of the considered zone. | Review and update cluster-wide maximum resource limits, the Pod resource requests, or the available zones for node auto-provisioning. |
"no.scale.up.nap.pod.zonal.failing.predicates" | Node auto-provisioning did not provision any node group for the Pod in this zone because of failing predicates. | Name of the considered zone and reasons why predicates failed. | Review the pending Pod's requirements, such as affinity rules, taints, tolerations, or resource requirements. |
ScaleDown errors
You can find error event messages for scaleDown
events in the corresponding
eventResult
event, in the resultInfo.results[].errorMsg
field.
Event message | Details | Parameter | Mitigation |
---|---|---|---|
"scale.down.error.failed.to.mark.to.be.deleted" | A node couldn't be marked for deletion. | Failing node name. | This message should be transient. If it persists, contact Cloud Customer Care for further investigation. |
"scale.down.error.failed.to.evict.pods" | Cluster autoscaler can't scale down because some of the Pods couldn't be evicted from a node. | Failing node name. | Review the PodDisruptionBudget for the Pod and make sure the rules allow for eviction of application replicas when acceptable. To learn more, see Specifying a Disruption Budget for your Application in the Kubernetes documentation. |
"scale.down.error.failed.to.delete.node.min.size.reached" | Cluster autoscaler can't scale down because a node couldn't be deleted due to the cluster already being at minimal size. | Failing node name. | Review the minimum value set for node pool autoscaling and adjust the settings as necessary. To learn more, see Error: Nodes in the cluster have reached minimum size. |
Reasons for a noScaleDown event
A noScaleDown
event is periodically emitted when there are nodes which are
blocked from being deleted by cluster autoscaler. noScaleDown
events are
best-effort, and don't cover all possible cases.
NoScaleDown top-level reasons
Top-level reason messages for noScaleDown
events appear in the
noDecisionStatus.noScaleDown.reason
field. The message contains a top-level
reason why cluster autoscaler can't scale the cluster down.
Event message | Details | Mitigation |
---|---|---|
"no.scale.down.in.backoff" | Cluster autoscaler can't scale down because scaling down is in a backoff period (temporarily blocked). | This message should be transient, and can occur when there has been a recent scale up event. If the message persists, contact Cloud Customer Care for further investigation. |
"no.scale.down.in.progress" | Cluster autoscaler can't scale down because a previous scale down is still in progress. | This message should be transient, as the Pod will eventually be removed. If this message occurs frequently, review the termination grace period for the Pods blocking scale down. To speed up the resolution, you can also delete the Pod if it's no longer needed. |
NoScaleDown node-level reasons
Node-level reason messages for noScaleDown
events appear in the
noDecisionStatus.noScaleDown.nodes[].reason
field. The message contains a
reason why cluster autoscaler can't remove a particular node.
Event message | Details | Parameters | Mitigation |
---|---|---|---|
"no.scale.down.node.scale.down.disabled.annotation" | Cluster autoscaler can't remove a node from the node pool because the node is annotated with cluster-autoscaler.kubernetes.io/scale-down-disabled: true. | N/A | Cluster autoscaler skips nodes with this annotation without considering their utilization, and this message is logged regardless of the node's utilization factor. If you want cluster autoscaler to scale down these nodes, remove the annotation. |
"no.scale.down.node.node.group.min.size.reached" | Cluster autoscaler can't scale down because the node group size has reached its minimum size limit. This happens because removing nodes would violate the cluster-wide minimal resource limits defined in your node auto-provisioning settings. | N/A | Review the minimum value set for node pool autoscaling. If you want cluster autoscaler to scale down this node, adjust the minimum value. |
"no.scale.down.node.minimal.resource.limits.exceeded" | Cluster autoscaler can't scale down nodes because it would violate cluster-wide minimal resource limits. These are the resource limits set for node auto-provisioning. | N/A | Review your limits for memory and vCPU and, if you want cluster autoscaler to scale down this node, increase the limits. |
"no.scale.down.node.no.place.to.move.pods" | Cluster autoscaler can't scale down because there's no place to move Pods. | N/A | If you expect that the Pod should be rescheduled, review the scheduling requirements of the Pods on the underutilized node to determine if they can be moved to another node in the cluster. To learn more, see Error: No place to move Pods. |
"no.scale.down.node.pod.not.backed.by.controller" | Pod is blocking scale down because it's not backed by a controller. Specifically, the cluster autoscaler is unable to scale down an underutilized node due to a Pod that lacks a recognized controller. Allowable controllers include ReplicationController, DaemonSet, Job, StatefulSet, and ReplicaSet. | Name of the blocking Pod. | Set the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" for the Pod or define an acceptable controller. |
"no.scale.down.node.pod.has.local.storage" | Pod is blocking scale down because it has local storage. | Name of the blocking Pod. | Set the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" for the Pod if the data in its local storage is not critical. This error only occurs for clusters using a version earlier than 1.22. |
"no.scale.down.node.pod.not.safe.to.evict.annotation" | A Pod on the node has the safe-to-evict=false annotation. | Name of the blocking Pod. | If the Pod can be safely evicted, edit the manifest of the Pod and update the annotation to "cluster-autoscaler.kubernetes.io/safe-to-evict": "true". |
"no.scale.down.node.pod.kube.system.unmovable" | Pod is blocking scale down because it's a non-DaemonSet, non-mirrored Pod without a PodDisruptionBudget in the kube-system namespace. | Name of the blocking Pod. | By default, Pods in the kube-system namespace aren't removed by cluster autoscaler. To resolve this issue, either add a PodDisruptionBudget for the kube-system Pods or separate them from your application Pods by using node pool taints and tolerations. |
"no.scale.down.node.pod.not.enough.pdb" | Pod is blocking scale down because it doesn't have enough PodDisruptionBudget. | Name of the blocking Pod. | Review the PodDisruptionBudget for the Pod and consider making it less restrictive. To learn more, see Error: Not enough PodDisruptionBudget. |
"no.scale.down.node.pod.controller.not.found" | Pod is blocking scale down because its controller (for example, a Deployment or ReplicaSet) can't be found. | N/A | To determine what actions were taken that left the Pod running after its controller was removed, review the logs. To resolve this issue, manually delete the Pod. |
"no.scale.down.node.pod.unexpected.error" | Pod is blocking scale down because of an unexpected error. | N/A | The root cause of this error is unknown. Contact Cloud Customer Care for further investigation. |
What's next
- Learn more about cluster autoscaler.
- Learn about how to use node auto-provisioning.
- Troubleshoot cluster autoscaler not scaling down.
- Troubleshoot cluster autoscaler not scaling up.
- Watch a YouTube video about troubleshooting and resolving scaling issues.