When horizontal Pod autoscaling doesn't function as you expect in
Google Kubernetes Engine (GKE), your workloads might not scale correctly. This issue
can prevent applications from handling load, which can cause performance issues
or outages. You might see the number of Pods not increasing despite high CPU utilization, metric values showing as <unknown> in your HorizontalPodAutoscaler status, or scaling operations not occurring at all.
Use this page to diagnose and resolve common issues with horizontal Pod autoscaling, ranging from initial misconfigurations in your HorizontalPodAutoscaler objects to more complex failures within the metrics pipeline. By following these troubleshooting steps, you can help ensure that your applications scale efficiently and reliably based on demand, making effective use of the Horizontal Pod Autoscaler resource.
This information is important for Application developers who configure HorizontalPodAutoscaler objects and need to ensure their applications scale correctly. It also helps Platform admins and operators troubleshoot issues with the metrics pipeline or cluster configuration that affect all autoscaled workloads. For more information about the common roles and example tasks that we reference in Google Cloud content, see Common GKE user roles and tasks.
If you've already experienced symptoms or seen an error message, use the following table to find the right guidance:
Symptom | Possible resolution
---|---
No scaling, but HorizontalPodAutoscaler conditions are True | Troubleshoot a healthy but unresponsive HorizontalPodAutoscaler
You see a specific error message in the HorizontalPodAutoscaler events | Troubleshoot common Horizontal Pod Autoscaler errors
Metric value shows as <unknown> | Troubleshoot custom and external metrics
Not scaling down | Troubleshoot Horizontal Pod Autoscaler failing to scale down
Before you begin
- Make sure you use HorizontalPodAutoscaler objects with scalable workloads, such as Deployments and StatefulSets. You cannot use horizontal Pod autoscaling with workloads that cannot scale, for example, DaemonSets.
- To get the permissions that you need to troubleshoot horizontal Pod autoscaling in GKE, which includes inspecting HorizontalPodAutoscaler objects and viewing cluster logs, ask your administrator to grant you the following IAM roles on your project:

  - Inspect GKE resources: GKE Viewer (roles/container.viewer)
  - View cluster logs: Logs Viewer (roles/logging.viewer)

  For more information about granting roles, see Manage access to projects, folders, and organizations. You might also be able to get the required permissions through custom roles or other predefined roles.

- Configure the kubectl command-line tool to communicate with your GKE cluster:

  gcloud container clusters get-credentials CLUSTER_NAME \
      --location LOCATION \
      --project PROJECT_ID

  Replace the following:

  - CLUSTER_NAME: the name of your cluster.
  - LOCATION: the Compute Engine region or zone (for example, us-central1 or us-central1-a) for the cluster.
  - PROJECT_ID: your Google Cloud project ID.
Verify Horizontal Pod Autoscaler status and configuration
Start your troubleshooting by inspecting the HorizontalPodAutoscaler object's health and configuration. This initial check helps you identify and resolve basic misconfigurations, which are a common root cause of scaling problems.
Describe the HorizontalPodAutoscaler
To view the HorizontalPodAutoscaler's real-time calculations and recent scaling
decisions, use the kubectl describe hpa
command. This command provides a
summary of the HorizontalPodAutoscaler object and an Events
log that is
helpful for diagnosing issues:
kubectl describe hpa HPA_NAME -n NAMESPACE_NAME
Replace the following:
- HPA_NAME: the name of your HorizontalPodAutoscaler object.
- NAMESPACE_NAME: the namespace of your HorizontalPodAutoscaler object.
The output is similar to the following:
Name: php-apache-hpa
Reference: Deployment/php-apache
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): 1% (1m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale recommended size matches current size
ScalingActive True ValidMetricFound the HorizontalPodAutoscaler was able to successfully calculate a replica count
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 39m horizontal-pod-autoscaler New size: 4; reason: cpu resource utilization...
Normal SuccessfulRescale 26m horizontal-pod-autoscaler New size: 1; reason: cpu resource utilization...
In the output, the following three sections help you diagnose the issue:
- Metrics: this section displays current metric values compared to their targets. Check here to see if the HorizontalPodAutoscaler is receiving data. An <unknown> metric value indicates that the HorizontalPodAutoscaler hasn't fetched the metric or that the metrics pipeline is broken.
- Conditions: this high-level health check shows whether the HorizontalPodAutoscaler can fetch metrics (AbleToScale) and perform scaling calculations (ScalingActive). A False status in any of these conditions indicates a failure.
- Events: this section logs recent scaling actions, warnings, and errors from the HorizontalPodAutoscaler controller. It's often the first place to find specific error messages or reasons, such as FailedGetScale or FailedGetResourceMetric, which help you discover the source of the problem.
Check the HorizontalPodAutoscaler status in Deployments
To check the status of the HorizontalPodAutoscaler objects used with your Deployments, use the Google Cloud console:
In the Google Cloud console, go to the Workloads page.
Click the name of your Deployment.
Go to the Details tab and find the Autoscaling section.
Review the value in the Status row:
- A green checkmark means the HorizontalPodAutoscaler is configured and can read its metrics.
- An amber triangle means the HorizontalPodAutoscaler is configured, but is having trouble reading its metrics. This is a common issue with custom or external metrics. To resolve this issue, diagnose why the metrics are unavailable. For more information, see the Troubleshoot custom and external metrics section.
For other workload types such as StatefulSets, or for more detail, check the HorizontalPodAutoscaler object's manifest.
Check your HorizontalPodAutoscaler's manifest
The YAML manifest of your HorizontalPodAutoscaler object lets you view information about its configuration and its current state.
To view the YAML manifest, select one of the following options:
Console
In the Google Cloud console, go to the Object Browser page.
In the Object Kinds list, select the HorizontalPodAutoscaler checkbox and click OK.
Navigate to the autoscaling API group, then click the expander arrow for HorizontalPodAutoscaler.
Click the name of the HorizontalPodAutoscaler object that you want to inspect.
Review the YAML section, which displays the complete configuration of the HorizontalPodAutoscaler object.
kubectl
Run the following command:
kubectl get hpa HPA_NAME -n NAMESPACE_NAME -o yaml
Replace the following:
- HPA_NAME: the name of your HorizontalPodAutoscaler object.
- NAMESPACE_NAME: the namespace of your HorizontalPodAutoscaler object.
After you retrieve the manifest, look for these key sections:
- spec (your configuration):
  - scaleTargetRef: the workload (such as a Deployment) that the HorizontalPodAutoscaler is supposed to scale.
  - minReplicas and maxReplicas: the minimum and maximum replica settings.
  - metrics: the metrics that you configured for scaling (for example, CPU utilization or custom metrics).
- status (the HorizontalPodAutoscaler's live state):
  - currentMetrics: the most recent metric values that the HorizontalPodAutoscaler has observed.
  - currentReplicas and desiredReplicas: the current number of Pods and the number that the HorizontalPodAutoscaler wants to scale to.
  - conditions: the most valuable section for troubleshooting. This section shows the HorizontalPodAutoscaler's health:
    - AbleToScale: indicates whether the HorizontalPodAutoscaler can find its target and metrics.
    - ScalingActive: shows whether the HorizontalPodAutoscaler is allowed to calculate and perform scaling.
    - ScalingLimited: shows whether the HorizontalPodAutoscaler wants to scale but is being capped by your minReplicas or maxReplicas settings.
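For reference, the following abridged manifest is a minimal sketch that shows where these spec and status fields appear; the names example-hpa and example-deployment and the metric values are illustrative:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
  namespace: default
spec:
  scaleTargetRef:                # the workload that this HorizontalPodAutoscaler scales
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
status:
  currentReplicas: 2
  desiredReplicas: 2
  currentMetrics:
  - type: Resource
    resource:
      name: cpu
      current:
        averageUtilization: 40
  conditions:
  - type: AbleToScale
    status: "True"
  - type: ScalingActive
    status: "True"
  - type: ScalingLimited
    status: "False"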
Use advanced logging features
To gain deeper insight into your HorizontalPodAutoscaler object, use the following types of logs:
View Horizontal Pod Autoscaler events in Cloud Logging: use a log filter to find all Horizontal Pod Autoscaler events for a specific cluster. For example:
In the Google Cloud console, go to the Logs Explorer page.
In the query pane, enter the following query:
resource.type="k8s_cluster"
resource.labels.cluster_name="CLUSTER_NAME"
resource.labels.location="LOCATION"
logName="projects/PROJECT_ID/logs/events"
jsonPayload.involvedObject.kind="HorizontalPodAutoscaler"
Replace the following:
- CLUSTER_NAME: the name of the cluster that the HorizontalPodAutoscaler belongs to.
- LOCATION: the Compute Engine region or zone (for example, us-central1 or us-central1-a) for the cluster.
- PROJECT_ID: your project ID.
Click Run query and review the output.
View Horizontal Pod Autoscaler events: these logs provide structured, human-readable logs explaining how the HorizontalPodAutoscaler computes a recommendation, offering detailed insight into its decision-making process.
Troubleshoot a healthy but unresponsive HorizontalPodAutoscaler
This section helps you diagnose why your HorizontalPodAutoscaler might not be triggering any scaling actions, even when it appears healthy and reports no errors in its status or events.
Symptoms:
The HorizontalPodAutoscaler appears healthy, its conditions report True
, and
it shows no errors in its events. However, it still doesn't take any scaling
actions.
Cause:
Several factors can cause this expected behavior:
- Replica limits: the current number of replicas is already at the boundary set by the minReplicas or maxReplicas field in the HorizontalPodAutoscaler configuration.
- Tolerance window: Kubernetes uses a default 10% tolerance window to prevent scaling on minor metric fluctuations. Scaling only occurs if the ratio of the current metric to the target metric falls outside the 0.9 to 1.1 range. For example, if the target is 85% CPU and the current usage is 93%, the ratio is approximately 1.094 (93/85 ≈ 1.094). Because this value is less than 1.1, the Horizontal Pod Autoscaler doesn't scale up.
- Unready Pods: the Horizontal Pod Autoscaler includes only Pods with a Ready status in its scaling calculations. Pods that are stuck with a Pending status or are not becoming Ready (due to failing health checks or resource issues) are ignored and can prevent scaling.
- Sync period delay: the HorizontalPodAutoscaler controller checks metrics periodically. A delay of 15-30 seconds between a metric crossing the threshold and the initiation of a scaling action is normal.
- New metric latency: when a HorizontalPodAutoscaler uses a new custom metric for the first time, you might see a one-time latency of several minutes. This delay occurs because the monitoring system (such as Cloud Monitoring) must create the new time series when the first data point is written.
- Multiple metrics calculation: when you configure multiple metrics, the Horizontal Pod Autoscaler calculates the required replica count for each metric independently and then chooses the highest calculated value as the final number of replicas. Because of this behavior, your workload scales to meet the demands of the metric with the highest needs. For example, if CPU metrics calculate a need for 9 replicas, but a requests-per-second metric calculates a need for 15, the Horizontal Pod Autoscaler scales the Deployment to 15 replicas.
Resolution:
Try the following solutions:
- Replica limits: check the minReplicas and maxReplicas values in your HorizontalPodAutoscaler manifest or in the output of the kubectl describe command. Adjust these limits if they are preventing necessary scaling.
- Tolerance window: if scaling is required inside the default tolerance, configure a different tolerance value. Otherwise, wait for the metric to move outside the 0.9 to 1.1 ratio.
- Unready Pods: investigate why Pods are Pending or not Ready and resolve the underlying issues (for example, resource constraints, failing readiness probes). For troubleshooting tips, see Debug Pods in the Kubernetes documentation.
- Sync period delay and new metric latency: these latencies are normal. Wait for the sync period to complete or for the new custom metric time series to be created.
- Multiple metrics calculation: this is the intended behavior. If scale-up occurs based on one metric (such as requests-per-second), it correctly overrides another metric's lower calculation (such as CPU).
Troubleshoot common Horizontal Pod Autoscaler errors
The following sections provide solutions for specific error messages and event
reasons that you might encounter when inspecting your HorizontalPodAutoscaler's
status. You typically find these messages in the Events
section of the output
of the kubectl describe hpa
command.
Troubleshoot Horizontal Pod Autoscaler configuration errors
A misconfiguration in the HorizontalPodAutoscaler manifest, such as a mistyped field or conflicting configurations, causes the errors in this section.
Error: invalid metrics
You might see this error when the configuration for a metric within a HorizontalPodAutoscaler is syntactically incorrect or inconsistent.
Symptoms:
If the HorizontalPodAutoscaler can't calculate the required replicas due to a
configuration problem, its Events
section shows a reason of
FailedComputeMetricsReplicas
with a message similar to the following:
invalid metrics (1 invalid out of 1)
Cause:
This error usually means there's a mismatch between the metric type
and the
target
that you defined in your HorizontalPodAutoscaler manifest. For example,
you might have specified a type
of Utilization
, but provided a target value
of averageValue
instead of averageUtilization
.
Resolution:
Correct the HorizontalPodAutoscaler manifest so that the value of the target
field aligns with the metric type
:
- If type is Utilization, the value in the target field must be averageUtilization.
- If type is AverageValue, the value in the target field must be averageValue.
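For example, the following metrics snippet is a minimal sketch of a correctly aligned Resource metric; the CPU target of 60 is illustrative:

metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization          # Utilization pairs with averageUtilization
      averageUtilization: 60     # specifying averageValue here instead would cause the invalid metrics error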
Error: multiple services selecting the same target
You might see this error when using traffic-based autoscaling that has an incorrect Service configuration for your HorizontalPodAutoscaler.
Symptoms:
You notice the following error:
multiple services selecting the same target of HPA_NAME: SERVICE_NAME
This output includes the following values:
- HPA_NAME: the name of the HorizontalPodAutoscaler.
- SERVICE_NAME: the name of a Service.
Cause:
Traffic-based autoscaling is configured, but more than one Kubernetes Service is
targeting the HorizontalPodAutoscaler's scaleTargetRef
field. Traffic-based
autoscaling supports only a one-to-one relationship between the Service and the
autoscaled workload.
Resolution:
To fix this issue, make sure that only one Service's label selector matches your workload's Pods:
Find your workload's Pod labels:
kubectl get deployment HPA_TARGET_DEPLOYMENT \
    -n NAMESPACE \
    -o jsonpath='{.spec.template.metadata.labels}'
Replace the following:
- HPA_TARGET_DEPLOYMENT: the name of the Deployment that the HorizontalPodAutoscaler is targeting.
- NAMESPACE: the namespace of the Deployment.
The output is similar to the following:
{"app":"my-app", "env":"prod"}
Find all Services that match those labels by reviewing the spec.selector field for all Services in the namespace:

kubectl get services -n NAMESPACE -o yaml
Identify every Service whose selector matches the labels from the previous step. For example, both {"app": "my-app"} and {"app": "my-app", "env": "prod"} would match the example Pod labels.

Resolve the conflict by choosing one of the following options (see the sketch after these steps):

- Make the intended Service's selector unique by adding a new, unique label to your Deployment's spec.template.metadata.labels field. Then, update the one intended Service's spec.selector field to include this new label.
- Make other Service selectors more restrictive by changing the spec.selector field of all other conflicting Services so that they no longer match your workload's Pods.
Apply your changes:
kubectl apply -f MANIFEST_NAME
Replace
MANIFEST_NAME
with the name of the YAML file containing the updated Service or Deployment manifest.
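The following abridged manifests are a minimal sketch of the first option, using the example labels shown earlier; the label key role: hpa-target, the container image, and the names my-app and my-app-service are illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        env: prod
        role: hpa-target          # new, unique label on the Pod template
    spec:
      containers:
      - name: my-app
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-service            # the single Service used for traffic-based autoscaling
spec:
  selector:
    app: my-app
    role: hpa-target              # selector now matches only this workload's Pods
  ports:
  - port: 80
    targetPort: 8080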
Error: label is not allowed
Symptoms:
You notice the following error:
unable to fetch metrics from external metrics API: googleapi: Error 400: Metric label: 'LABEL_NAME' is not allowed
In this output, LABEL_NAME
is the name of the incorrect
label.
Cause:
The HorizontalPodAutoscaler manifest specifies an invalid label key in the
metric.selector.matchLabels
section and Cloud Monitoring doesn't recognize
or allow this key for the metric.
Resolution:
To resolve this issue, do the following:
- Identify the disallowed label name from the error message.
- Remove or correct this label key in the metric.selector.matchLabels section of your HorizontalPodAutoscaler manifest, as shown in the sketch after this list.
- Find a valid, filterable label key by consulting the Cloud Monitoring documentation for that metric.
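As a minimal sketch, assuming an external Pub/Sub metric, the following snippet filters on the resource.labels.subscription_id label; the metric name, subscription name, and target value are illustrative, and the label keys that Cloud Monitoring allows depend on the metric:

metrics:
- type: External
  external:
    metric:
      name: pubsub.googleapis.com|subscription|num_undelivered_messages
      selector:
        matchLabels:
          resource.labels.subscription_id: example-subscription   # use only label keys that Cloud Monitoring allows for this metric
    target:
      type: AverageValue
      averageValue: "30"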
Issue: Multiple HorizontalPodAutoscalers targeting the same workload
Configuring multiple HorizontalPodAutoscaler objects to manage the same workload causes conflicting and unpredictable scaling behavior.
Symptoms:
There is no specific Condition
or Reason
within a HorizontalPodAutoscaler's
status that directly indicates this conflict. Instead, you might observe the
following symptoms:
- The replica count of the workload can fluctuate unexpectedly.
- Scaling decisions might not seem to correspond to the metrics defined in any single HorizontalPodAutoscaler.
- When you view events, you might see alternating or contradictory
SuccessfulRescale
events from different HorizontalPodAutoscaler objects.
Cause:
This issue occurs because more than one HorizontalPodAutoscaler object within
the same namespace specifies the exact same workload in the
spec.scaleTargetRef
field. Each HorizontalPodAutoscaler independently
calculates replica counts and attempts to scale the workload based on its own
set of metrics and targets. Kubernetes doesn't block this configuration, but it
leads to erratic scaling adjustments because the HorizontalPodAutoscalers
compete with each other.
Resolution:
To avoid conflicts, define all scaling metrics in a single
HorizontalPodAutoscaler object. Each HorizontalPodAutoscaler calculates scaling
needs from its own spec.metrics
field, so merging them lets the chosen
HorizontalPodAutoscaler object consider all factors, such as CPU and
requests-per-second, together:
To identify which HorizontalPodAutoscalers target the same workload, get the YAML manifest for each HorizontalPodAutoscaler object and pay close attention to the spec.scaleTargetRef field in the output:

kubectl get hpa -n NAMESPACE_NAME -o yaml
Replace NAMESPACE_NAME with the namespace of your HorizontalPodAutoscaler object.

Look for any instances where different HorizontalPodAutoscaler resources have the same values for apiVersion, kind, and name within their scaleTargetRef field.

Consolidate metrics into a single HorizontalPodAutoscaler object:
- Choose one HorizontalPodAutoscaler object to keep. This HorizontalPodAutoscaler is the one that you modify.
- Examine the spec.metrics section in the manifest of each of the other HorizontalPodAutoscaler objects that target the same workload.
- Copy the metric definitions that you want to keep from the spec.metrics sections of the duplicate HorizontalPodAutoscaler objects.
- Paste these copied metric definitions into the spec.metrics array of the HorizontalPodAutoscaler that you decided to keep. The example manifest after these steps shows the result.
Apply your changes:
kubectl apply -f MANIFEST_NAME
Replace MANIFEST_NAME with the name of the HorizontalPodAutoscaler manifest that you decided to keep.

Delete the other HorizontalPodAutoscaler objects that were targeting the same workload:
kubectl delete hpa DUPLICATE_MANIFEST_NAME -n NAMESPACE_NAME
Replace
DUPLICATE_MANIFEST_NAME
with the name of the redundant HorizontalPodAutoscaler object that you want to delete.
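The consolidated manifest looks similar to the following minimal sketch; the workload name, metric names, and target values are illustrative:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: consolidated-hpa          # the HorizontalPodAutoscaler that you keep
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                  # the workload that both HorizontalPodAutoscalers targeted
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - type: Resource                # metric kept from the original HorizontalPodAutoscaler
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Pods                    # metric copied from the duplicate HorizontalPodAutoscaler
    pods:
      metric:
        name: requests_per_second
      target:
        type: AverageValue
        averageValue: "100"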
Troubleshoot workload and target errors
The errors in this section are caused by the Deployment, StatefulSet, or Pods being scaled, not the HorizontalPodAutoscaler object itself.
Error: Unable to get the target's current scale
You might see this error when the HorizontalPodAutoscaler cannot locate or access the workload it's supposed to be scaling.
Symptoms:
The Events
section has a condition of FailedGetScale
with a message
similar to the following:
the HorizontalPodAutoscaler controller was unable to get the target's current scale: WORKLOAD_TYPE.apps "TARGET_WORKLOAD" not found
This output includes the following values:
- WORKLOAD_TYPE: the type of workload, such as Deployment or StatefulSet.
- TARGET_WORKLOAD: the name of the workload.
Cause:
The HorizontalPodAutoscaler controller can't find the workload (such as a Deployment or StatefulSet) that it's configured to manage. The problem is caused by an error in the scaleTargetRef field within the HorizontalPodAutoscaler's manifest: the specified resource might not exist, might have been deleted, or might have a spelling mistake.
Resolution:
Try the following solutions:
- Verify the scaleTargetRef field of the HorizontalPodAutoscaler's manifest: make sure that the values of name, kind, and apiVersion in the scaleTargetRef field exactly match the corresponding metadata of your target workload. If the workload name is incorrect, update the HorizontalPodAutoscaler's scaleTargetRef field to point to the correct name.
- Confirm that the workload exists: ensure that the target workload exists in the same namespace as the HorizontalPodAutoscaler. You can check this with a command such as kubectl get deployment DEPLOYMENT_NAME. If you intentionally deleted the workload, delete the corresponding HorizontalPodAutoscaler object to clean up your cluster. If you need to re-create the workload, the HorizontalPodAutoscaler automatically finds it after it's available, and the error resolves.
- Check that the HorizontalPodAutoscaler and workload are in the same namespace: the HorizontalPodAutoscaler and its target workload must be in the same namespace. If you forget to specify a namespace when creating an object with kubectl commands, Kubernetes places the object in the default namespace. This behavior can cause a mismatch if your HorizontalPodAutoscaler is in the default namespace while your workload is in another, or the other way around. Check the namespace for both objects and ensure that they match.
After the HorizontalPodAutoscaler successfully locates its target, the condition
AbleToScale
becomes True
, and the message changes to: the
HorizontalPodAutoscaler controller was able to get the target's current scale
.
Error: Unable to compute the replica count
You might see this error when the HorizontalPodAutoscaler needs to calculate scaling based on resource utilization but lacks the necessary baseline information from the Pods.
Symptoms:
The ScalingActive
condition is False
with a Reason
of
FailedGetResourceMetric
. You typically also see a message similar to the
following:
the HorizontalPodAutoscaler was unable to compute the replica count
Cause:
The Horizontal Pod Autoscaler needs to calculate resource utilization as a
percentage to scale the workload, but it cannot perform this calculation because
at least one container within the Pod specification is missing a
resources.requests
definition for the corresponding resource (cpu
or
memory
).
Resolution:
To resolve this issue, update the Pod manifest within your Deployment,
StatefulSet, or other controller to include a resources.requests
field for the
resource (cpu
or memory
) that the HorizontalPodAutoscaler is trying to scale
on for all containers in the Pod. For example:
apiVersion: v1
kind: Pod
metadata:
name: example-pod
spec:
containers:
- name: example-container
...
resources:
requests:
cpu: "100m"
memory: "128Mi"
Error: unable to fetch Pod metrics for Pod
You might see this error when there's a problem retrieving the metrics required for a HorizontalPodAutoscaler to make scaling decisions. It's often related to Pod resource definitions.
Symptoms:
You observe a persistent message similar to the following:
unable to fetch pod metrics for pod
It's normal to see this message temporarily when the metrics server starts up.
Cause:
To scale based on resource utilization percentage (such as cpu
or memory
),
every container within the Pods targeted by the HorizontalPodAutoscaler object
must have the resources.requests
field defined for that specific resource.
Otherwise, the HorizontalPodAutoscaler cannot perform the calculations that it
needs to, and takes no action related to that metric.
Resolution:
If these error messages persist and you notice that Pods are not scaling for your workload, ensure you have specified resource requests for each container in your workload.
Troubleshoot metrics API and data availability errors
The following sections help you resolve errors that occur when the
HorizontalPodAutoscaler tries to fetch data from a metrics API. These issues can
range from internal cluster communication failures, where the metrics API is
unavailable, to invalid queries that the metrics provider rejects (often seen as
400-level HTTP errors).
Error: no known available metric versions found
Symptoms:
You notice the following error:
unable to fetch metrics from custom metrics API: no known available metric versions found
Cause:
This error indicates a communication breakdown within the cluster, not a problem with the metrics source (such as Cloud Monitoring). Common causes include the following:
- The Kubernetes API server is temporarily unavailable (for example, during a cluster upgrade or control plane repair).
- The metrics adapter Pods (for example,
custom-metrics-stackdriver-adapter
) are unhealthy, not running, or aren't correctly registered with the API server.
Resolution:
This issue is often temporary. If it persists, try the following solutions:
Check the health of the Kubernetes control plane:
In the Google Cloud console, view your cluster's health and status.
Go to the Kubernetes clusters page.
Check the Status and Notifications columns for your cluster.
Click
Notifications to look for any ongoing operations such as upgrades or repairs. The API server might be briefly unavailable during these times.
Review Cloud Audit Logs for any errors related to control plane components. For information about how to view these logs, see GKE audit logging information.
Check the health and logs of the metrics adapter Pods: ensure that the metrics adapter Pods have a status of Running and have no recent restarts:

kubectl get pods -n custom-metrics,kube-system -o wide

If a Pod's status is anything other than Running, or it has a high restart count, investigate the Pod to find the root cause. For troubleshooting tips, see Debug Pods in the Kubernetes documentation.

Verify that the metrics APIs are registered and available:

kubectl get apiservice | grep metrics.k8s.io
If the metrics APIs are healthy, the output is similar to the following:
NAME                              SERVICE                                              AVAILABLE   AGE
v1beta1.custom.metrics.k8s.io     custom-metrics/custom-metrics-stackdriver-adapter   True        18d
v1beta1.external.metrics.k8s.io   custom-metrics/custom-metrics-stackdriver-adapter   True        18d
v1beta1.metrics.k8s.io            kube-system/metrics-server                           True        18d
If the AVAILABLE column has a value of False, the Message column in the full APIService manifest might provide more details.

You can view the full manifest with the following command:

kubectl get apiservice API_SERVICE_NAME -o yaml

Replace API_SERVICE_NAME with the name of the APIService object, such as v1beta1.custom.metrics.k8s.io.
Error: query will not return any time series
Symptoms:
You notice the following error:
unable to fetch metrics from custom or external metrics API: googleapi: Error
400: The supplied filter [...] query will not return any time series
Cause:
The query sent to Cloud Monitoring was valid, but it returned no data. This
means no data points exist that match your filter (which is different from
finding a metric with a value of 0
). The most likely reason for this issue is
that the application or workload responsible for generating the custom metric
was not writing data to Cloud Monitoring during the time that the errors were
reported.
Resolution:
Try the following solutions:
- Verify configuration: ensure the metric names and labels in your HorizontalPodAutoscaler object precisely match those being emitted by the application.
- Check permissions: confirm the application is correctly configured with the necessary permissions and API endpoints to publish metrics to Cloud Monitoring.
- Confirm application activity: verify that the application responsible for the metric was operational and attempting to send data to Cloud Monitoring during the timeframe that the Horizontal Pod Autoscaler warnings occurred.
- Investigate for errors: check the application's logs from the same timeframe for any explicit errors related to metric emission, such as connection failures, invalid credentials, or formatting issues.
Troubleshoot custom and external metrics
When your HorizontalPodAutoscaler relies on metrics from sources other than default CPU or memory, issues can occur within the custom or external metrics pipeline. This pipeline consists of the HorizontalPodAutoscaler controller, the Kubernetes metrics API server, the metrics adapter, and the metric source (for example, Cloud Monitoring or Prometheus).
This section details how to debug this pipeline, from the metrics adapter to the metrics source.
Symptoms:
The most common symptoms of an issue in the metrics pipeline are the following:
- The metric value shows as <unknown>.
- HorizontalPodAutoscaler events show errors such as FailedGetExternalMetric or FailedGetCustomMetric.
Cause:
An issue exists within the custom or external metrics pipeline.
Resolution:
Use the following steps to help you debug the pipeline:
Check if the metrics adapter is registered and available: the metrics adapter must register itself with the main Kubernetes API server to serve metrics. This action is the most direct way to see if the adapter is running and reachable by the API server:
kubectl get apiservice | grep -E 'NAME|metrics.k8s.io'
In the output, you should see the v1beta1.custom.metrics.k8s.io or v1beta1.external.metrics.k8s.io entries and a value of True in the Available column. For example:

NAME                     SERVICE                      AVAILABLE   AGE
v1beta1.metrics.k8s.io   kube-system/metrics-server   True        18d
If the value in the Available column is False or missing, your adapter has likely crashed or is misconfigured. Check the adapter's Pod logs in the kube-system or custom-metrics namespace for errors related to permissions, network connectivity to the metric source, or messages indicating that the metric couldn't be found.

If the value is True, proceed to the next step.
Query the metrics API directly: if the adapter is available, bypass the HorizontalPodAutoscaler and ask the Kubernetes API for your metric directly. This command tests the entire pipeline, from the API server to the metrics adapter to the data source.
To query external metrics, run the following command:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/NAMESPACE_NAME/METRIC_NAME" | jq .
To query custom Pod metrics, run the following command:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/NAMESPACE_NAME/pods/*/METRIC_NAME" | jq .
Replace the following:
- NAMESPACE_NAME: the namespace where your Pods are running.
- METRIC_NAME: the name of the custom or external metric that you're trying to query, for example, requests_per_second or queue_depth.
Analyze the command output: the result of the previous commands tells you where the problem is. Choose the scenario that matches your output:
- Successful JSON response with a value: the metrics pipeline is working correctly. The problem is likely a configuration issue in your HorizontalPodAutoscaler manifest. Check for spelling mistakes in the metric name or incorrect matchLabels.
- Error: Error from server (Service Unavailable): this error typically indicates a network connectivity problem, which is often a firewall issue in clusters that use network isolation. To resolve it, do the following:

  Identify the metrics adapter Service. It's usually in the custom-metrics or kube-system namespace:

  kubectl get service -n custom-metrics,kube-system | grep -E 'adapter|metrics'
Find the port that the adapter is listening on:
kubectl get service ADAPTER_SERVICE -n ADAPTER_NAMESPACE -o yaml
Replace the following:
- ADAPTER_SERVICE: the name of the Kubernetes Service associated with the metrics adapter that you deployed. This is the Service that you found in the preceding step, and it exposes the adapter's functionality to other parts of the cluster, including the Kubernetes API server.
- ADAPTER_NAMESPACE: the namespace where the adapter Service resides (for example, custom-metrics or kube-system).
Find the inbound firewall rules for your cluster's control plane:
gcloud compute firewall-rules list \
    --filter="name~gke-CLUSTER_NAME-[0-9a-z]*-master"
Replace CLUSTER_NAME with the name of your cluster.

Add the adapter's targetPort to the rule:

Describe the current rule to see existing allowed ports:
gcloud compute firewall-rules describe FIREWALL_RULE_NAME
Replace FIREWALL_RULE_NAME with the name of the firewall rule that governs network traffic to the control plane of your Kubernetes cluster.

Update the rule to add the adapter port to the list:
gcloud compute firewall-rules update FIREWALL_RULE_NAME \
    --allow tcp:443,tcp:10250,tcp:ADAPTER_PORT
Replace
ADAPTER_PORT
with the network port that the metrics adapter is listening on.
Ensure Kubernetes Network Policies are not blocking traffic to the metrics adapter Pods:
kubectl get networkpolicy -n custom-metrics,kube-system
Review any policies to ensure that they allow ingress traffic from the control plane or API server to the ADAPTER_SERVICE on the ADAPTER_PORT.
- An empty list []: this output means that the adapter is running but can't retrieve the specific metric, which indicates a problem with either the adapter's configuration or the metric source itself.

  Adapter Pod issues: inspect the logs of the metrics adapter Pod or Pods for errors related to API calls, authentication, or metric fetching. To inspect the logs, do the following:

  Find the name of the adapter Pod:

  kubectl get pods -n ADAPTER_NAMESPACE

  View its logs:

  kubectl logs ADAPTER_POD_NAME \
      -n ADAPTER_NAMESPACE

  Replace the following:

  - ADAPTER_POD_NAME: the name of the adapter Pod that you identified in the previous step.
  - ADAPTER_NAMESPACE: the namespace where the adapter Pod resides (for example, custom-metrics or kube-system).

  No data at the source: the metric might not exist in the source system. Use a monitoring tool, such as Metrics Explorer, to confirm that the metric exists and has the correct name and labels.
Troubleshoot Horizontal Pod Autoscaler failing to scale down
This section helps you understand why a HorizontalPodAutoscaler might not be scaling down your workload as expected.
Symptoms:
The HorizontalPodAutoscaler successfully scales the workload up, but fails to scale it back down even when metrics such as CPU utilization are low.
Cause:
This behavior is designed to prevent rapid scale-up and scale-down cycles and to avoid scaling down based on incomplete information. The main reasons are the following:

- Using multiple metrics: the Horizontal Pod Autoscaler scales based on the metric that requires the most replicas. If you have multiple metrics, the workload won't scale down unless all metrics indicate that fewer replicas are needed. One metric demanding a high replica count prevents scale-down, even if the others are low.
- Unavailable metrics: if any metric becomes unavailable (often showing as <unknown>), the Horizontal Pod Autoscaler conservatively refuses to scale down the workload. It can't determine whether the metric is missing because usage is truly zero or because the metrics pipeline is broken. This issue is common with rate-based custom metrics (for example, messages_per_second), which can stop reporting data when there is no activity, causing the Horizontal Pod Autoscaler to see the metric as unavailable and halt scale-down operations.
- Scale-down delay from the scaling policy: the HorizontalPodAutoscaler's behavior field lets you configure scaling policies. The default policy for scaling down includes a stabilization window of 300 seconds (five minutes). During this window, the HorizontalPodAutoscaler won't reduce the replica count, even when metric values have fallen below the target threshold. This window prevents rapid fluctuations, but can make scale-down slower than expected.
Resolution:
Try the following solutions:
For multiple metrics and unavailable metrics, diagnose the metric that's causing issues:

kubectl describe hpa HPA_NAME -n NAMESPACE_NAME

In the output, look in the Metrics section for any metric with a status of <unknown> and in the Events section for warnings such as FailedGetCustomMetric or FailedGetExternalMetric. For detailed pipeline debugging, see the Troubleshoot custom and external metrics section.

For unavailable metrics, if a metric becomes unavailable during periods of low traffic (common with rate-based metrics), try one of the following solutions:

- Use gauge-based metrics instead of rate-based ones where possible. A gauge metric, such as the total number of messages in a queue (for example, subscription or num_undelivered_messages), consistently reports a value, even if that value is 0, allowing the Horizontal Pod Autoscaler to make scaling decisions reliably.
- Ensure that your metric source reports zero values. If you control the custom metric, configure it to publish a 0 during periods of inactivity instead of sending no data at all.

For scale-down delays from the scaling policy, if the default five-minute stabilization window for scale-downs is too long, customize it. Inspect the spec.behavior.scaleDown section of your HorizontalPodAutoscaler manifest. You can lower the stabilizationWindowSeconds value to allow the autoscaler to scale down more quickly after the metrics drop, as shown in the sketch that follows. For more information about configuring these policies, see Scaling Policies in the Kubernetes documentation.
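As a minimal sketch, the following behavior section lowers the scale-down stabilization window to 60 seconds; the window and policy values are illustrative and should match your workload's tolerance for replica churn:

spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60   # default is 300 seconds (five minutes)
      policies:
      - type: Percent
        value: 50                      # remove at most 50% of the current replicas per period
        periodSeconds: 60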
What's next
If you can't find a solution to your problem in the documentation, see Get support for further help, including advice on the following topics:
- Opening a support case by contacting Cloud Customer Care.
- Getting support from the community by asking questions on StackOverflow and using the google-kubernetes-engine tag to search for similar issues. You can also join the #kubernetes-engine Slack channel for more community support.
- Opening bugs or feature requests by using the public issue tracker.