This page lists known issues for Knative serving. For known security vulnerabilities, see Security best practices.
You can also check for existing issues or open new issues in the public issue trackers.
Also see the troubleshooting page for troubleshooting strategies and solutions to some common errors.
Services stuck in RevisionMissing due to missing MutatingWebhookConfiguration
Creation of a new service or a new service revision may become stuck in the state "RevisionMissing" due to a missing webhook configuration. You can confirm this using the command
kubectl get mutatingwebhookconfiguration webhook.serving.knative.dev
which returns
mutatingwebhookconfigurations.admissionregistration.k8s.io "webhook.serving.knative.dev" not found
Temporary workaround
Until this is fixed in an upcoming version, you can work around the issue as follows:
1. Restart the webhook Pod to recreate the MutatingWebhookConfiguration:
kubectl delete pod -n knative-serving -lapp=webhook
kubectl get mutatingwebhookconfiguration --watch
2. Restart the controllers:
kubectl delete pod -n gke-system -listio=pilot
kubectl delete pod -n knative-serving -lapp=controller
3. Deploy a new revision for each service that has the RevisionMissing issue:
gcloud run services update SERVICE --update-labels client.knative.dev/nonce=""
Replace SERVICE with the name of the service.
Repeat the above steps as needed if you experience the same issue when you deploy new revisions of the service.
Zonal clusters
When using a zonal cluster with Knative serving, access to the control plane is unavailable during cluster maintenance.
During this period, Knative serving may not work as expected. Services deployed in that cluster:
- Are not shown in the Cloud console or via gcloud CLI
- Cannot be deleted or updated
- Will not automatically scale instances, but existing instances will continue to serve new requests
To avoid these issues, you can use a regional cluster, which ensures a high-availability control plane.
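For example, a regional cluster can be created with the gcloud CLI. The following is a minimal sketch in which CLUSTER_NAME and REGION are placeholders; any additional flags required to enable Knative serving on the cluster should be taken from the setup documentation for your version:
gcloud container clusters create CLUSTER_NAME --region=REGION --num-nodes=3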
Default memory limit is not enforced through command line
If you use the command line to deploy your services, you must include the --memory flag to set a memory limit for that service. Excluding the --memory flag allows a service to consume up to the total amount of available memory on the node where that pod is running, which might have unexpected side effects.
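For example, the following command sets a memory limit when deploying from the command line. This is a sketch: SERVICE, IMAGE_URL, and the 512Mi value are placeholders, and any platform or cluster flags required by your setup must be added.
gcloud run deploy SERVICE --image=IMAGE_URL --memory=512Mi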
When deploying through the Google Cloud console, the default value of 256M is used unless a different value is specified.
To avoid having to define default limits for each service, you can choose to define a default memory limit for the namespace where you deploy those services. For more information, see Configuring default memory limits in the Kubernetes documentation.
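As a sketch, a default memory limit for a namespace can be set with a Kubernetes LimitRange. The namespace and the 512Mi/256Mi values below are illustrative assumptions:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-memory-limit
  namespace: NAMESPACE
spec:
  limits:
  - default:
      memory: 512Mi
    defaultRequest:
      memory: 256Mi
    type: Container
Apply the manifest with kubectl apply -f memory-limit.yaml; containers deployed to that namespace without an explicit memory limit then receive these defaults.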
Default CPU limit is not enabled
When deploying using the command line or the Google Cloud console, the amount of CPU a service can use is not defined. This allows the service to consume all available CPU on the node where it is running, which may have unexpected side effects.
You can work around this by defining a default CPU limit for the namespace where you are deploying services with Knative serving. For more information, see Configuring default CPU limits in the Kubernetes documentation.
Note: By default, services deployed with Knative serving request 400m of CPU, which is used to schedule instances of a service on the cluster nodes.
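Similarly, a default CPU limit can be defined per namespace with a LimitRange. This sketch assumes illustrative values of 1 CPU as the default limit and 400m as the default request:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-cpu-limit
  namespace: NAMESPACE
spec:
  limits:
  - default:
      cpu: "1"
    defaultRequest:
      cpu: 400m
    type: Container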
Deploying private container images in Artifact Registry
There is a known deployment issue that is caused by an authentication failure between Knative serving and Artifact Registry when private container images are deployed. To avoid issues when deploying private images in Artifact Registry you can either:
- Use the image digest of your private container images to deploy a service, as sketched below.
- Create and use an imagePullSecret, which allows you to use the image tag of your private container images; see the second sketch below. For details, see Deploying private container images from other container registries.
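The following sketches illustrate both options. The service name, location, project, repository, image, digest, secret name, namespace, and key file are placeholder assumptions.
Deploying by image digest:
gcloud run deploy SERVICE --image=LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE@sha256:DIGEST
Creating an imagePullSecret for Artifact Registry with a service account key file (the _json_key username is the documented convention for key-based authentication; confirm the details in the Artifact Registry documentation):
kubectl create secret docker-registry artifact-registry-secret \
  --docker-server=LOCATION-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat key.json)" \
  --namespace=NAMESPACE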
Configuration errors on clusters upgraded to version 0.20.0-gke.6
Clusters that are upgraded to version 0.20.0-gke.6 can receive one of the following errors.
When updating that cluster's configmap, the cluster can receive the following error:
Error from server (InternalError): error when replacing "/tmp/file.yaml":
Internal error occurred: failed calling webhook "config.webhook.istio.networking.internal.knative.dev":
the server rejected our request for an unknown reason
If the pods fail to start because of a queue proxy failure, the cluster can receive the following error:
Startup probe failed: flag provided but not defined: -probe-timeout
To resolve these errors, you must run the following command to remove the validatingwebhookconfiguration that is no longer supported in 0.20.0:
kubectl delete validatingwebhookconfiguration config.webhook.istio.networking.internal.knative.dev
After removing the unsupported configuration, you can proceed with updating your cluster's configmap.
Missing metrics after upgrading to Knative serving 0.23.0-gke.9
Issue: The following metrics are missing after upgrading your cluster version to 0.23.0-gke.9: Request count, Request latencies, and Container instance count.
Possible cause: The Metric Collector is disabled.
To determine if the Metric Collector is preventing your metrics from being collected:
1. Ensure that your version of Knative serving is 0.23.0-gke.9 by running the following command:
kubectl get deployment controller -n knative-serving -o jsonpath='{.metadata.labels.serving\.knative\.dev/release}'
2. Check if the Metric Collector is disabled by running the following command:
kubectl get cloudrun cloud-run -n cloud-run-system -o jsonpath='{.spec.metricscollector}'
The Metric Collector is disabled if the result is not {enabled: true}.
3. To enable the Metric Collector, run one of the following commands:
- If the result is empty, run:
kubectl patch cloudrun cloud-run -n cloud-run-system --type='json' -p='[{"op": "test", "path": "/spec", "value": null}, {"op": "add", "path": "/spec", "value": {}}]'
kubectl patch cloudrun cloud-run -n cloud-run-system --type='json' -p='[{"op": "test", "path": "/spec/metricscollector", "value": null}, {"op": "add", "path": "/spec/metricscollector", "value": {}}]'
kubectl patch cloudrun cloud-run -n cloud-run-system --type='json' -p='[{"op": "add", "path": "/spec/metricscollector/enabled", "value": true}]'
- If the result is {enabled: false}, run:
kubectl patch cloudrun cloud-run -n cloud-run-system --type='json' -p='[{"op": "replace", "path": "/spec/metricscollector/enabled", "value": true}]'
4. Verify that the Metric Collector is enabled by running the following command:
kubectl get cloudrun cloud-run -n cloud-run-system -o jsonpath='{.spec.metricscollector}'