Known Issues for Cloud Run for Anthos

This page lists known issues for Cloud Run for Anthos.

You can also check for existing issues or open new issues in the public issue trackers.

Services stuck in RevisionMissing due to missing MutatingWebhookConfiguration

Creation of a new service or a new service revision may become stuck in the state "RevisionMissing" due to a missing webhook configuration. You can confirm this using the command:

kubectl get mutatingwebhookconfiguration webhook.serving.knative.dev

which returns:

mutatingwebhookconfigurations.admissionregistration.k8s.io "webhook.serving.knative.dev" not found

Temporary workaround

Until a fix ships in an upcoming version, you can work around this issue as follows:

  1. Restart the webhook Pod to recreate the MutatingWebhookConfiguration:

    kubectl delete pod -n knative-serving -lapp=webhook
    kubectl get mutatingwebhookconfiguration --watch

  2. Restart the controllers:

    kubectl delete pod -n gke-system -listio=pilot
    kubectl delete pod -n knative-serving -lapp=controller

  3. Deploy a new revision for each service that has the RevisionMissing issue:

    gcloud run services update SERVICE --update-labels client.knative.dev/nonce=""

    replacing SERVICE with the name of the service.

  4. Repeat the above steps as needed if you experience the same issue when you deploy new revisions of the service.
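To find the services that step 3 applies to, a small helper along these lines may be useful. It is a sketch, not part of the official workaround: the function name is hypothetical, and it assumes the RevisionMissing reason is reported on each service's Ready condition, which is what the REASON column of `kubectl get ksvc` normally shows.

```shell
# Hypothetical helper: prints "NAMESPACE NAME" for each Knative service whose
# Ready condition reports the reason RevisionMissing (assumption: the reason
# surfaces on the Ready condition).
list_revision_missing_services() {
  kubectl get ksvc --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.status.conditions[?(@.type=="Ready")].reason}{"\n"}{end}' \
    | awk '$3 == "RevisionMissing" { print $1, $2 }'
}
```

Calling `list_revision_missing_services` prints one line per affected service, which you can then pass to the `gcloud run services update` command from step 3.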

Zonal clusters

When using a zonal cluster with Cloud Run for Anthos on Google Cloud, access to the control plane is unavailable during cluster maintenance.

During this period, Cloud Run for Anthos on Google Cloud may not work as expected. Services deployed in that cluster:

  • Are not shown in the Cloud Console or through the gcloud command-line tool
  • Cannot be deleted or updated
  • Will not automatically scale instances, but existing instances will continue to serve new requests

To avoid these issues, you can use a regional cluster, which provides a highly available control plane.

Default memory limit is not enforced through command line

When you deploy using the command line without the --memory argument, the deployed service has no memory limit. The service can then consume as much memory as is available on the node where the pod is running, which may have unexpected side effects.

When deploying through the UI, the default value of 256M is used unless the value is overridden.

You can work around this by defining a default memory limit for the namespace where you deploy services with Cloud Run for Anthos on Google Cloud. For more information, see Configuring default memory limits in the Kubernetes documentation.
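A namespace default of this kind is expressed as a Kubernetes LimitRange object. The following is a minimal sketch, assuming a hypothetical helper name and object name (`default-memory-limit`); the 256M value mirrors the UI default mentioned above and is only illustrative.

```shell
# Hypothetical helper: prints a LimitRange manifest that sets a default
# container memory limit for one namespace.
#   $1 = namespace, $2 = default memory limit (for example 256M)
render_default_memory_limit() {
  local namespace="$1" limit="$2"
  cat <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: default-memory-limit
  namespace: ${namespace}
spec:
  limits:
  - type: Container
    default:
      memory: ${limit}
EOF
}
```

To apply it, pipe the output to kubectl, for example `render_default_memory_limit default 256M | kubectl apply -f -`. The default applies only to containers created after the LimitRange exists, so existing services pick it up with their next revision.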

Default CPU limit is not enabled

When deploying using the command line or Console, the amount of CPU a service can use is not defined. This allows the service to consume all available CPU in the node where it is running, which may have unexpected side effects.

You can work around this by defining a default CPU limit for the namespace where you deploy services with Cloud Run for Anthos on Google Cloud. For more information, see Configuring default CPU limits in the Kubernetes documentation.

Note: By default, services deployed with Cloud Run for Anthos on Google Cloud request 400m CPU, which is used to schedule instances of a service on the cluster nodes.
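A default CPU limit can be set with a LimitRange as well. The sketch below assumes a hypothetical helper name and object name (`default-cpu-limit`), and the limit you pass in is your own choice. Because services request 400m CPU by default and a container's request cannot exceed its limit, choose a limit of at least 400m.

```shell
# Hypothetical helper: prints a LimitRange manifest that sets a default
# container CPU limit for one namespace.
#   $1 = namespace, $2 = default CPU limit (for example 1 or 500m)
render_default_cpu_limit() {
  local namespace="$1" limit="$2"
  cat <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: default-cpu-limit
  namespace: ${namespace}
spec:
  limits:
  - type: Container
    default:
      cpu: "${limit}"
EOF
}
```

For example, `render_default_cpu_limit default 1 | kubectl apply -f -` would cap each newly created container in the `default` namespace at one CPU unless the container sets its own limit.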

Istio 1.0 limitations

Cloud Run for Anthos on Google Cloud uses Istio 1.0 for networking, which limits the number of services and revisions that can exist in a cluster. For more information on these limitations, see Istio 1.0 performance and scalability.

Cloud Run for Anthos on Google Cloud should not be used to deploy more than 150 services or 300 active revisions in the same cluster.
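To see how close a cluster is to these limits, you could count the Knative resources in it. This is a rough sketch with hypothetical function names. It counts all revisions rather than only active ones, so the revision figure is an upper bound.

```shell
# Limits taken from the text above.
MAX_SERVICES=150
MAX_REVISIONS=300

# Counts resources of the given kind across all namespaces.
count_resources() {
  kubectl get "$1" --all-namespaces --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# Prints current counts next to the documented limits.
# Note: the revision count includes inactive revisions.
report_capacity() {
  local services revisions
  services="$(count_resources services.serving.knative.dev)"
  revisions="$(count_resources revisions.serving.knative.dev)"
  echo "services: ${services}/${MAX_SERVICES}"
  echo "revisions: ${revisions}/${MAX_REVISIONS}"
}
```

Run `report_capacity` against the cluster that your current kubectl context points to.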

Contents of read/write points in the container are always empty

If your container image creates files or folders in /var/log, for example /var/log/nginx, those files or folders are not visible when you run the container in Cloud Run for Anthos on Google Cloud. An empty read/write volume is mounted on /var/log, which hides the contents of the underlying container image.

If your service needs to write to a subdirectory of /var/log, the service must ensure that the folder exists at runtime before writing into the folder. It cannot assume that the folder exists from the container image.
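One way to do this is to create the folder in the container's entrypoint script before starting the main process. The following is a minimal sketch; the function name is hypothetical and /var/log/nginx is only the example path used above.

```shell
# Hypothetical entrypoint helper: creates the log directory at startup, because
# the empty volume mounted on /var/log hides directories baked into the image.
# mkdir -p succeeds whether or not the directory already exists.
ensure_log_dir() {
  mkdir -p "$1"
}
```

In an entrypoint script, you would call `ensure_log_dir /var/log/nginx` and then `exec` the server process so that it can write its logs immediately.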

Custom domain mapping fails in clusters upgraded to GKE 1.17

There is a known issue in Cloud Run for Anthos clusters with custom domain mappings. When those clusters are upgraded from GKE version 1.15 or 1.16 to GKE version 1.17, all deployments of new services fail.

If your cluster has a custom domain mapping, you should not upgrade it to GKE version 1.17.

Workaround

If you upgraded your cluster to GKE version 1.17 and are experiencing issues with service deployment, you must delete the VirtualService that was generated by the DomainMapping because it is no longer compatible with the new controller. After deleting it, the controller recreates a compatible VirtualService and resolves your service deployment issues.

Delete the VirtualService as follows. The name of each VirtualService is the same as the name of its DomainMapping, for example foo.example.com.

  1. Run the following command to list all of your DomainMappings:

    kubectl get domainmapping --all-namespaces
    
  2. Run the following command to delete the specified VirtualService:

    kubectl delete vs your-domain-mapping-name -n your-domain-mapping-namespace
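If the cluster has many DomainMappings, the two steps can be combined. The sketch below uses a hypothetical function name and only prints the delete commands instead of running them, so that you can review them first. It relies on the VirtualService sharing the name and namespace of its DomainMapping, as described above.

```shell
# Hypothetical helper: prints one "kubectl delete vs" command per DomainMapping.
# Nothing is deleted; the output is meant to be reviewed before use.
print_virtualservice_deletes() {
  kubectl get domainmapping --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' \
    | while read -r namespace name; do
        if [ -n "$name" ]; then
          echo "kubectl delete vs ${name} -n ${namespace}"
        fi
      done
}
```

After checking the printed commands, run the ones that apply to your cluster.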