This page provides troubleshooting strategies as well as solutions for some common problems.
If you see the "Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable." error message when deploying a new revision, use these steps to troubleshoot the problem.
Does your container run locally?
When troubleshooting Cloud Run, your first step should always be to confirm that you can run your container image locally. If the container image does not run locally, the root cause is not Cloud Run, and you need to diagnose and fix the issue locally first.
Is your container listening for requests on the expected port?
A common mistake is to not listen for incoming requests at all, or to listen for them on the wrong port.
As documented in the container runtime contract,
your container must listen for incoming requests on the port that
is defined by Cloud Run and provided in the
PORT environment variable.
If your container fails to listen on the expected port, the revision health check fails, the revision enters an error state, and traffic is not routed to it.
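The port lookup described above can be sketched as follows. This is a minimal illustration, not an official snippet: the `resolve_port` helper and the fallback value `8080` for local runs are assumptions made for the example.

```python
import os

# Assumed fallback for local runs when PORT is not set; hypothetical choice.
DEFAULT_PORT = 8080


def resolve_port(environ=os.environ):
    """Return the port to listen on, read from the PORT environment variable."""
    raw = environ.get("PORT", "")
    if not raw:
        return DEFAULT_PORT
    port = int(raw)  # raises ValueError if PORT is not numeric
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

Failing loudly on a malformed value makes a misconfigured port visible in the logs instead of silently listening somewhere unexpected.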
Is your container listening on all network interfaces?
A common reason for Cloud Run services failing to start is that
the server process inside the container is configured to listen on the
localhost (127.0.0.1) address. This refers to the loopback network interface, which is
not accessible from outside the container. The Cloud Run health check therefore
cannot reach the server, and the service deployment fails.
To solve this, configure your application to start the HTTP server so that it listens
on all network interfaces, commonly denoted as 0.0.0.0.
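As a rough sketch of listening on every interface, the following uses Python's standard library HTTP server bound to `0.0.0.0`. The `HelloHandler` and `make_server` names and the `8080` fallback are hypothetical, chosen only for this example.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny handler that answers every GET with a plain-text body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="0.0.0.0", port=None):
    """Create a server bound to all interfaces rather than 127.0.0.1."""
    if port is None:
        # Assumed local fallback of 8080 when PORT is not set.
        port = int(os.environ.get("PORT", "8080"))
    return HTTPServer((host, port), HelloHandler)
```

Calling `make_server().serve_forever()` starts the server. Passing `host="127.0.0.1"` instead would reproduce the loopback-only problem described above.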
Do you see errors in the logs?
Use Cloud Logging to look for application errors in
logs as described in the Cloud Run logging page.
You can also look for crashes captured in Error Reporting, as described
in the Cloud Run error reporting page.
See also the troubleshooting tutorial.
You probably need to update your code or your revision settings to fix the errors or crashes.
Are your container instances exceeding memory?
Your container instances might be exceeding the available memory.
To determine if this is the case, look for such errors in the
logs. If the instances are exceeding the available memory, consider
increasing the memory limit.
Note that the Cloud Run container instances run in an environment where
the files written to the local filesystem count towards the available memory.
This also includes any log files that are not written to
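Because files on the local filesystem count toward memory, it can help to measure how much a directory currently holds. The following is a small sketch for that purpose; the `directory_size_bytes` helper is hypothetical, and which directory to inspect depends on your application.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file disappeared between listing and stat; skip it.
                pass
    return total
```

Logging this value periodically for a scratch or cache directory gives a rough view of how much of the memory limit is taken up by files.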
Error: Forbidden when opening or calling the service URL
If you receive a
403 "Error: Forbidden" error message when accessing your
Cloud Run service, your client is not authorized to
invoke this service. You can address this by taking one of the following actions:
- If the service is meant to be invocable by anyone, update its IAM settings to make the service public.
- If the service is meant to be invocable only by certain identities, make sure that you invoke it with the proper authorization token.
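For the second case, a request carries the token in an `Authorization: Bearer` header. The sketch below only shows how such a request could be built; the `build_authorized_request` helper is hypothetical, and how you obtain the ID token is outside the scope of this example.

```python
import urllib.request


def build_authorized_request(service_url, id_token):
    """Build a request that presents an ID token as a bearer credential."""
    request = urllib.request.Request(service_url)
    request.add_header("Authorization", f"Bearer {id_token}")
    return request
```

Passing the returned object to `urllib.request.urlopen` sends the request. A 403 response at that point suggests the identity behind the token lacks permission to invoke the service.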
SIGNATURE_REMOVED_BY_GOOGLE when using ID token
This can be observed during development and testing in the following circumstance:
- The user logs in with a user account using the
gcloud command line or Cloud Shell.
- User generates an ID token via the
- The user then tries to use the ID token to invoke a non-public Cloud Run service.
This is by design. The token signature is removed for security reasons, to prevent any non-public Cloud Run service from replaying ID tokens generated in the way described above. Instead, to invoke a private service with a new ID token, refer to Testing authentication in your service.
Do you see error code 503 for long running requests?
If your service is processing long requests and you have increased the request timeout,
you might still see requests being terminated earlier, with error code 503.
This can be caused by your language framework's own request timeout setting, which
you also need to update. For example, Node.js developers need to update the
timeout setting of the HTTP server.
Do you see 503 errors under high load?
The Cloud Run load balancer strives to distribute incoming requests over the necessary number of container instances. However, if your container instances use a lot of CPU to process requests, they cannot process all of the requests, and some requests are returned with a 503 error code.
Do you see 429 errors?
If the service has reached its maximum number of container instances, requests are returned with a 429 error code. Try increasing this limit by increasing the "max instance" settings, or, if you need more than 1000 instances, by requesting a quota increase.
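On the calling side, a client can soften the impact of occasional 429 or 503 responses by retrying with exponential backoff. This is a general client-side technique and not something the service requires; the helper names, delay values, and the set of retried status codes below are all assumptions for illustration.

```python
import random
import time
import urllib.error
import urllib.request

# Status codes treated as transient in this sketch (assumed choice).
RETRYABLE_STATUS = {429, 503}


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Upper bound in seconds for the wait before retry number `attempt` (zero-based)."""
    return min(cap, base * (2 ** attempt))


def fetch_with_retry(url, max_attempts=5, opener=urllib.request.urlopen, sleep=time.sleep):
    """GET a URL, retrying on 429/503 with jittered exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            last_attempt = attempt == max_attempts - 1
            if err.code not in RETRYABLE_STATUS or last_attempt:
                raise
            # Random wait up to the backoff bound, so clients do not retry in lockstep.
            sleep(random.uniform(0, backoff_delay(attempt)))
```

Retrying does not raise the instance limit. It only spreads requests out while the service is saturated.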
Do your requests abort because no instance is available?
You can get the following error if the Cloud Run infrastructure didn't scale fast enough to catch up with the traffic spike:
The request was aborted because there was no available instance
This issue can be caused by one of the following:
- A huge sudden increase in traffic
- A long cold start time
- A long request processing time
- The service has reached its maximum container instance limit
Are you unable to deploy or to call other Google Cloud APIs?
Confirm that the Cloud Run service agent has
not been deleted and that it has the "Cloud Run Service Agent" IAM role.
It has an email ending with
Error: The feature is not supported in the declared launch stage
If you call the Cloud Run Admin API directly and use a beta feature without specifying a launch stage annotation, you can get the following error:
The feature is not supported in the declared launch stage
Direct Cloud Run Admin API users must annotate the resource with a
run.googleapis.com/launch-stage annotation as
BETA in the request if any beta
feature is used.
The following example adds a launch stage annotation to a service request:
kind: Service
metadata:
  annotations:
    run.googleapis.com/launch-stage: BETA
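If you build the request body in code, the same annotation can be added to the service manifest before sending it. The following is a minimal sketch that operates on a plain dictionary; the `with_launch_stage` helper is hypothetical and does not call the Admin API itself.

```python
import copy

LAUNCH_STAGE_KEY = "run.googleapis.com/launch-stage"


def with_launch_stage(service, stage="BETA"):
    """Return a copy of a service manifest dict with the launch stage annotation set."""
    annotated = copy.deepcopy(service)
    metadata = annotated.setdefault("metadata", {})
    annotations = metadata.setdefault("annotations", {})
    annotations[LAUNCH_STAGE_KEY] = stage
    return annotated
```

Working on a copy keeps the original manifest unchanged, which is convenient when the same base manifest is reused across requests.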
Is your issue caused by a limitation in the container sandbox?
If your container runs locally but fails in Cloud Run, the Cloud Run container sandbox might be responsible for the failure of your container.
In the Cloud Logging section of the GCP Console (not in the "Logs"
tab of the Cloud Run section), you can look for
"Container Sandbox" with a "DEBUG" severity in the
varlog/system logs or use the
Container Sandbox: Unsupported syscall setsockopt(0x3,0x1,0x6,0xc0000753d0,0x4,0x0)
If you suspect that these messages are related to the failure of your container, contact Support and add the log message to the support ticket. Google Cloud support might ask you to trace the system calls made by your service to diagnose lower-level system calls that are not surfaced in the Cloud Logging logs.
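When collecting these messages for a support ticket, it can help to extract the system call name from each log line. The sketch below assumes the message format matches the example shown above; the `parse_unsupported_syscall` helper and the pattern are illustrative and may need adjusting for other message variants.

```python
import re

# Pattern modeled on the example log message; other formats are not covered.
SANDBOX_PATTERN = re.compile(
    r"Container Sandbox: Unsupported syscall (?P<name>\w+)\((?P<args>[^)]*)\)"
)


def parse_unsupported_syscall(message):
    """Return (syscall_name, [args]) for a matching log message, else None."""
    match = SANDBOX_PATTERN.search(message)
    if match is None:
        return None
    args = [arg for arg in match.group("args").split(",") if arg]
    return match.group("name"), args
```

Grouping exported log entries by the returned name shows which system calls appear most often.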