Troubleshooting Cloud Run

This page provides troubleshooting strategies as well as solutions for some common problems.

If you see the "Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable." error message when deploying a new revision, use these steps to troubleshoot the problem.

Does your container run locally?

When troubleshooting Cloud Run, your first step should always be to confirm that you can run your container image locally. If the image does not run locally, the root cause is not in Cloud Run, and you need to diagnose and fix the issue locally first.
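
Once the container is running locally, a quick connectivity check can confirm that it actually accepts connections. The following is a minimal sketch, not an official tool: `wait_for_port` is a hypothetical helper, and it assumes you started the container with your usual tooling and published its port on your machine.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 10.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means some process is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet (or refused); retry until the deadline.
            time.sleep(0.2)
    return False
```

For example, `wait_for_port("127.0.0.1", 8080)` checks a container whose port was published locally as 8080 (an assumed value). A `False` result suggests that the process crashed or never started listening.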

Is your container listening for requests on the expected port?

A common mistake is not listening for incoming requests at all, or listening on the wrong port.

As documented in the container runtime contract, your container must listen for incoming requests on the port that is defined by Cloud Run and provided in the PORT environment variable.

If your container fails to listen on the expected port, the revision health check fails, the revision enters an error state, and no traffic is routed to it.
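
Reading the port from the environment can be sketched as follows. `resolve_port` is a hypothetical helper, and the fallback of 8080 is an assumption for local runs where PORT is not set.

```python
import os


def resolve_port(default: int = 8080) -> int:
    """Return the port from the PORT environment variable, or a local default."""
    raw = os.environ.get("PORT", "").strip()
    if not raw:
        # PORT is unset (for example, when running outside Cloud Run).
        return default
    port = int(raw)  # raises ValueError if PORT is not numeric
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

Failing loudly on an invalid value makes a misconfiguration show up in the logs at startup instead of as a silent health check failure.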

Is your container listening on all network interfaces?

A common reason for Cloud Run services failing to start is that the server process inside the container is configured to listen on the localhost (127.0.0.1) address. This address refers to the loopback network interface, which is not accessible from outside the container. As a result, the Cloud Run health check cannot reach the server, and the deployment fails.

To solve this, configure your application so that its HTTP server listens on all network interfaces, commonly denoted as 0.0.0.0.
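
As an illustration, here is a minimal sketch using Python's standard library HTTP server. `HelloHandler` and `make_server` are hypothetical names; a real service would normally use its framework's own host setting instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET request with a short plain-text body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int) -> HTTPServer:
    # "0.0.0.0" binds every network interface. Binding "127.0.0.1" instead
    # would accept loopback traffic only, which is the failure described above.
    return HTTPServer(("0.0.0.0", port), HelloHandler)
```

Calling `make_server(port).serve_forever()` with the value taken from the PORT environment variable starts the server.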

Do you see errors in the logs?

You should use Stackdriver Logging to look for application errors in stdout or stderr logs. You can also look for crashes captured in Stackdriver Error Reporting. You will probably need to update your code or your revision settings to fix these errors or crashes.
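
For errors to show up in these logs, the application has to write them to stdout or stderr. The sketch below emits one JSON object per line. It assumes that the logging backend parses single-line JSON and recognizes `severity` and `message` fields, which you should verify against the logging documentation; `log` is a hypothetical helper.

```python
import json
import sys


def log(severity: str, message: str, **fields) -> str:
    """Write one JSON log line; ERROR and CRITICAL go to stderr, the rest to stdout."""
    entry = {"severity": severity, "message": message, **fields}
    line = json.dumps(entry)
    stream = sys.stderr if severity in ("ERROR", "CRITICAL") else sys.stdout
    # flush=True so the line is not lost if the process crashes right after.
    print(line, file=stream, flush=True)
    return line
```

For example, `log("ERROR", "database connection failed", attempt=3)` writes a single line to stderr that carries the extra `attempt` field.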

Are your container instances exceeding memory?

Your container instances might be exceeding the available memory. To determine if this is the case, look for such errors in the varlog/system logs. If the instances are exceeding the available memory, consider increasing the memory limit.

Note that the Cloud Run container instances run in an environment where the files written to the local filesystem count towards the available memory. This also includes any log files that are not written to /var/log/* or /dev/log.
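
To estimate how much memory local files are consuming, you can total the size of the files under a directory. This is a rough sketch: `dir_size_bytes` is a hypothetical helper, and it reports file sizes only, not the exact figure that the platform accounts for.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of regular files under root, skipping symbolic links."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file disappeared or is unreadable; ignore it.
                pass
    return total
```

Logging the result for the directories your service writes to (for example, a temporary directory) can show whether file growth coincides with the memory errors.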

403 "Error: Forbidden" when opening or calling the service URL?

If you receive a 403 "Error: Forbidden" error message when accessing your Cloud Run service, it means that your client is not authorized to invoke this service. You can address this by taking one of the following actions:

Do you see error code 503 for long running requests?

If your service is processing long requests and you have increased the request timeout, you might still see requests being terminated earlier, with error code 503. This can be caused by your language framework's own request timeout setting, which you also need to update. For example, Node.js developers need to update the server.timeout property.
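
Other frameworks have comparable settings. As an assumption that goes beyond this page, a Python service served by gunicorn could raise the worker timeout in a `gunicorn.conf.py` file, as in the sketch below. The value 900 is a placeholder, not a recommendation.

```python
# gunicorn.conf.py (hypothetical file, assuming the service is served by gunicorn)

# Seconds a worker may stay silent before gunicorn kills and restarts it.
# Placeholder value: set it at least as high as the request timeout
# configured on the Cloud Run service.
timeout = 900
```

If the framework timeout stays lower than the service's request timeout, the framework ends the request first, which produces the early termination described above.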

Do you see 503 errors under high load?

The Cloud Run load balancer strives to distribute incoming requests over the necessary number of container instances. However, if your container instances use a lot of CPU to process requests, they cannot process all of the requests, and some requests are returned with a 503 error code.

To mitigate this, try lowering the concurrency. Start from concurrency = 1 and gradually increase it to find an acceptable value. Refer to Setting concurrency for more details.
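
While you adjust the concurrency, it helps to measure how many requests fail at each setting. The sketch below sends a batch of parallel requests and tallies the status codes. `fetch_status` and `count_statuses` are hypothetical helpers, and the URL you pass in is your own service URL.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url: str, timeout_s: float = 30.0) -> int:
    """Return the HTTP status code of one GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses, including 503.
        return err.code


def count_statuses(url: str, total: int, parallel: int) -> Counter:
    """Send `total` requests, at most `parallel` at a time, and tally the codes."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return Counter(pool.map(fetch_status, [url] * total))
```

Running `count_statuses(url, 200, 20)` after each concurrency change and comparing the count under key 503 gives a simple way to find an acceptable value. Network failures other than HTTP error responses are not caught and raise an exception.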

Is your issue caused by a limitation in the container sandbox?

If your container runs locally but fails in Cloud Run, the Cloud Run container sandbox might be responsible for the failure of your container.

In the Stackdriver Logging section of the GCP Console (not in the "Logs" tab of the Cloud Run section), you can look for "Container Sandbox Limitations" with a "DEBUG" severity in the varlog/system logs.

For example:

Container Sandbox Limitation: Unsupported syscall setsockopt(0x3,0x1,0x6,0xc0000753d0,0x4,0x0)

If you suspect that these limitations are responsible for the failure of your container, contact Support and add the log message to the support ticket. Support might ask you to trace the system calls made by your service in order to diagnose lower-level system calls that are not surfaced in the Stackdriver logs.
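
When collecting these messages for a support ticket, it can help to extract the system call name from each entry. The sketch below assumes that every message follows the format of the single example above, which is not guaranteed; `parse_sandbox_limitation` is a hypothetical helper.

```python
import re

# Pattern inferred from the one example message shown above (an assumption).
SANDBOX_RE = re.compile(
    r"Container Sandbox Limitation: Unsupported syscall "
    r"(?P<name>\w+)\((?P<args>[^)]*)\)"
)


def parse_sandbox_limitation(message: str):
    """Return (syscall_name, [args]) for a matching message, else None."""
    match = SANDBOX_RE.search(message)
    if match is None:
        return None
    args = [a.strip() for a in match.group("args").split(",") if a.strip()]
    return match.group("name"), args
```

Applied to the example message, this returns `("setsockopt", ["0x3", "0x1", "0x6", "0xc0000753d0", "0x4", "0x0"])`.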
