This page lists key requirements and behaviors of containers in Cloud Run.
Supported languages and images
Your container image can run code written in the programming language of your choice and use any base image, provided that it respects the constraints listed in this page.
Executables in the container image must be compiled for Linux 64-bit. Cloud Run specifically supports the Linux x86_64 ABI format.
Listening for requests on the correct port
The container must listen for requests on 0.0.0.0 on the port to which requests are sent. By default, requests are sent to 8080, but you can configure Cloud Run to send requests to the port of your choice. Inside Cloud Run container instances, the value of the PORT environment variable always reflects the port to which requests are sent. It defaults to 8080.
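The port requirement above can be sketched with Python's standard library. This is a minimal illustration, not a recommended production server: the names `resolve_port`, `HelloHandler`, and `make_server` are hypothetical, and the fallback to 8080 simply mirrors the default described in this section.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def resolve_port(environ=os.environ):
    """Return the port named by the PORT variable, falling back to 8080."""
    return int(environ.get("PORT", "8080"))


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"Hello from the container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(environ=os.environ):
    """Create a server bound to 0.0.0.0 (all interfaces) on the resolved port."""
    # Binding to 127.0.0.1 instead would make the container unreachable
    # from outside its own network namespace.
    return ThreadingHTTPServer(("0.0.0.0", resolve_port(environ)), HelloHandler)
```

A container entrypoint would then call `make_server().serve_forever()`. The server speaks plain HTTP, which is consistent with TLS being terminated before requests reach the container.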
Transport layer encryption (TLS)
The container should not implement any transport layer security directly. TLS is terminated by Cloud Run for HTTPS and gRPC, and then requests are proxied as HTTP or gRPC to the container without TLS.
Your container instance must send a response within the time specified in the request timeout setting after it receives a request, including the container instance startup time. Otherwise the request is ended and a 504 error is returned.
A container instance is terminated if it returns more than
5xx sequential responses.
The following environment variables are automatically added to the running containers:
||The port your HTTP server should listen on.||
||The name of the Cloud Run service being run.||
||The name of the Cloud Run revision being run.||
||The name of the Cloud Run configuration that created the revision.||
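Reading these variables at startup might look like the sketch below. Only `PORT` is named in the text above; the names `K_SERVICE`, `K_REVISION`, and `K_CONFIGURATION` are assumptions used for illustration and should be checked against the current Cloud Run documentation. `RuntimeInfo` and `read_runtime_info` are hypothetical helpers.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeInfo:
    port: int
    service: str
    revision: str
    configuration: str


def read_runtime_info(environ=os.environ):
    """Collect the automatically injected variables into one object."""
    # ASSUMPTION: the K_* names below are not shown in the table above.
    return RuntimeInfo(
        port=int(environ.get("PORT", "8080")),
        service=environ.get("K_SERVICE", "unknown"),
        revision=environ.get("K_REVISION", "unknown"),
        configuration=environ.get("K_CONFIGURATION", "unknown"),
    )
```

Falling back to `"unknown"` keeps the same code usable when it runs outside Cloud Run, for example during local development.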
The filesystem of your container is writable and is subject to the following behavior:
- This is an in-memory filesystem, so writing to it uses the container instance's memory.
- Data written to the filesystem does not persist when the container instance is stopped.
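Because files count against the instance's memory, one reasonable habit is to delete scratch files as soon as they are no longer needed. The following is a minimal sketch of that idea; `scratch_file` is a hypothetical helper, not part of any Cloud Run library.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def scratch_file(suffix=""):
    """Yield the path of a temporary file and delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # callers reopen the path themselves
    try:
        yield path
    finally:
        # Removing the file releases the memory that backed it.
        if os.path.exists(path):
            os.remove(path)
```

Used as `with scratch_file() as path: ...`, the file is removed even if the body raises, so a failed request does not leave data occupying memory.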
Container instance lifecycle
In response to incoming requests, a service is automatically scaled to a certain number of container instances, each of which runs the deployed container image.
When a revision does not receive any traffic, it is scaled in to the minimum number of container instances configured (zero by default).
Your container instances must listen for requests within 4 minutes after being started. During this startup time, container instances are allocated CPU.
Computation is scoped to a request
After startup, you should only expect to be able to do computation within the scope of a request: a container instance does not have any CPU allocated if it is not processing a request.
A container instance can be shut down at any time.
When a container instance needs to be shut down, new incoming requests are
routed to other instances and requests currently being processed are given time
to complete. The container instance then receives a SIGTERM signal
indicating the start of a 10 second period
before being shut down (with a SIGKILL signal).
During this period, the container instance is allocated CPU and billed. If the container instance does not catch the SIGTERM signal, it is immediately shut down.
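Catching SIGTERM so that the shutdown period can be used for cleanup could be sketched as follows. This is an illustration under stated assumptions: `shutdown_requested`, `on_shutdown`, and `run_cleanup` are hypothetical names, and the cleanup work itself is expected to finish well within the 10 second period described above.

```python
import signal
import threading

# Set once SIGTERM has been received; other code can poll or wait on it.
shutdown_requested = threading.Event()
_cleanup_callbacks = []


def on_shutdown(callback):
    """Register a zero-argument callable to run during shutdown."""
    _cleanup_callbacks.append(callback)


def handle_sigterm(signum, frame):
    """Signal handler: only record that shutdown has begun."""
    shutdown_requested.set()


def install_sigterm_handler():
    """Attach the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_cleanup():
    """Run registered callbacks in order and return how many ran."""
    ran = 0
    for callback in _cleanup_callbacks:
        callback()
        ran += 1
    return ran
```

A typical entrypoint would call `install_sigterm_handler()` before starting the server, wait on `shutdown_requested`, and then call `run_cleanup()` to flush logs or close connections. Keeping the handler itself tiny avoids doing complex work inside signal context.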
Unless a container instance must be kept idle due to the minimum number of container instances configuration setting, it will not be kept idle for longer than 15 minutes.
Container instance resources
Cloud Run allocates 1 vCPU per container instance by default, but this can be changed.
A vCPU is implemented as an abstraction of underlying hardware to provide the approximate equivalent CPU time of a single hardware hyper-thread on variable CPU platforms. The container instance may be executed on multiple cores simultaneously. The vCPU is only allocated during container instance startup and request processing; it is throttled otherwise. CPU throttling occurs independently of the minimum instances configuration: idle instances kept by that setting are not shut down, which avoids cold starts, but they do not have full access to CPU between requests.
To allocate a different vCPU value, refer to the documentation for allocating CPU.
Each Cloud Run container instance by default gets 256 MiB of memory. You can change this by configuring memory limits, up to a maximum of 8 GiB.
Typical uses of memory include:
- Code loaded into memory to run the service
- Writing to the filesystem
- Extra processes running in the container such as an nginx server
- In-memory caching systems such as the PHP OpCache
- Per request memory usage
Each Cloud Run container instance by default is set to multiple concurrency, where each container instance can receive more than one request at the same time. You can change this by setting concurrency.
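With more than one request in flight on the same instance, any state shared between request handlers needs to be safe for concurrent access. A minimal sketch of one way to do that in Python, using a hypothetical `RequestCounter` class guarded by a lock:

```python
import threading


class RequestCounter:
    """Counter that can be updated safely from concurrently handled requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        """Add one and return the new total."""
        with self._lock:
            self._count += 1
            return self._count

    @property
    def value(self):
        with self._lock:
            return self._count
```

If the code cannot be made safe for concurrent use, the alternative described above is to lower the concurrency setting so that each instance handles one request at a time.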
Container instance sandbox
The Cloud Run (fully managed) container instances are sandboxed using the gVisor container runtime sandbox. As documented in the syscall compatibility reference, some system calls might not be supported by this container sandbox.
Container instance metadata server
Cloud Run container instances expose a metadata server that you can use to retrieve details about your container instance, such as the project ID, region, instance ID or service accounts. It can also be used to generate tokens for the runtime service account.
You can access this data from the metadata server using simple HTTP requests to the
http://metadata.google.internal/ endpoint with the
header: no client libraries are required. For more information, see
The following table lists some of the available metadata server information:
||Project ID of this Cloud Run service|
||Region of this Cloud Run service|
||Unique identifier of the container instance (also available in logs).|
||Generates a token for the runtime service account of this Cloud Run service|
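A request to the metadata server could be sketched with the standard library alone, in line with the note that no client libraries are required. The header name and the example path are not shown in the text above: `Metadata-Flavor: Google` and `/computeMetadata/v1/project/project-id` are assumptions here and should be confirmed against the metadata server documentation. `build_metadata_request` and `fetch_metadata` are hypothetical helpers.

```python
import urllib.request

METADATA_BASE = "http://metadata.google.internal"


def build_metadata_request(path):
    """Build a request for the given metadata path."""
    # ASSUMPTION: the required header is "Metadata-Flavor: Google".
    return urllib.request.Request(
        METADATA_BASE + path,
        headers={"Metadata-Flavor": "Google"},
    )


def fetch_metadata(path, timeout=2.0):
    """Return the metadata value at path as text.

    Only works where the metadata server is reachable, such as inside
    a running container instance.
    """
    request = build_metadata_request(path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_metadata("/computeMetadata/v1/project/project-id")` would be expected to return the project ID when run inside a container instance, assuming that path is correct.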
Note that Cloud Run (fully managed) does not provide details about which
Google Cloud zone the container instances
are running in. As a consequence, the metadata attribute