Cloud Run Quotas and Limits

This page describes the usage quotas and limits that apply when using Cloud Run.

The number of Cloud Run resources that you can create is limited. Cloud Run quotas also include API rate limits, which cap the rate at which you can call the Cloud Run Admin API.

There is no direct limit for:

The size of container images you can deploy.
The number of concurrent requests served by a Cloud Run service.

Resource limits for Cloud Run

To go beyond limits that can be increased, request a quota increase. To go beyond per-project limits that cannot be increased, create new resources in a different Google Cloud project or region.
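
As a rough worked example of spreading resources across projects or regions, the sketch below computes how many (project, region) pairs a fleet would need. It assumes the non-increasable limit of 1000 services per project and region listed in the table that follows; the function name `min_project_region_pairs` is hypothetical and not part of any Cloud Run tooling.

```python
def min_project_region_pairs(resource_count, limit_per_pair=1000):
    """Smallest number of (project, region) pairs that can hold resource_count.

    limit_per_pair defaults to 1000, the per-project-and-region limit for
    services and for jobs in the table on this page.
    """
    if resource_count < 0 or limit_per_pair <= 0:
        raise ValueError("resource_count must be >= 0 and limit_per_pair > 0")
    # Integer ceiling division: round up without using floating point.
    return (resource_count + limit_per_pair - 1) // limit_per_pair


if __name__ == "__main__":
    for count in (1000, 1001, 2500):
        pairs = min_project_region_pairs(count)
        print(f"{count} services need at least {pairs} project/region pair(s)")
```

Exactly 1000 services still fit in one pair; the count of pairs only grows once the limit is exceeded.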

Resource	Scope	Description	Limit	Can be increased
Service	per project and region	Maximum number of services	1000	No
Job	per project and region	Maximum number of jobs	1000	No
Job execution	per project and region	Maximum number of running Job executions	1000	No
Revision	per service	Maximum number of revisions per service. When the limit is reached, non-serving revisions are automatically deleted in historical order.	1000	No
Revision	per project and region	Maximum number of revisions serving traffic	4000	No
Revision tag	per project and region	Maximum number of revision tags. When the revision tag limit is exceeded, Cloud Run executes tag cleanup on the service. For the service for which a new tag is being created, tags that don't have a specified traffic percentage are automatically deleted in historical order.	2000	No
Job execution	per job	Retention limit for completed job executions. When the number of completed executions for a job reaches this limit, executions are automatically deleted in historical order	1,000	No
Job execution task¹	per job execution	Maximum number of tasks running in parallel	Depends on selected region and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. You can view your quota in the Quotas and system limits console page.	Yes
Job execution task	per job execution	Maximum tasks timeout value	168 hours (7 days)²	No
Job execution task	per job execution	Maximum number of tasks in a single job	10,000	No
Job execution task	per job execution	Maximum number of task retries in a job	10	No
Environment variables	per job or per service	Maximum number of environment variables for each container	1000	No
Command arguments	per job or per service	Maximum number of command arguments for each container	1000	No
Container instance¹	per revision	Maximum number of container instances	Depends on selected region, and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. You can view your quota in the Quotas and system limits console page.	Yes
Container instance¹	per project and region	Maximum number of container instances of all running job executions	Depends on selected region, and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. You can view your quota in the Quotas and system limits console page.	Yes
Container instance³	per revision and region	Maximum number of container instances using Direct VPC egress	100-200, depending on selected region configurations.	Yes
Container instance	per container instance	Startup timeout, in minutes	4	No
Memory	per container instance	Maximum memory size, in GiB	32	No
CPU	per container instance	Maximum number of vCPU	8	No
CPU	per project and region	Maximum total CPU, in milli vCPU, allocated across all instances over a 1 minute period.	Depends on selected region. This limit might be greater in high-capacity regions or lower in recently opened regions.	Yes
Memory	per project and region	Maximum total memory, in bytes, allocated across all instances over a 1 minute period.	Depends on selected region. This limit might be greater in high-capacity regions or lower in recently opened regions. You can view your quota in the Quotas and system limits console page.	Yes
GPU instance⁴	per project per region	Maximum number of container instances with GPU	0⁴	Yes
Disk	per container instance	Maximum writable, in-memory filesystem, limited by instance memory, in GiB	32	No
Environment variable	per variable	Maximum variable length, in KB	32	No
Domain mapping	per top domain and per week	Maximum number of SSL certificates	50	No
Domain mapping	per top domain and per week	Maximum number of duplicate SSL certificates	5	No
Access token	per container instance per second	Maximum number of unique access tokens generated	50	No
Identity token	per container instance per second	Maximum number of unique identity tokens generated	50	No
Files opened	per container instance	Maximum number of files that can be opened at the same time. Corresponds to `/proc/sys/fs/file-max`.	25000	No

¹This regional quota is used in a few cases:

It controls the maximum value that can be picked for the maximum instance attribute of a revision. Once it is granted in a given region, all revisions in that region can go up to the granted limit.

It controls the maximum parallelism of a job. Once it is granted in a given region, all jobs in that region can go up to the granted limit.

It controls the total container instances for running job executions in a region.

²Support for timeouts greater than 24 hours is available in Preview.

³This regional quota is used in a few cases:

It controls the maximum value that can be picked for the maximum instance attribute of a revision using Direct VPC egress. Once it is granted in a given region, all revisions in that region can go up to the granted limit.

It controls the maximum parallelism of a job using Direct VPC egress. Once it is granted in a given region, all jobs in that region can go up to the granted limit.

⁴To use GPUs, you must request a quota increase for Total Nvidia L4 GPU allocation, per project per region. If your peak GPU usage is not close to your quota, Google might decrease your quota.
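
Several limits in the table above are fixed values that cannot be increased, so a planned configuration can be checked against them before deployment. The following is a minimal sketch of such a check. The `config` dict and its key names are hypothetical (they are not Cloud Run API fields), and only the numeric limits come from the table.

```python
# Fixed limits copied from the table above (rows marked "Can be increased: No").
# Task timeouts above 24 hours are available in Preview only.
LIMITS = {
    "memory_gib": 32,
    "vcpu": 8,
    "env_vars": 1000,
    "command_args": 1000,
    "tasks": 10_000,
    "task_retries": 10,
    "task_timeout_hours": 168,
}


def find_violations(config):
    """Return one message per setting in config that exceeds a fixed limit.

    config is a hypothetical plain dict; keys missing from it are not checked.
    """
    violations = []
    for key, limit in LIMITS.items():
        value = config.get(key)
        if value is not None and value > limit:
            violations.append(f"{key}={value} exceeds the limit of {limit}")
    return violations


if __name__ == "__main__":
    planned = {"memory_gib": 64, "vcpu": 4, "tasks": 10_000}
    for message in find_violations(planned):
        print(message)
```

Region-dependent quotas, such as the maximum number of container instances, are deliberately left out because their values must be read from the Quotas and system limits console page.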

Networking limits for Cloud Run

Request limits for Cloud Run

Description	Limit	Notes
Maximum number of concurrent requests per instance	1000
Maximum number of concurrent streams per HTTP/2 client connection	100
Maximum time before timeout per request	60 minutes
Maximum HTTP/1 request size	32 MiB if using HTTP/1 server. No limit if using HTTP/2 server.
Maximum HTTP/1 response size	32 MiB if not using `Transfer-Encoding: chunked` or streaming mechanisms
Outbound connections per second per instance	700	Doesn't apply to Direct VPC egress traffic sent to the VPC network, which isn't limited.
Outbound DNS resolutions per second per instance	1000
Inbound requests per second to an HTTP/1 container port per instance	800	Doesn't apply to HTTP/2 container ports.
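
The 32 MiB HTTP/1 response limit in the table above does not apply when the response uses `Transfer-Encoding: chunked` or another streaming mechanism. The sketch below shows one way an application might decide when to stream and how to split a payload into pieces. The helper names `needs_streaming` and `iter_chunks` and the 64 KiB chunk size are assumptions for illustration; how the chunks are actually written depends on your web framework.

```python
HTTP1_MAX_BYTES = 32 * 1024 * 1024  # 32 MiB, from the request limits table


def needs_streaming(body_size_bytes):
    """True if a body is too large for a single non-streamed HTTP/1 response."""
    return body_size_bytes > HTTP1_MAX_BYTES


def iter_chunks(payload, chunk_size=64 * 1024):
    """Yield payload (bytes) in pieces of at most chunk_size bytes."""
    for start in range(0, len(payload), chunk_size):
        yield payload[start:start + chunk_size]


if __name__ == "__main__":
    body = b"x" * (HTTP1_MAX_BYTES + 1)
    if needs_streaming(len(body)):
        chunk_count = sum(1 for _ in iter_chunks(body))
        print(f"stream the response in {chunk_count} chunks")
```

Many frameworks accept a generator such as `iter_chunks(body)` as a response body and send it with chunked transfer encoding, but check the documentation of the framework you use.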

Bandwidth limits for Cloud Run

The following bandwidth limits apply to Cloud Run instances:

Description	Limit	Notes
Maximum bits per instance for egress over Direct VPC	1 Gbps	Egress over Direct VPC egress to destinations on the VPC network.
Maximum bits per instance, excluding egress over Direct VPC	600 Mbps	Based on the sum of ingress and egress bits, excluding egress over Direct VPC egress to destinations on the VPC network.
Maximum total packet rate per instance, excluding egress over Direct VPC	64,000 packets per second	Based on the sum of ingress packets and egress packets, excluding egress over Direct VPC egress to destinations on the VPC network.

If any of these limits is reached, the bandwidth of the Cloud Run instance is limited.
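
To illustrate how the bandwidth limits above combine, the following sketch compares measured per-instance traffic against each one. It assumes decimal (SI) units, so 1 Gbps is taken as 1,000,000,000 bits per second. The function name and the idea of feeding it measurements from your own monitoring are hypothetical.

```python
# Per-instance limits from the bandwidth table, assuming decimal (SI) units.
VPC_EGRESS_MAX_BPS = 1_000_000_000   # 1 Gbps, egress over Direct VPC
OTHER_MAX_BPS = 600_000_000          # 600 Mbps, ingress + egress excluding Direct VPC
MAX_PACKETS_PER_SECOND = 64_000      # ingress + egress packets excluding Direct VPC


def exceeded_limits(vpc_egress_bps, other_bps, packets_per_second):
    """Return the names of the per-instance limits that the measurements exceed."""
    exceeded = []
    if vpc_egress_bps > VPC_EGRESS_MAX_BPS:
        exceeded.append("direct_vpc_egress_bits")
    if other_bps > OTHER_MAX_BPS:
        exceeded.append("other_bits")
    if packets_per_second > MAX_PACKETS_PER_SECOND:
        exceeded.append("packets")
    return exceeded


if __name__ == "__main__":
    # Example measurements for one instance (made-up numbers).
    print(exceeded_limits(vpc_egress_bps=200_000_000,
                          other_bps=650_000_000,
                          packets_per_second=30_000))
```

Note that `other_bps` and `packets_per_second` are sums of ingress and egress, which matches how the table defines those two limits.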

Cloud Run Admin API rate limits

The following rate limits apply to the Cloud Run Admin API. They do not apply to the requests reaching your deployed Cloud Run services.

Quota	Description	Limit	Can be increased	Scope
Cloud Run Admin API read requests	The number of API reads per 60 seconds per region. This is not the number of read requests to your Cloud Run services, which is not limited.	3,000 per 60 seconds	Yes	per project and region
Cloud Run Admin API write requests	The number of API writes per 60 seconds per region. This is not the number of write requests to your Cloud Run services, which is not limited.	180 per 60 seconds	Yes	per project and region
Job Run	Maximum number of times a job can be executed per 60 seconds per region.	180 per 60 seconds	Yes	per project and region
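
Automation that calls the Cloud Run Admin API in a loop can stay under these rate limits by throttling itself on the client side. The following is a minimal sketch of a sliding-window limiter. It is an illustration rather than how Cloud Run enforces the quota, the class name `SlidingWindowLimiter` is hypothetical, and the default values (180 writes and 3,000 reads per 60 seconds) are the ones from the table above, which may differ from your granted quota.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard that allows at most max_calls per period seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock       # injectable so the limiter can be tested
        self._stamps = deque()   # timestamps of calls still inside the window

    def try_acquire(self):
        """Return True and record the call if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False


# Defaults taken from the table above; adjust them to your granted quota.
write_limiter = SlidingWindowLimiter(max_calls=180, period=60.0)
read_limiter = SlidingWindowLimiter(max_calls=3000, period=60.0)

if __name__ == "__main__":
    if write_limiter.try_acquire():
        print("OK to send one Admin API write request")
    else:
        print("Write budget used up; wait before retrying")
```

Because the quotas are scoped per project and region, a script that works across several regions would keep one limiter per region. A limiter like this only covers calls made by the process that owns it, so other tools using the same project still count against the same quota.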

How to increase quota

Cloud Quotas adjustment requests are subject to review. If your quota adjustment request requires review, you receive an email acknowledging receipt of your request. If you need further assistance, respond to the email. After your request is reviewed, you receive an email notification indicating whether it was approved.

Console

To adjust a quota value, follow these steps:

1. In the Google Cloud console, go to the Quotas & System Limits page.
2. Find the quota value that you want to update in the Quota column and select the checkbox next to it. If needed, use the Filter search box to search for your quota.
3. Click Edit. The Quota changes dialog appears.
4. Enter the quota value that you want in the New value field. Some quota values have a unit of measurement; if this applies, select the unit that you want in the adjacent list. Click Done.
5. Optional: If you see a checkbox with the text I understand that this request will remove any overrides, adjusting the quota value to a number equal to or greater than the default removes the previous quota override. If this is what you want, select the checkbox and proceed.
6. To increase your quota value beyond the number indicated in the dialog, select Apply for higher quota.
   1. In the Quota changes form, enter the updated quota value that you want in the New value field. If a Request description field appears, enter a description. Click Done.
   2. If a Next button appears, click Next and fill out your contact details in the screen that follows.
7. Click Submit request.

If you find that you can't request an adjustment from the console, request the increase from Cloud Customer Care.

To learn more about how the quota increase process works, see About quota adjustments.

Batching requests for higher quota values

You can batch requests for higher quota by selecting the checkbox next to each quota that you want to include. However, batching requests can increase the amount of time it takes for Google Cloud to review your request.

To reduce review time, group quota adjustment requests by product and area. For example, if you want to request adjustments to networking and Compute Engine VM quotas, create one request for the networking quotas and another request for the Compute Engine VM quotas.