This page lists the usage quotas and limits that apply when you use Cloud Run.
The number of Cloud Run resources is limited. Quotas for Cloud Run encompass API rate limits, which affect the rate at which you can call the Cloud Run Admin API.
There is no direct limit for:
- The size of container images you can deploy.
- The number of concurrent requests served by a Cloud Run service.
Resource limits for Cloud Run
To go beyond a limit that can be increased, request a quota increase. To go beyond a per-project limit that cannot be increased, create new resources in a different Google Cloud project or region.
Resource | Scope | Description | Limit | Can be increased |
---|---|---|---|---|
Service | per project and region | Maximum number of services | 1000 | No |
Job | per project and region | Maximum number of jobs | 1000 | No |
Job execution | per project and region | Maximum number of running Job executions | 1000 | No |
Revision | per service | Maximum number of revisions per service. When the limit is reached, non-serving revisions are automatically deleted in historical order. | 1000 | No |
Revision | per project and region | Maximum number of revisions serving traffic | 4000 | No |
Revision tag | per project and region | Maximum number of revision tags. When the revision tag limit is exceeded, Cloud Run executes tag cleanup on the service. For the service for which a new tag is being created, tags that don't have a specified traffic percentage are automatically deleted in historical order. | 2000 | No |
Job execution | per job | Retention limit for completed job executions. When the number of completed executions for a job reaches this limit, executions are automatically deleted in historical order | 1,000 | No |
Job execution task¹ | per job execution | Maximum number of tasks running in parallel | Depends on selected region and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes |
Job execution task | per job execution | Maximum task timeout value | 168 hours (7 days)² | No |
Job execution task | per job execution | Maximum number of tasks in a single job | 10,000 | No |
Job execution task | per job execution | Maximum number of task retries in a job | 10 | No |
Environment variables | per job or per service | Maximum number of environment variables for each container | 1000 | No |
Command arguments | per job or per service | Maximum number of command arguments for each container | 1000 | No |
Container instance¹ | per revision | Maximum number of container instances | Depends on selected region and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes |
Container instance¹ | per project and region | Maximum number of container instances of all running job executions | Depends on selected region and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes |
Container instance³ | per revision and region | Maximum number of container instances using Direct VPC egress | 100-200, depending on selected region configurations. | Yes |
Container instance | per container instance | Startup timeout, in minutes | 4 | No |
Memory | per container instance | Maximum memory size, in GiB | 32 | No |
CPU | per container instance | Maximum number of vCPU | 8 | No |
CPU | per project and region | Maximum total CPU, in milli vCPU, allocated across all instances over a 1 minute period. | Depends on selected region. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes |
Memory | per project and region | Maximum total memory, in bytes, allocated across all instances over a 1 minute period. | Depends on selected region. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes |
GPU instance⁴ | per project per region | Maximum number of container instances with GPU | 0⁴ | Yes |
Disk | per container instance | Maximum writable, in-memory filesystem, limited by instance memory, in GiB | 32 | No |
Environment variable | per variable | Maximum variable length, in Kb | 32 | No |
Domain mapping | per top domain and per week | Maximum number of SSL certificates | 50 | No |
Domain mapping | per top domain and per week | Maximum number of duplicate SSL certificates | 5 | No |
Access token | per container instance per second | Maximum number of unique access tokens generated | 50 | No |
Identity token | per container instance per second | Maximum number of unique identity tokens generated | 50 | No |
Files opened | per container instance | Maximum number of files that can be opened at the same time. Corresponds to /proc/sys/fs/file-max. | 25000 | No |
¹ This regional quota is used in a few cases:
- It controls the maximum value that can be picked for the maximum instance attribute of a revision. Once it is granted in a given region, all revisions in that region can go up to the granted limit.
- It controls the maximum parallelism of a job. Once it is granted in a given region, all jobs in that region can go up to the granted limit.
- It controls the total container instances for running job executions in a region.
² Support for timeouts greater than 24 hours is available in Preview.
³ This regional quota is used in a few cases:
- It controls the maximum value that can be picked for the maximum instance attribute of a revision using Direct VPC egress. Once it is granted in a given region, all revisions in that region can go up to the granted limit.
- It controls the maximum parallelism of a job using Direct VPC egress. Once it is granted in a given region, all jobs in that region can go up to the granted limit.
⁴ To access GPU, you must request a quota increase for Total Nvidia L4 GPU allocation, per project per region.
If your peak GPU usage is not close to your quota, Google might decrease your quota.
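The fixed limits in the table above can be checked before you deploy. The following is a minimal sketch, not part of any Cloud Run client library: `JobSpec`, its default values, and `find_violations` are hypothetical names introduced for illustration, and only limits marked as not increasable in the table are encoded.

```python
from dataclasses import dataclass, field

# Fixed limits copied from the resource limits table above.
MAX_MEMORY_GIB = 32
MAX_VCPU = 8
MAX_ENV_VARS = 1000
MAX_COMMAND_ARGS = 1000
MAX_TASKS_PER_JOB = 10_000
MAX_TASK_RETRIES = 10
MAX_TASK_TIMEOUT_HOURS = 168


@dataclass
class JobSpec:
    """Hypothetical job settings. The defaults are placeholders,
    not Cloud Run defaults."""
    memory_gib: float = 0.5
    vcpu: float = 1
    env_vars: dict = field(default_factory=dict)
    args: list = field(default_factory=list)
    task_count: int = 1
    max_retries: int = 3
    task_timeout_hours: float = 1


def find_violations(spec):
    """Return one message per fixed limit that the spec exceeds."""
    checks = [
        ("memory (GiB)", spec.memory_gib, MAX_MEMORY_GIB),
        ("vCPU", spec.vcpu, MAX_VCPU),
        ("environment variables", len(spec.env_vars), MAX_ENV_VARS),
        ("command arguments", len(spec.args), MAX_COMMAND_ARGS),
        ("tasks per job", spec.task_count, MAX_TASKS_PER_JOB),
        ("task retries", spec.max_retries, MAX_TASK_RETRIES),
        ("task timeout (hours)", spec.task_timeout_hours, MAX_TASK_TIMEOUT_HOURS),
    ]
    return [
        f"{name}: {value} exceeds limit {limit}"
        for name, value, limit in checks
        if value > limit
    ]


if __name__ == "__main__":
    for message in find_violations(JobSpec(memory_gib=64, task_count=20_000)):
        print(message)
```

Region-dependent quotas, such as the maximum number of container instances, are deliberately left out because their values are not stated on this page.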
Networking limits for Cloud Run
Request limits for Cloud Run
Description | Limit | Notes |
---|---|---|
Maximum number of concurrent requests per instance | 1000 | |
Maximum number of concurrent streams per HTTP/2 client connection | 100 | |
Maximum time before timeout per request | 60 minutes | |
Maximum HTTP/1 request size | 32 MiB if using HTTP/1 server. No limit if using HTTP/2 server. | |
Maximum HTTP/1 response size | 32 MiB if not using Transfer-Encoding: chunked or streaming mechanisms | |
Outbound connections per second per instance | 700 | |
Outbound DNS resolutions per second per instance | 1000 | |
Inbound requests per second to an HTTP/1 container port per instance | 800 | Doesn't apply to HTTP/2 container ports. |
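Two of the limits above put a floor on how many instances a given load requires: the per-instance concurrency cap and the per-instance inbound request rate on HTTP/1 ports. The sketch below estimates that floor. It is a rough lower bound under stated assumptions: the function name and parameters are hypothetical, the number of in-flight requests is approximated as request rate multiplied by average latency, and real autoscaling also depends on factors such as CPU utilization that are not modeled here.

```python
import math

# Limits copied from the request limits table above.
MAX_CONCURRENCY_PER_INSTANCE = 1000
MAX_HTTP1_INBOUND_RPS_PER_INSTANCE = 800


def min_instances_needed(rps, avg_latency_s, concurrency=1000, http2=False):
    """Lower bound on instances for a steady request rate.

    rps: inbound requests per second across the service.
    avg_latency_s: average time a request stays in flight, in seconds.
    concurrency: configured maximum concurrent requests per instance.
    http2: True if the container port speaks HTTP/2, in which case the
        HTTP/1 inbound rate limit does not apply.
    """
    if not 1 <= concurrency <= MAX_CONCURRENCY_PER_INSTANCE:
        raise ValueError("concurrency must be between 1 and 1000")

    # Approximate number of requests in flight at any moment.
    in_flight = rps * avg_latency_s
    by_concurrency = math.ceil(in_flight / concurrency)

    # The inbound rate limit only applies to HTTP/1 container ports.
    by_rate = 0 if http2 else math.ceil(rps / MAX_HTTP1_INBOUND_RPS_PER_INSTANCE)

    return max(1, by_concurrency, by_rate)


if __name__ == "__main__":
    # Slow requests: the concurrency setting dominates.
    print(min_instances_needed(4000, 0.5, concurrency=80))
    # Fast requests on HTTP/1: the inbound rate limit dominates.
    print(min_instances_needed(4000, 0.01))
```

For short requests on an HTTP/1 port, the inbound rate limit can be the binding constraint even when the concurrency setting is far from saturated.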
Bandwidth limits for Cloud Run
The following bandwidth limits apply to Cloud Run instances:
Description | Limit | Notes |
---|---|---|
Maximum bits per instance for egress over Direct VPC | 1 Gbps | Egress over Direct VPC egress to destinations on the VPC network. |
Maximum bits per instance, excluding egress over Direct VPC | 600 Mbps | Based on the sum of ingress and egress bits, excluding egress over Direct VPC egress to destinations on the VPC network. |
Maximum total packet rate per instance, excluding egress over Direct VPC | 64,000 packets per second | Based on the sum of ingress packets and egress packets, excluding egress over Direct VPC egress to destinations on the VPC network. |
If any of these limits is reached, the Cloud Run instance's bandwidth is limited.
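For traffic that does not go over Direct VPC egress, the bit rate limit and the packet rate limit interact: with small packets the packet rate is reached first, and with large packets the bit rate is. The following sketch works through that arithmetic. The function name is hypothetical, and it assumes that 600 Mbps means 600,000,000 bits per second and that the supplied figures already sum ingress and egress.

```python
# Limits copied from the bandwidth table above (non-Direct-VPC traffic).
MAX_BITS_PER_SECOND = 600_000_000  # assumes decimal megabits
MAX_PACKETS_PER_SECOND = 64_000


def exceeded_limits(packets_per_second, avg_packet_bytes):
    """Return the names of the per-instance limits that a load exceeds.

    packets_per_second: ingress plus egress packets per second.
    avg_packet_bytes: average packet size in bytes.
    """
    bits_per_second = packets_per_second * avg_packet_bytes * 8
    exceeded = []
    if bits_per_second > MAX_BITS_PER_SECOND:
        exceeded.append("bit rate")
    if packets_per_second > MAX_PACKETS_PER_SECOND:
        exceeded.append("packet rate")
    return exceeded


# Average packet size at which both limits are reached together.
BREAK_EVEN_PACKET_BYTES = MAX_BITS_PER_SECOND / (MAX_PACKETS_PER_SECOND * 8)

if __name__ == "__main__":
    print(exceeded_limits(64_000, 1500))  # large packets
    print(exceeded_limits(70_000, 100))   # small packets
    print(BREAK_EVEN_PACKET_BYTES)
```

Under these assumptions the break-even point is about 1,172 bytes: workloads with smaller average packets run into the packet rate limit before the bit rate limit.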
Cloud Run Admin API rate limits
The following rate limits apply to the Cloud Run Admin API. They do not apply to the requests reaching your deployed Cloud Run services.
Quota | Description | Limit | Can be increased | Scope |
---|---|---|---|---|
Cloud Run Admin API read requests | The number of API reads per 60 seconds per region. This is not the number of read requests to your Cloud Run services, which is not limited. | 3,000 per 60 seconds | Yes | per project and region |
Cloud Run Admin API write requests | The number of API writes per 60 seconds per region. This is not the number of write requests to your Cloud Run services, which is not limited. | 180 per 60 seconds | Yes | per project and region |
Job Run | Maximum number of times a job can be executed per 60 seconds per region. | 180 per 60 seconds | Yes | per project and region |
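Automation that calls the Cloud Run Admin API in a loop can stay under these quotas by throttling itself on the client side. The sketch below is one way to do that with a sliding 60-second window. The class is hypothetical and does not call any Google API, and because this page does not describe how the server measures its 60-second window, the client-side window is an assumption and may not line up exactly with the server's accounting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls in any window_s-second span (client side)."""

    def __init__(self, max_calls, window_s=60.0, clock=time.monotonic):
        self._max_calls = max_calls
        self._window_s = window_s
        self._clock = clock      # injectable so the limiter can be tested
        self._calls = deque()    # timestamps of calls still inside the window

    def try_acquire(self):
        """Record a call and return True if it fits in the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._window_s:
            self._calls.popleft()
        if len(self._calls) < self._max_calls:
            self._calls.append(now)
            return True
        return False


# Limits copied from the rate limits table above (per project and region).
write_limiter = SlidingWindowLimiter(max_calls=180)
read_limiter = SlidingWindowLimiter(max_calls=3000)

if __name__ == "__main__":
    if write_limiter.try_acquire():
        print("OK to send one Admin API write request")
    else:
        print("Write quota window is full; wait before retrying")
```

Because the quotas are scoped per project and region, a single limiter instance is only accurate if all callers in that project and region share it. Separate processes would each need a smaller budget.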
How to increase quota
To request a higher quota value, follow these steps:
1. Go to the Quotas & System Limits page.
2. Find the quota you want to increase in the Quota column. You can use the Filter search box to search for your quota.
3. Select the checkbox next to the quota that you want to increase.
4. Click Edit. The Quota changes dialog appears.
5. Optional: If you want to increase your quota value beyond the number indicated on the screen, select Apply for higher quota. Fill out the form, submit, and skip the remaining steps.
6. In the Quota changes form, enter the increased quota that you want in the New value field. If a Request description field appears, enter a description. Click Done.
7. If you see a checkbox with the text "I understand that this request will remove any overrides," your quota value is set below the default. Adjusting the quota value to or beyond the default removes the override. To proceed, select the checkbox. Learn more about quota overrides.
8. If a Next button appears, click Next and fill out your contact details in the screen that follows.
9. Click Submit request.
If you find that you can't request an adjustment from the console, request the increase from Cloud Customer Care.
Cloud Quotas adjustment requests are subject to review. If your quota adjustment request requires review, you receive an email acknowledging receipt of your request. If you need further assistance, respond to the email. After reviewing your request, you receive an email notification indicating whether your request was approved.
To find out more about how the quota increase process works, see About quota increase requests.
Batching requests for higher quota values
You can batch requests for higher quota by selecting the checkbox next to each quota that you want to include. Batching requests can increase the amount of time it takes for Google Cloud to review your request. To reduce review time, group quota adjustment requests by product and area. For example, if you want to request adjustments to networking and Compute Engine VM quotas, create one request for the networking quotas and another request for the Compute Engine VM quotas.