This page describes the usage quotas and limits that apply when using Cloud Run.
The number of Cloud Run resources is limited. Quotas for Cloud Run encompass API rate limits, which affect the rate at which you can call the Cloud Run Admin API.
There is no direct limit for:
- The size of container images you can deploy.
- The number of concurrent requests served by a Cloud Run service.
Resource limits for Cloud Run
Resource | Description | Limit | Can be increased | Scope |
---|---|---|---|---|
Service | Maximum number of services | 1000 | No | per project and region |
Job | Maximum number of jobs | 1000 | No | per project and region |
Job execution | Maximum number of running Job executions | 1000 | No | per project and region |
Revision | Maximum number of revisions per service. When this limit is reached, non-serving revisions are automatically deleted in historical order. | 1000 | No | per service |
Revision | Maximum number of revisions serving traffic | 4000 | No | per project and region |
Revision tag | Maximum number of revision tags. When the number of revision tags reaches this limit, tags that don't have a traffic percentage are automatically deleted in historical order. | 2000 | No | per project and region |
Job execution | Retention limit for completed job executions. When the number of completed executions for a job reaches this limit, executions are automatically deleted in historical order | 1,000 | No | per job |
Job execution task [1] | Maximum number of tasks running in parallel | Depends on selected region and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes | per job execution |
Job execution task | Maximum task timeout value | 24 hours | No | per job execution |
Job execution task | Maximum number of tasks in a single job | 10,000 | No | per job execution |
Job execution task | Maximum number of task retries in a job | 10 | No | per job execution |
Container instance [1] | Maximum number of container instances | Depends on selected region and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes | per revision |
Container instance [1] | Maximum number of container instances of all running job executions | Depends on selected region and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes | per project and region |
Container instance [2] | Maximum number of container instances using Direct VPC egress | 100-200, depending on selected region configurations. | Yes | per revision and region |
Container instance | Startup timeout, in minutes | 4 | No | per container instance |
Memory | Maximum memory size, in GiB | 32 | No | per container instance |
CPU | Maximum number of vCPU | 8 | No | per container instance |
Disk | Maximum writable, in-memory filesystem, limited by instance memory, in GiB | 32 | No | per container instance |
Environment variable | Maximum variable length, in KB | 32 | No | per variable |
Domain mapping | Maximum number of SSL certificates | 50 | No | per top domain and per week |
Domain mapping | Maximum number of duplicate SSL certificates | 5 | No | per top domain and per week |
Access token | Maximum number of unique access tokens generated | 50 | No | per container instance per second |
Identity token | Maximum number of unique identity tokens generated | 50 | No | per container instance per second |
[1] This regional quota is used in the following cases:
- It controls the maximum value that can be set for the maximum instances attribute of a revision. Once an increase is granted in a given region, all revisions in that region can go up to the granted limit.
- It controls the maximum parallelism of a job. Once an increase is granted in a given region, all jobs in that region can go up to the granted limit.
- It controls the total number of container instances for running job executions in a region.
[2] This regional quota is used in the following cases:
- It controls the maximum value that can be set for the maximum instances attribute of a revision using Direct VPC egress. Once an increase is granted in a given region, all revisions in that region can go up to the granted limit.
- It controls the maximum parallelism of a job using Direct VPC egress. Once an increase is granted in a given region, all jobs in that region can go up to the granted limit.
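The fixed values in the resource limits table can be checked before a deployment is attempted. The following is a minimal sketch of such a pre-flight check, not part of any Cloud Run tooling: `JobConfig` and `find_violations` are hypothetical names introduced for illustration, and the sketch assumes that the 32 KB environment variable limit means 32 * 1024 bytes measured on the UTF-8 encoded value.

```python
from dataclasses import dataclass

# Fixed limits copied from the resource limits table.
MAX_MEMORY_GIB = 32
MAX_VCPU = 8
MAX_ENV_VAR_BYTES = 32 * 1024          # assumption: 32 KB = 32 * 1024 bytes
MAX_TASKS_PER_JOB = 10_000
MAX_TASK_RETRIES = 10
MAX_TASK_TIMEOUT_SECONDS = 24 * 60 * 60  # 24 hours


@dataclass
class JobConfig:
    """Hypothetical holder for the settings being checked."""
    memory_gib: float
    vcpu: float
    env: dict[str, str]
    tasks: int = 1
    max_retries: int = 3
    task_timeout_seconds: int = 600


def find_violations(cfg: JobConfig) -> list[str]:
    """Return one message per setting that exceeds a documented limit."""
    problems = []
    if cfg.memory_gib > MAX_MEMORY_GIB:
        problems.append(f"memory {cfg.memory_gib} GiB exceeds {MAX_MEMORY_GIB} GiB")
    if cfg.vcpu > MAX_VCPU:
        problems.append(f"{cfg.vcpu} vCPU exceeds {MAX_VCPU} vCPU")
    if cfg.tasks > MAX_TASKS_PER_JOB:
        problems.append(f"{cfg.tasks} tasks exceeds {MAX_TASKS_PER_JOB}")
    if cfg.max_retries > MAX_TASK_RETRIES:
        problems.append(f"{cfg.max_retries} retries exceeds {MAX_TASK_RETRIES}")
    if cfg.task_timeout_seconds > MAX_TASK_TIMEOUT_SECONDS:
        problems.append("task timeout exceeds 24 hours")
    for name, value in cfg.env.items():
        # Assumption: the limit applies to the encoded value of each variable.
        if len(value.encode("utf-8")) > MAX_ENV_VAR_BYTES:
            problems.append(f"environment variable {name} exceeds 32 KB")
    return problems
```

An empty list means that none of the checked settings exceeds its limit. Region-dependent quotas, such as the maximum number of container instances, are not covered because their values are not fixed.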
Networking limits for Cloud Run
Request limits for Cloud Run
Description | Limit | Notes |
---|---|---|
Maximum number of concurrent requests per instance | 1000 | |
Maximum number of concurrent streams per HTTP/2 client connection | 100 | |
Maximum time before timeout per request | 60 minutes | |
Maximum HTTP/1 request size | 32 MiB if using HTTP/1 server. No limit if using HTTP/2 server. | |
Maximum HTTP/1 response size | 32 MiB if not using Transfer-Encoding: chunked or streaming mechanisms. | |
Outbound connections per second per instance | 700 | |
Inbound requests per second to an HTTP/1 container port per instance | 800 | Doesn't apply to HTTP/2 container ports. |
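The 32 MiB HTTP/1 response limit applies only when the response is not sent with chunked transfer encoding or another streaming mechanism. As an illustration, the sketch below decides whether a payload is over the limit and splits it into chunks that a web framework's streaming response could consume. The names `needs_streaming` and `iter_chunks` and the 1 MiB default chunk size are assumptions made for this example, not Cloud Run APIs.

```python
from collections.abc import Iterator

# Limit from the request limits table: 32 MiB for non-streamed HTTP/1 responses.
MAX_HTTP1_BODY_BYTES = 32 * 1024 * 1024


def needs_streaming(response_bytes: int) -> bool:
    """True if a response of this size exceeds the non-streamed HTTP/1 limit."""
    return response_bytes > MAX_HTTP1_BODY_BYTES


def iter_chunks(payload: bytes, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield the payload in pieces of at most chunk_size bytes."""
    for start in range(0, len(payload), chunk_size):
        yield payload[start:start + chunk_size]
```

How the generator is handed to the server depends on the framework in use. Most Python web frameworks accept an iterator of byte strings as a streamed response body.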
Bandwidth limits for Cloud Run
The following bandwidth limits apply to Cloud Run instances:
Description | Limit | Notes |
---|---|---|
Maximum bytes per instance for egress over Direct VPC | 1 Gbps | Egress over Direct VPC egress to destinations on the VPC network. |
Maximum bytes per instance, excluding egress over Direct VPC | 75 megabytes per second (MBps) | Based on the sum of ingress bytes and egress bytes, excluding egress over Direct VPC egress to destinations on the VPC network. |
Maximum total packet rate per instance, excluding egress over Direct VPC | 64,000 packets per second | Based on the sum of ingress packets and egress packets, excluding egress over Direct VPC egress to destinations on the VPC network. |
If any of these limits is reached, the bandwidth of the Cloud Run instance is limited.
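Which of the two non-VPC limits is reached first depends on the average packet size. The following sketch works through that arithmetic under two assumptions that the table does not state: that 75 MBps means 75,000,000 bytes per second, and that all packets have the same size. The function names are hypothetical.

```python
MAX_BYTES_PER_SECOND = 75_000_000   # 75 MBps, assuming 1 MB = 10**6 bytes
MAX_PACKETS_PER_SECOND = 64_000     # sum of ingress and egress packets


def max_throughput_bytes(avg_packet_bytes: float) -> float:
    """Highest byte rate that stays within both limits for this packet size."""
    return min(MAX_BYTES_PER_SECOND, MAX_PACKETS_PER_SECOND * avg_packet_bytes)


def limiting_factor(avg_packet_bytes: float) -> str:
    """Name the limit that a stream of equally sized packets reaches first."""
    packets_at_byte_limit = MAX_BYTES_PER_SECOND / avg_packet_bytes
    if packets_at_byte_limit > MAX_PACKETS_PER_SECOND:
        return "packet rate"
    return "bandwidth"
```

Under these assumptions the crossover is at 75,000,000 / 64,000 ≈ 1,172 bytes per packet. Traffic made of smaller packets reaches the packet rate limit first, and traffic made of larger packets reaches the byte limit first.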
Cloud Run Admin API rate limits
The following rate limits apply to the Cloud Run Admin API. They do not apply to the requests reaching your deployed Cloud Run services.
Quota | Description | Limit | Can be increased | Scope |
---|---|---|---|---|
Cloud Run Admin API read requests | The number of API reads per 60 seconds per region. This is not the number of read requests to your Cloud Run services, which is not limited. | 3,000 per 60 seconds | Yes | per project and region |
Cloud Run Admin API write requests | The number of API writes per 60 seconds per region. This is not the number of write requests to your Cloud Run services, which is not limited. | 180 per 60 seconds | Yes | per project and region |
Job Run | Maximum number of times a job can be executed per 60 seconds per region. | 180 per 60 seconds | Yes | per project and region |
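Automation that calls the Cloud Run Admin API in a loop can stay under these rate limits by throttling itself on the client side. The sketch below is one possible approach, a sliding-window limiter. It assumes the default limits from the table (for example, 180 writes per 60 seconds) and treats the 60-second window as sliding, which is a conservative guess because the table does not say how the window is measured. `SlidingWindowLimiter` is a hypothetical helper, not part of any Google client library.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds interval."""

    def __init__(self, max_calls, window_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock       # injectable so the limiter can be tested
        self._calls = deque()     # timestamps of calls still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()

    def try_acquire(self):
        """Record a call and return True if it fits, otherwise return False."""
        now = self._clock()
        self._evict(now)
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False

    def seconds_until_available(self):
        """Seconds to wait before the next call would be allowed."""
        now = self._clock()
        self._evict(now)
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._calls[0])
```

A caller would create one limiter per quota, project, and region, for example `SlidingWindowLimiter(max_calls=180)` for write requests, and sleep for `seconds_until_available()` whenever `try_acquire()` returns `False`. If a quota increase has been granted, `max_calls` should be set to the granted value.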
How to increase quota
To request a higher quota value using the Google Cloud console:
Go to the Quotas & System Limits page in the Google Cloud console.
Find the quota you want to increase in the Quota column. Use the Filter search box to search for your quota.
Select the checkbox to the left of your quota.
Click EDIT. The Quota changes form displays.
In the Quota changes form, enter the increased quota that you want for your project in the New limit field.
Complete any additional fields in the form, and then click Done.
Click Submit request.
While the previous procedure applies to most quota increase requests, you might encounter one of the following exceptions:
- If a usage cap exists, it must be removed before the quota can be increased. You are presented with a disclosure and agreement stating that the usage cap will be deleted and the limit will be updated to the default limit immediately. Google Cloud then processes your quota increase request normally.
- Sometimes, the Google Cloud console redirects you to a separate form to request an increased limit. After you submit the form, Google Cloud acknowledges your request by email.
- Some quotas cannot be updated using the Google Cloud console. If you find that you cannot change a quota from the console, request the increase from Cloud Customer Care. The Billing team does not handle quota adjustments.
Google recommends that you create a different quota increase request for each class of resources. For example, you should separate the per-project network quota increases from the non-networking Compute Engine quota increases. If different classes of requests are combined, one increase request can delay the batch if its approval requires more evaluation time.
If your quota increase request requires approval, you can expect to receive an email from Google Cloud acknowledging receipt of your request. If you need further assistance, you can respond to the email. Cloud Customer Care typically processes your request within 2-3 business days and then sends you a second email notifying you whether the quota increase was approved or denied. The email provides the effective date of the increase, if applicable.
To find out more about how the quota increase process works, see About quota increase requests.