This page lists the usage quotas and limits that apply when using Cloud Run.
Cloud Run limits the number of resources you can create. Cloud Run quotas also include API rate limits, which cap how often you can call the Cloud Run Admin API.
There is no direct limit for:
- The size of container images you can deploy.
- The number of concurrent requests served by a Cloud Run service.
Resource limits for Cloud Run
Resource | Description | Limit | Can be increased | Scope |
---|---|---|---|---|
Service | Maximum number of services | 1,000 | No | per project and region |
Job | Maximum number of jobs | 1,000 | No | per project and region |
Revision | Maximum number of revisions per service. When the limit is reached, non-serving revisions are automatically deleted in historical order | 1,000 | No | per service |
Revision | Maximum number of revisions serving traffic | 4,000 | No | per project and region |
Revision tag | Maximum number of revision tags | 2,000 | No | per project and region |
Job execution | Retention limit for completed job executions. When the number of completed executions for a job reaches this limit, executions are automatically deleted in historical order | 1,000 | No | per job |
Job execution task¹ | Maximum number of tasks running in parallel | 200 when using 1 CPU and 2 GiB memory, depends on CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes | per job execution |
Job execution task | Maximum task timeout value | 1 hour | No | per job execution |
Job execution task | Maximum number of tasks in a single job | 10,000 | No | per job execution |
Job execution task | Maximum number of task retries in a job | 10 | No | per job execution |
Container instance¹ | Maximum number of container instances | Depends on selected region and CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes | per revision |
Container instance¹ | Maximum number of container instances of all running job executions | 200 when using 1 CPU and 2 GiB memory, depends on CPU and memory configurations. This limit might be greater in high-capacity regions or lower in recently opened regions. | Yes | per project and region |
Container instance | Startup timeout, in minutes | 4 | No | per container instance |
Container instance | Outbound connections per second | 700 | No | per container instance |
Container instance | Inbound requests per second to an HTTP/1 container port (doesn't apply to HTTP/2 container ports) | 800 | No | per container instance |
Memory | Maximum memory size, in GiB | 32 | No | per container instance |
CPU | Maximum number of vCPU | 8 | No | per container instance |
Disk | Maximum writable, in-memory filesystem, limited by instance memory, in GiB | 32 | No | per container instance |
Environment variable | Maximum variable length, in KB | 32 | No | per variable |
Request | Maximum number of concurrent requests | 1,000 | No | per container instance |
Request | Maximum number of concurrent streams | 100 | No | per HTTP/2 client connection |
Request | Maximum time before timeout, in minutes | 60 | No | per request |
Request | Maximum HTTP/1 request size, in MiB | 32 if using HTTP/1 server. No limit if using HTTP/2 server. | No | per request |
Response | Maximum HTTP/1 response size, in MiB | 32 if not using Transfer-Encoding: chunked or streaming mechanisms | No | per response |
Domain mapping | Maximum number of SSL certificates | 50 | No | per top domain and per week |
Domain mapping | Maximum number of duplicate SSL certificates | 5 | No | per top domain and per week |
Access token | Maximum number of unique access tokens generated | 50 | No | per container instance per second |
Identity token | Maximum number of unique identity tokens generated | 50 | No | per container instance per second |
¹ This regional quota is used in a few cases:
- It controls the maximum value that can be set for the maximum instances attribute of a revision. Once the quota is granted in a given region, all revisions in that region can go up to the granted limit.
- It controls the maximum parallelism of a job. Once the quota is granted in a given region, all jobs in that region can go up to the granted limit.
- It controls the total number of container instances for running job executions in a region.
NOTE: When this quota is used for jobs, it is first divided by 5.
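As an illustration only, the following Python sketch checks a set of planned settings against a few of the per-instance limits from the table above and applies the divide-by-5 rule from the note. The `InstanceConfig` class and both function names are hypothetical helpers, not part of any Cloud Run API, and rounding fractional results down is an assumption that this page does not confirm.

```python
from dataclasses import dataclass

# Limits copied from the resource table above.
MAX_MEMORY_GIB = 32
MAX_VCPU = 8
MAX_CONCURRENCY = 1000
MAX_REQUEST_TIMEOUT_MIN = 60
JOB_QUOTA_DIVISOR = 5  # from the note: the quota is divided by 5 for jobs


@dataclass
class InstanceConfig:
    """Hypothetical settings holder; not a Cloud Run API type."""
    memory_gib: float
    vcpu: float
    concurrency: int
    request_timeout_min: float


def find_violations(cfg):
    """Return one message per setting that exceeds its documented limit."""
    checks = [
        ("memory_gib", cfg.memory_gib, MAX_MEMORY_GIB),
        ("vcpu", cfg.vcpu, MAX_VCPU),
        ("concurrency", cfg.concurrency, MAX_CONCURRENCY),
        ("request_timeout_min", cfg.request_timeout_min, MAX_REQUEST_TIMEOUT_MIN),
    ]
    return [
        f"{name}={value} exceeds limit {limit}"
        for name, value, limit in checks
        if value > limit
    ]


def job_instance_limit(regional_quota):
    """Apply the divide-by-5 rule to a granted regional instance quota.

    Assumption: a fractional result rounds down.
    """
    return regional_quota // JOB_QUOTA_DIVISOR
```

For example, `find_violations(InstanceConfig(64, 4, 2000, 5))` reports the memory and concurrency settings, and `job_instance_limit(1000)` evaluates to 200.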
Cloud Run Admin API rate limits
The following rate limits apply to the Cloud Run Admin API. They do not apply to the requests reaching your deployed Cloud Run services.
Quota | Description | Limit | Can be increased | Scope |
---|---|---|---|---|
Cloud Run Admin API read requests | The number of API reads per 60 seconds per project. This is not the number of read requests to your Cloud Run services, which is not limited. | 1,200 per 60 seconds | Yes | Regional |
Cloud Run Admin API write requests | The number of API writes per 60 seconds per project. This is not the number of write requests to your Cloud Run services, which is not limited. | 60 per 60 seconds | Yes | Regional |
Job Run | Maximum number of times a job can be executed per minute per region | 10 | Yes | per project and region |
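One way to stay under these limits from a script that calls the Admin API is to throttle on the client side. The sketch below is a generic sliding-window limiter in Python; the class name and its parameters are illustrative assumptions, it does not call any Cloud Run API, and it only approximates the quota because this page does not describe how the server measures its 60-second window.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any rolling window of period_s seconds."""

    def __init__(self, max_calls, period_s=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period_s = period_s
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._calls = deque()    # timestamps of recent calls, oldest first

    def _evict(self, now):
        # Drop timestamps that have left the rolling window.
        while self._calls and now - self._calls[0] >= self.period_s:
            self._calls.popleft()

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            self._evict(now)
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period_s - (now - self._calls[0]))
```

A caller might create one limiter per project and region for writes, for example `SlidingWindowLimiter(60)` for the default of 60 writes per 60 seconds, and invoke `acquire()` before each write request. Reads would use a separate limiter sized to the read limit.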
How to increase quota
To increase quotas above the defaults listed on this page:
1. Go to the Cloud Run Quotas page.
2. Select the quotas you want to modify for the applicable regions and click EDIT QUOTAS.
3. If prompted, provide your user information, and enter the new quota limit for each quota you selected.
Your request will be routed to the support team to ensure Cloud Run can handle your use case in the selected region. You may be asked to provide details about your configuration and expected traffic patterns before the request is granted. Large increase requests may take some time to process.