Cloud Quotas overview

Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.

The Cloud Quotas system does the following:

  • Monitors your consumption of Google Cloud products and services
  • Restricts your consumption of those resources
  • Provides a means to request changes to the quota value

In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.

Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.
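The project-level scoping described above can be pictured with a small toy model. This is an illustrative sketch only, not how Cloud Quotas is implemented: the class `ProjectQuotaTracker`, the exception `QuotaExceededError`, the metric name `load_balancers`, and the project IDs are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class QuotaExceededError(Exception):
    """Raised when a request would push usage past the quota value."""


class ProjectQuotaTracker:
    """Toy model: usage is tracked separately for each (project, metric) pair."""

    def __init__(self, limits):
        # limits maps a hypothetical metric name to its quota value.
        self.limits = dict(limits)
        self.usage = defaultdict(int)

    def consume(self, project_id, metric, amount=1):
        key = (project_id, metric)
        # Block the request if it would exceed the quota, so the task fails.
        if self.usage[key] + amount > self.limits[metric]:
            raise QuotaExceededError(f"{metric} quota exceeded for {project_id}")
        self.usage[key] += amount

    def remaining(self, project_id, metric):
        return self.limits[metric] - self.usage[(project_id, metric)]


tracker = ProjectQuotaTracker({"load_balancers": 2})
tracker.consume("project-a", "load_balancers", 2)
print(tracker.remaining("project-a", "load_balancers"))  # 0
print(tracker.remaining("project-b", "load_balancers"))  # 2
```

In this sketch, exhausting the quota in `project-a` leaves the full quota available in `project-b`, and a further `consume` call for `project-a` raises an error instead of succeeding.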

Many services also have system limits that are unrelated to the quota system. System limits are fixed constraints, such as maximum file sizes or database schema limitations, which cannot be increased or decreased.

To learn about the quotas and system limits for a product, see the product's quotas and limits page—for example, Cloud Storage quotas and limits.

Types of quotas

Google Cloud has three types of quotas:

  • Rate quotas are typically used for limiting the number of requests that you can make to an API or service. Rate quotas reset after a time interval that is specific to the service—for example, the number of API requests per day.
  • Allocation quotas are used to restrict the use of resources that don't have a rate of usage, for example, the number of VM instances used by your project at a given time. Allocation quotas don't reset over time. Instead, the resources that count against them must be explicitly released when you no longer want to use them, for example, by deleting a GKE cluster.
  • Concurrent quotas are used to restrict the total number of concurrent operations in flight at any given time. These are usually long-running operations. For example, Compute Engine uses insert operations that are expected to last as long as one hour.
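The difference between the three quota types can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual enforcement logic: the class names are hypothetical, the rate quota uses a simple fixed-window counter (a real service may count differently), and time is passed in explicitly as `now` so the example is deterministic.

```python
class RateQuota:
    """Counts requests in a fixed window; the count resets when the window ends."""

    def __init__(self, limit, interval_seconds):
        self.limit = limit
        self.interval = interval_seconds
        self.window_start = 0.0
        self.count = 0

    def try_request(self, now):
        # Start a fresh window once the interval has elapsed.
        if now - self.window_start >= self.interval:
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


class AllocationQuota:
    """Counts resources held; usage drops only when a resource is released."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def try_allocate(self):
        if self.in_use >= self.limit:
            return False
        self.in_use += 1
        return True

    def release(self):
        self.in_use = max(0, self.in_use - 1)


class ConcurrentQuota:
    """Caps the number of operations that are in flight at the same time."""

    def __init__(self, limit):
        self.limit = limit
        self.in_flight = set()

    def try_start(self, operation_id):
        if len(self.in_flight) >= self.limit:
            return False
        self.in_flight.add(operation_id)
        return True

    def finish(self, operation_id):
        self.in_flight.discard(operation_id)


rate = RateQuota(limit=2, interval_seconds=60)
print([rate.try_request(now=t) for t in (0, 1, 2, 61)])  # [True, True, False, True]

vms = AllocationQuota(limit=1)
print(vms.try_allocate(), vms.try_allocate())  # True False
vms.release()                                  # for example, deleting the VM
print(vms.try_allocate())                      # True
```

The rate quota recovers on its own once the interval passes, whereas the allocation quota stays exhausted until something is released, and the concurrent quota frees capacity only when an in-flight operation finishes.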

Within these categories, some quotas are global and apply to your usage of the resource anywhere in Google Cloud. Others, for allocation quotas only, are regional or zonal and apply to your usage of the resource in a specific Google Cloud region or zone. For example, there are separate limits for how many Compute Engine VM instances you can create in each Google Cloud region.
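One way to picture global versus regional scope is to key each limit by both a metric and a location. The sketch below is illustrative only: the class `ScopedQuotas`, the metric names, and the limit values are hypothetical, and the region names are used purely as example labels.

```python
from collections import Counter


class ScopedQuotas:
    """Toy model: each limit applies to one (metric, location) pair."""

    def __init__(self, limits):
        # limits maps (metric, location) to a quota value; the location is
        # either "global" or a region name.
        self.limits = dict(limits)
        self.usage = Counter()

    def try_allocate(self, metric, location="global"):
        key = (metric, location)
        if key not in self.limits:
            raise KeyError(f"no quota defined for {key}")
        if self.usage[key] >= self.limits[key]:
            return False
        self.usage[key] += 1
        return True


quotas = ScopedQuotas({
    ("networks", "global"): 5,
    ("vm_instances", "us-central1"): 1,
    ("vm_instances", "europe-west1"): 1,
})
print(quotas.try_allocate("vm_instances", "us-central1"))   # True
print(quotas.try_allocate("vm_instances", "us-central1"))   # False
print(quotas.try_allocate("vm_instances", "europe-west1"))  # True
```

Here the regional limit for `us-central1` is exhausted after one allocation, while the separate limit for `europe-west1` is unaffected.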

Quotas are enforced for each project, except for the number of projects that you can create, which is enforced for each user account and billing account.

What's next