Cloud Quotas overview

Google Cloud uses quotas to restrict how much of a particular shared Google Cloud resource you can use. Each quota represents a specific countable resource, such as API calls to a particular service, the number of load balancers used concurrently by your project, or the number of projects that you can create.

This page describes how to work with quotas in your projects: how to find and modify your existing quota limits, how to request higher quotas, and how to monitor quota usage.

Many services also have limits that are unrelated to the quota system. Limits are fixed constraints, such as maximum file sizes or database schema limitations, which cannot be increased or decreased. You can find out about limits on the relevant Quotas and limits page for your service, for example, Cloud Storage quotas and limits.

About quotas

Before learning how to monitor and manage your quota, it's useful to understand the basics of how the Google Cloud quota system works. This section introduces some key quota concepts: quota types, quota limits, and quota increase requests.

Why do we use quotas?

Quotas are enforced for many reasons, including:

  • To protect the community of Google Cloud users by preventing unforeseen spikes in usage and overloaded services.
  • To help you manage resources. For example, you can set your own limits on service usage while developing and testing your applications to avoid unexpected bills from using expensive resources.

Types of quotas

Google Cloud has three types of quotas:

  • Rate quotas are typically used for limiting the number of requests that you can make to an API or service. Rate quotas reset after a time interval that is specific to the service—for example, the number of API requests per day.
  • Allocation quotas are used to restrict the use of resources that don't have a rate of usage, such as the number of VMs used by your project at a given time. Allocation quotas don't reset over time. Instead, the resources that count against them must be explicitly released when you no longer want to use them, for example, by deleting a GKE cluster.
  • Concurrent quotas are used to restrict the total number of concurrent operations in flight at any given time. These are usually long-running operations. For example, Compute Engine uses insert_operations that are expected to last as long as one hour.
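
The difference between the three quota types can be sketched with a toy model. The following Python is illustrative only: the class names `RateQuota`, `AllocationQuota`, and `ConcurrentQuota`, their methods, and the limit values are all hypothetical and are not part of any Google Cloud API. It only shows the behavior described above: rate usage resets when a time window elapses, allocation usage drops only on explicit release, and concurrent usage counts operations in flight.

```python
import time


class RateQuota:
    """Toy model: caps requests per fixed window; usage resets when the window ends."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def try_consume(self, amount=1):
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            # The interval has elapsed, so usage resets automatically.
            self.window_start = now
            self.used = 0
        if self.used + amount > self.limit:
            return False
        self.used += amount
        return True


class AllocationQuota:
    """Toy model: caps resources held at one time; usage never resets by itself."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def try_allocate(self, amount=1):
        if self.used + amount > self.limit:
            return False
        self.used += amount
        return True

    def release(self, amount=1):
        # Only an explicit release (for example, deleting the resource) frees quota.
        self.used = max(0, self.used - amount)


class ConcurrentQuota:
    """Toy model: caps the number of operations in flight at the same time."""

    def __init__(self, limit):
        self.limit = limit
        self.in_flight = 0

    def try_start(self):
        if self.in_flight >= self.limit:
            return False
        self.in_flight += 1
        return True

    def finish(self):
        # A slot frees up when a long-running operation completes.
        self.in_flight = max(0, self.in_flight - 1)


# Usage with a fake clock so the rate window can be advanced by hand.
fake_now = [0.0]
rate = RateQuota(limit=2, window_seconds=60, clock=lambda: fake_now[0])
first_window = [rate.try_consume() for _ in range(3)]  # third call is rejected
fake_now[0] = 60.0                                      # the window elapses
after_reset = rate.try_consume()                        # accepted again

vms = AllocationQuota(limit=1)
vms.try_allocate()            # holds the only slot
blocked = vms.try_allocate()  # rejected no matter how much time passes
vms.release()                 # explicit release frees the slot
```

In this sketch, waiting is enough to recover from a rejected rate request, while a rejected allocation succeeds only after something is released.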

Within these categories, some quotas are global and apply to your usage of the resource anywhere in Google Cloud. Others are regional or zonal and apply to your usage of the resource in a specific Google Cloud region or zone (for allocation quotas only). For example, there are separate limits for how many Compute Engine VM instances you can create in each Google Cloud region.

Quotas are enforced on a per-project basis—except for the number of projects that you can create, which is enforced per user account and billing account.
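
One way to picture the scoping rules above is as a lookup keyed by project, resource, and location. The following Python sketch is a simplified illustration, not how Google Cloud implements quota enforcement: the `ProjectQuotas` class, the `vm_instances` and `load_balancers` metric names, and the limit values are assumptions made up for the example.

```python
from collections import defaultdict

GLOBAL = "global"  # hypothetical marker for quotas that are not tied to a region


class ProjectQuotas:
    """Toy per-project quota table keyed by (metric, location)."""

    def __init__(self, project_id):
        self.project_id = project_id
        self.limits = {}
        self.usage = defaultdict(int)

    def set_limit(self, metric, limit, location=GLOBAL):
        self.limits[(metric, location)] = limit

    def try_allocate(self, metric, location=GLOBAL, amount=1):
        key = (metric, location)
        if key not in self.limits:
            raise KeyError(f"no quota defined for {key}")
        if self.usage[key] + amount > self.limits[key]:
            return False
        self.usage[key] += amount
        return True


# Each project has its own table, and each region has its own limit.
quotas = ProjectQuotas("example-project")
quotas.set_limit("vm_instances", 2, location="us-central1")
quotas.set_limit("vm_instances", 1, location="europe-west1")
quotas.set_limit("load_balancers", 1)  # global: one limit across all locations

us_results = [quotas.try_allocate("vm_instances", "us-central1") for _ in range(3)]
eu_result = quotas.try_allocate("vm_instances", "europe-west1")
```

Exhausting the limit in `us-central1` leaves the `europe-west1` limit untouched, which mirrors the separate per-region limits described above.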

What's next