Service Quota Model

This page describes the quota management model for services on Google Cloud. Understanding this model is helpful when you use the quota management features of the Service Usage API. For general information, see Working with Quotas.

A consumer of a service is a project, folder, or organization to which use of the service is attributed.

Metrics and limits

A quota metric is an entity defined by a service that accounts for consumption of the service. A quota metric could count concrete entities such as virtual machine instances, or it could count ephemeral entities such as API requests of a specific type.

A quota limit is an entity defined by a service that specifies how consumption of a quota metric is limited for a consumer. A quota metric may have multiple quota limits on it. There are two types of quota limits: rate limits and allocation limits. A rate limit resets after a specified time, such as a minute or a day. An allocation limit does not reset over time; instead, the quota it counts must be explicitly released when a resource is no longer used.
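The difference between the two limit types can be illustrated with a minimal Python sketch. The class and method names here are hypothetical and are not part of any Google Cloud API, and the fixed-window reset is an assumption made for illustration; this page does not specify how a service implements the reset.

```python
class RateLimit:
    """Illustrative rate limit: usage resets when a fixed time window elapses."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = 0.0
        self.used = 0

    def try_consume(self, now: float, amount: int = 1) -> bool:
        # Assumption: a simple fixed window. Start a fresh window once the
        # current one has fully elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used = 0
        if self.used + amount > self.limit:
            return False
        self.used += amount
        return True


class AllocationLimit:
    """Illustrative allocation limit: usage persists until explicitly released."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def try_allocate(self, amount: int = 1) -> bool:
        if self.used + amount > self.limit:
            return False
        self.used += amount
        return True

    def release(self, amount: int = 1) -> None:
        # Quota is returned only when the resource is given up.
        self.used = max(0, self.used - amount)
```

In this sketch, a RateLimit accepts requests again once the window rolls over, while an AllocationLimit stays exhausted no matter how much time passes until release is called.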

Quota overrides

Each quota limit has a default value for all consumers, set by the service owner. This default value can be changed by a quota override.

The service owner can apply a quota override to a specific consumer to replace the default value for that consumer. This is called a producer override. For example, a service owner could apply a producer override to grant elevated quota to a specific customer as part of a contract.

The consumer can apply a quota override to their own project, folder, or organization to cap their own usage of a service. This is called a consumer override. For example, a consumer could apply a consumer override to their own project as a cost control measure, to prevent budget overruns.

To apply a consumer override to a service that you consume, use the Service Usage API.

To apply a producer override to a consumer of a service you own, use the Service Consumer Management API.

Computing the quota limit

The following pseudocode computes a consumer's effective quota limit:

if adminOverride is present,
  upperBound = adminOverride
else if producerOverride is present,
  upperBound = producerOverride
else
  upperBound = defaultLimit

if consumerOverride is present,
  quotaLimit = min(consumerOverride, upperBound)
else
  quotaLimit = upperBound
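
The same computation can be written as a small Python function, shown here as a sketch rather than as the service's actual implementation. The function and parameter names are hypothetical and simply mirror the pseudocode; None stands for "no override present". The admin override appears in the pseudocode but is not described elsewhere on this page, so the sketch treats it as just another optional value.

```python
from typing import Optional


def effective_quota_limit(
    default_limit: int,
    producer_override: Optional[int] = None,
    admin_override: Optional[int] = None,
    consumer_override: Optional[int] = None,
) -> int:
    """Return the effective limit, following the pseudocode above."""
    # The upper bound comes from the admin override if present, otherwise
    # from the producer override, otherwise from the service's default.
    if admin_override is not None:
        upper_bound = admin_override
    elif producer_override is not None:
        upper_bound = producer_override
    else:
        upper_bound = default_limit

    # A consumer override can only lower the limit, never raise it.
    if consumer_override is not None:
        return min(consumer_override, upper_bound)
    return upper_bound
```

For example, with a default limit of 100 and a producer override of 500, a consumer override of 200 yields an effective limit of 200. With no producer override, the same consumer override of 200 has no effect, because the upper bound remains 100.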

Regional and zonal quota

A quota limit might be counted globally, or it might be counted separately in each Cloud region or Cloud zone.

For example, assume a service has a global quota limit of 100 API requests per minute. If 80 API requests are made in the us-central1 region and 70 requests are made in the asia-northeast3 region within the same minute, then 150 requests are counted against the global limit, and some requests are rejected. With a regional limit, however, each region has its own separate quota of 100 requests per minute. The 80 requests in us-central1 do not exceed the limit of 100 requests in that region, and the 70 requests in asia-northeast3 do not exceed the limit of 100 requests in that region either, so no requests are rejected.
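
The arithmetic in this example can be checked with a short Python sketch. The helper names are hypothetical, and the sketch assumes all requests fall within the same one-minute window.

```python
from typing import Dict

LIMIT = 100  # requests per minute, from the example above
requests_by_region = {"us-central1": 80, "asia-northeast3": 70}


def rejected_under_global_limit(requests: Dict[str, int], limit: int) -> int:
    # A global limit counts requests from every region against one total.
    total = sum(requests.values())
    return max(0, total - limit)


def rejected_under_regional_limit(requests: Dict[str, int], limit: int) -> Dict[str, int]:
    # A regional limit counts each region separately against its own quota.
    return {region: max(0, count - limit) for region, count in requests.items()}


print(rejected_under_global_limit(requests_by_region, LIMIT))
# 50
print(rejected_under_regional_limit(requests_by_region, LIMIT))
# {'us-central1': 0, 'asia-northeast3': 0}
```

Under the global limit, 50 of the 150 requests exceed the quota. Under the regional limit, neither region exceeds its own quota.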

A quota override can be applied to all regions at once, or to a specific region. When an override is applied to a specific region, the effective limits of other regions are unaffected.

Similarly, a quota override can be applied to all zones at once, or to a specific zone. When an override is applied to a specific zone, the effective limits of other zones are unaffected.