API rate limits

API rate limits define the number of requests that can be made to the Compute Engine API. API rate limits apply on a per-project basis. When you use gcloud compute or the Google Cloud Platform Console, you are also making requests to the API and these requests count towards your API rate limit. If you use service accounts to access the API, that also counts towards your rate limit.

Currently, projects are subject to the following API rate limit categories. Each category is counted separately, so you can reach the maximum in every category simultaneously. Rate limits are enforced over 100-second intervals. For example, a limit of 20 requests/second translates to 2,000 requests within 100 seconds. If you reach a limit at any point within those 100 seconds, you must wait for your quota bucket to refresh before you can make more requests.

  • Queries - Limits for all methods except *.get and *.list methods:
    • Rate per project: 20 requests/second
    • Rate per user: 20 requests/second
  • Read requests - Limits for *.get methods:
    • Rate per project: 20 requests/second
    • Rate per user: 20 requests/second
  • List requests - Limits for *.list methods:
    • Rate per project: 20 requests/second
    • Rate per user: 20 requests/second
  • Operation read requests - Limits for *OperationsService.Get methods:
    • Rate per project: 20 requests/second
    • Rate per user: 20 requests/second
  • Heavy-weight read requests - Limits for *.AggregatedList methods:
    • Rate per project: 10 requests/second
    • Rate per user: 10 requests/second
  • Heavy-weight mutation requests - Limits for patch, delete, and insert methods for the InterconnectsService and InterconnectAttachmentsService features:
    • Rate per project: 10 requests/second
    • Rate per user: 10 requests/second
  • Instance SimulateMaintenanceEvent requests - Limits for *.SimulateMaintenanceEvent methods:
    • Rate per project: 2 requests/second
    • Rate per user: 2 requests/second
  • License insert requests - Limits for *.LicensesService.Insert methods:
    • Rate per project: 2 requests/second
    • Rate per user: 2 requests/second
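
The per-second rates above can be converted into budgets for one 100-second enforcement interval. The following is a minimal Python sketch of that arithmetic. The dictionary keys and the window_budget helper are illustrative names chosen here, not identifiers from the Compute Engine API.

```python
# Per-project limits from the list above, in requests/second.
# The key names are illustrative, not official quota identifiers.
RATE_LIMITS_PER_SECOND = {
    "queries": 20,
    "read_requests": 20,
    "list_requests": 20,
    "operation_read_requests": 20,
    "heavy_weight_read_requests": 10,
    "heavy_weight_mutation_requests": 10,
    "simulate_maintenance_event_requests": 2,
    "license_insert_requests": 2,
}

WINDOW_SECONDS = 100  # enforcement interval described above


def window_budget(requests_per_second, window_seconds=WINDOW_SECONDS):
    """Return how many requests fit in one enforcement interval."""
    return requests_per_second * window_seconds


if __name__ == "__main__":
    for category, rate in RATE_LIMITS_PER_SECOND.items():
        print(f"{category}: {window_budget(rate)} requests per {WINDOW_SECONDS} seconds")
```

For instance, the 20 requests/second categories allow 2,000 requests per interval, while the 2 requests/second categories allow only 200.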

If you need a higher rate limit for API requests, you can request an increase via the Google Cloud Platform Console.

Best Practices

Here are some best practices to help you work with API rate limits on Compute Engine.

  • Use bursting sparingly and selectively. Bursting is the act of allowing a specific client to make many API requests in a short period of time. Usually, this is done in response to exceptional scenarios, such as when your application needs to handle more traffic than usual. Bursting consumes your API rate limit quickly, so use it only when necessary.

  • Use a client-side rate limiter. A client-side rate limiter sets an artificial limit so that a given client can use only a certain amount of quota. This prevents any one client from consuming all of your quota.

  • Use exponential backoff to progressively space out requests once you reach your quota. This gives the server time to refill your quota buckets.

  • Avoid short polling, where your clients continuously make requests to the server without waiting for a response. Bad requests count against your quota even when they do not return useful data, and short polling makes those requests more difficult to catch.

  • Split up your applications across multiple projects. Since quotas are applied on a per-project level, you can split up your applications so each application has its own dedicated quota pool.

  • If you receive a 403 error with the error message rateLimitExceeded, wait a few seconds and try your request again. Quota buckets are refilled every 100 seconds, so your request should succeed once that interval has passed.
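
Two of the practices above, a client-side rate limiter and exponential backoff, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: TokenBucket, call_with_backoff, and the RateLimitExceeded exception are hypothetical names introduced here, and a real client library may surface the 403 rateLimitExceeded error in a different way. The token bucket is one common way to build a client-side limiter, not necessarily the one Compute Engine itself uses.

```python
import random
import time


class RateLimitExceeded(Exception):
    """Hypothetical stand-in for a 403 rateLimitExceeded response."""


class TokenBucket:
    """Client-side limiter: about `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; return False instead of blocking."""
        now = self.clock()
        # Refill in proportion to the time elapsed, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def call_with_backoff(request, max_attempts=5, base_delay=1.0, max_delay=64.0,
                      sleep=time.sleep, jitter=random.random):
    """Retry `request` on RateLimitExceeded, doubling the wait each attempt."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitExceeded:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Waits of 1s, 2s, 4s, ... plus up to 1s of random jitter.
            delay = min(max_delay, base_delay * 2 ** attempt) + jitter()
            sleep(delay)
```

A client might call try_acquire() before each request and skip or defer the request when it returns False, then wrap the request itself in call_with_backoff so that any rate limit error that still occurs is retried with growing delays. The clock, sleep, and jitter parameters exist only so the behavior can be exercised without real waiting.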
