This document describes the recommended best practices for using the Compute Engine API and is intended for users who are already familiar with it. If you are a beginner, learn about the prerequisites and using the Compute Engine API.
Following these best practices can help you save time, prevent errors, and mitigate the effects of API rate limits.
Use client libraries
Client libraries are the recommended way of programmatically accessing the Compute Engine API. Client libraries provide code that lets you access the API through common programming languages, which can save you time and improve your code's performance.
Generate REST requests by using the Cloud Console
When creating a resource, generate the REST request using the resource creation pages or details pages in the Google Cloud Console. Using a generated REST request saves time and helps prevent syntax errors.
Learn how to generate REST requests.
Wait for operations to be done
Don't assume that an operation (any API request that changes a resource) is complete or successful. Instead, use a wait method for the Operation resource to verify that the operation is done. You don't need to verify a request that doesn't modify resources, such as a read request that uses the GET HTTP verb, because the API response already indicates whether the request was successful. Consequently, the Compute Engine API does not return Operation resources for these requests.
Whenever an API request is successfully initiated, it returns an HTTP 200 status code. A 200 indicates that the server received your API request successfully, but it doesn't indicate whether the requested operation has completed successfully. For example, you can receive a 200 even though the operation is not yet complete or has failed.
Any request to create, update, or delete a resource returns an Operation resource, which captures the status of that request. An operation is done when the status field of the Operation resource is DONE. To check the status, use the wait method that matches the scope of the returned Operation resource:
- For zonal operations, use zoneOperations.wait.
- For regional operations, use regionOperations.wait.
- For global operations, use globalOperations.wait.
The wait method returns when the operation is done or when the request is approaching the 2-minute deadline. When using the wait method, avoid short polling, which is when your clients continuously make requests to the server without waiting for a response. Using the wait method without short polling to check the status of your request, instead of using the get method for the Operation resource, helps preserve your API rate limits and reduces latency.
For more information about the wait method and examples of using it, see Handling API responses.
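The following is a minimal sketch of the waiting pattern described above, not the client library's own implementation. It assumes a hypothetical callable, wait_once, that issues one wait request for the operation and returns the Operation resource as a mapping with a status field and an optional error field. The names wait_until_done, OperationFailed, and max_attempts are made up for illustration.

```python
from typing import Callable, Mapping


class OperationFailed(Exception):
    """Raised when an operation is DONE but carries an error."""


def wait_until_done(
    wait_once: Callable[[], Mapping],
    max_attempts: int = 10,
) -> Mapping:
    """Repeat a blocking wait call until the operation reports DONE.

    wait_once is a hypothetical stand-in for one wait request. Each call
    is expected to block until the operation is done or the request nears
    its deadline, so this loop does not short poll.
    """
    for _ in range(max_attempts):
        operation = wait_once()
        if operation.get("status") == "DONE":
            # DONE means finished, not successful: check for an error.
            if operation.get("error"):
                raise OperationFailed(operation["error"])
            return operation
    raise TimeoutError(f"operation not DONE after {max_attempts} wait calls")


# Stand-in responses: still RUNNING after the first wait, DONE after the second.
fake_responses = iter([{"status": "RUNNING"}, {"status": "DONE", "name": "op-1"}])
finished = wait_until_done(lambda: next(fake_responses))
```

In real code, wait_once would wrap whichever wait method matches the scope of the returned operation. The loop only calls wait again because a single wait can return before the operation is done.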
Paginate list results
When using a list method (such as a *.list method, a *.aggregatedList method, or any other method that returns a list), paginate the results whenever possible to ensure that you read the entire response. If you don't paginate, you receive at most the first 500 elements, as determined by the maxResults query parameter.
For more information about pagination on Google Cloud, and for specific details and examples, see the reference documentation for the list method that you want to use.
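As a rough illustration of pagination, the sketch below keeps requesting pages until the response no longer carries a page token. It assumes a hypothetical callable, fetch_page, that takes a page token (None for the first page) and returns one response page as a mapping with an items list and an optional nextPageToken. The helper name list_all and the fake data are made up for illustration.

```python
from typing import Callable, Iterator, Mapping, Optional


def list_all(fetch_page: Callable[[Optional[str]], Mapping]) -> Iterator[Mapping]:
    """Yield every item across all pages of a list response.

    fetch_page is a hypothetical stand-in for one list request.
    """
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("items", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            # No token means this was the last page.
            return


# Fake two-page result set keyed by page token, for demonstration only.
FAKE_PAGES = {
    None: {"items": [{"name": "vm-1"}, {"name": "vm-2"}], "nextPageToken": "t1"},
    "t1": {"items": [{"name": "vm-3"}]},
}
all_names = [item["name"] for item in list_all(lambda token: FAKE_PAGES[token])]
```

Writing the helper as a generator lets callers stop early without fetching pages they don't need.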
Rely on error codes, not error messages
Google APIs must use the canonical error codes defined by google.rpc.Code, but error messages can change without notice. Error messages are generally intended for developers to read, not for programs to parse.
Learn more about API errors.
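The sketch below shows one way to branch on machine-readable fields rather than on message text. It assumes a JSON error body in which error.errors[] entries carry a reason string such as rateLimitExceeded. That shape and the helper names are assumptions for illustration, so check the actual error format returned by the method you call.

```python
from typing import Mapping, Set

# Reasons that this sketch treats as a signal to slow down.
BACK_OFF_REASONS = {"rateLimitExceeded"}


def error_reasons(body: Mapping) -> Set[str]:
    """Collect the reason strings from an error body, ignoring messages."""
    error = body.get("error") or {}
    return {e["reason"] for e in error.get("errors", []) if "reason" in e}


def should_back_off(body: Mapping) -> bool:
    """Decide from reasons only; message text never influences the result."""
    return bool(error_reasons(body) & BACK_OFF_REASONS)


# Assumed example body; the message wording is arbitrary on purpose.
sample_error = {
    "error": {
        "code": 403,
        "message": "Any human-readable text, which may change",
        "errors": [{"reason": "rateLimitExceeded", "message": "may change"}],
    }
}
back_off = should_back_off(sample_error)
```

Because the decision never inspects the message field, a reworded message does not change the program's behavior.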
Minimize client-side retries to preserve API rate limits
Minimize the number of client-side retries for a project to prevent
rateLimitExceeded errors and to maximize the utilization of your
API rate limits. The following practices
can help you preserve the API rate limits for your projects:
- Avoid short polling.
- Use bursting sparingly and selectively.
- Always make your calls in a retry loop with exponential backoff.
- Use a client-side rate limiter.
- Split your applications across multiple projects.
Avoid short polling
Avoid short polling, where your clients continuously make requests to the server without waiting for a response. Short polling makes bad requests harder to catch, and those requests count against your quota even if they return no useful data.
Instead of short polling, you should wait for operations to be done.
Use bursting sparingly and selectively
Use bursting sparingly and selectively. Bursting is the act of allowing a specific client to make many API requests in a short time. Usually, bursting is done in response to exceptional scenarios, such as when your application needs to handle more traffic than usual. Bursting burns through your API rate limit quickly, so use it only when necessary.
Learn more about batching requests.
Always make your calls in a retry loop with exponential backoff
Use exponential backoff to progressively space out requests when they time out or whenever you reach your API rate limit.
While you are waiting for an operation, your request might time out. For example, waiting on an operation times out after the default HTTP timeout (2 minutes) and then returns the current state of the operation, which might be DONE or still in progress. If you need to retry an operation after a timeout, your retry loop should use exponential backoff so that frequent retries don't overload your application or exceed your API rate limits. Otherwise, you risk negatively impacting all other systems in the same project.
When you reach the API rate limit, use exponential backoff to give the server time to refill your quota buckets. API rate limits are enforced in 100-second intervals. Whenever you exceed an API rate limit, you must wait for the interval to end so that your quota bucket refreshes before you can make more requests.
For an example of implementing exponential backoff, see the exponential backoff algorithm for the Identity and Access Management API.
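As a rough, self-contained illustration (not the algorithm from the linked page), a retry loop with exponential backoff and random jitter might look like the sketch below. RateLimitError is a hypothetical exception standing in for a rateLimitExceeded response, and the delay parameters are arbitrary example values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a rateLimitExceeded response."""


def call_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 64.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry call, doubling the wait after each rate-limit failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 1s of jitter
            # so that many clients don't retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, 1))
    raise AssertionError("unreachable")
```

The sleep parameter exists only so that the waiting can be replaced in tests. Production callers can leave it at the default.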
Use a client-side rate limiter
Use a client-side rate limiter. A client-side rate limiter sets an artificial limit so that the client in question can only use a certain amount of quota, which prevents any one client from consuming all your quota.
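One common way to build a client-side rate limiter is a token bucket. The sketch below is an illustrative assumption about how such a limiter could work, not a Compute Engine feature. The class name, the rate and capacity parameters, and the injectable clock are all made up for this example.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow at most `capacity` requests at once, refilled at `rate` per second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False instead of blocking."""
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A client would call try_acquire before each API request and skip or delay the request when it returns False. The capacity value bounds how large a burst the client can make, and the rate value bounds its sustained request rate.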
Split up your applications across multiple projects
Splitting up your applications across multiple projects can help minimize the number of requests for your quota buckets. Since quotas are applied on a per-project level, you can split up your applications so each application has its own dedicated quota bucket.
The following checklist summarizes the best practices for using the Compute Engine API.
- Use client libraries
- Generate REST requests by using the Cloud Console
- Wait for operations to be done
- Paginate list results
- Rely on error codes, not error messages
- Minimize client-side retries to preserve API rate limits
- Learn how to improve performance when using the Compute Engine API.