Request Rate and Access Distribution Guidelines

Google Cloud Storage is a highly scalable service that uses auto-scaling technology to achieve very high request rates. This page lays out guidelines for optimizing the scaling and performance that Cloud Storage provides.


Cloud Storage is a multi-tenant service, meaning that users share the same set of underlying resources. To make the best use of these shared resources, buckets have an initial IO capacity of around 1000 write requests per second and 5000 read requests per second, which averages out to roughly 2.5 PB written and 13 PB read per month for 1 MB objects. As the request rate for a given bucket grows, Cloud Storage automatically increases the IO capacity for that bucket by distributing the request load across multiple servers.
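The monthly volumes follow directly from the per-second rates. A quick check of the arithmetic, assuming a 30-day month and decimal units (1 PB = 10^9 MB); the function name is illustrative only:

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def monthly_petabytes(requests_per_second, object_size_mb=1):
    """Data volume per month, in PB, at a sustained request rate."""
    megabytes = requests_per_second * object_size_mb * SECONDS_PER_MONTH
    return megabytes / 1e9  # assumption: decimal units, 1 PB = 10^9 MB


print(monthly_petabytes(1000))  # writes: a little over 2.5 PB
print(monthly_petabytes(5000))  # reads: about 13 PB
```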

Load redistribution time

As a bucket approaches its IO capacity limit, Cloud Storage typically takes on the order of minutes to detect and accordingly redistribute the load across more servers. Consequently, if the request rate on your bucket increases faster than Cloud Storage can perform this redistribution, you may run into temporary limits, specifically higher latency and error rates. Ramping up the request rate gradually for your buckets, as described below, avoids such latency and errors.

Object key indexing

Cloud Storage supports consistent object listing, which enables users to run data processing workflows easily against Cloud Storage. In order to provide consistent object listing, Cloud Storage maintains an index of object keys for each bucket. This index is stored in lexicographical order and is updated whenever objects are written to or deleted from a bucket. Adding and deleting objects whose keys all exist in a small range of the index naturally increases the chances of contention.

Cloud Storage detects such contention and automatically redistributes the load on the affected index range across multiple servers. As with scaling a bucket's IO capacity, when you access a new range of the index, such as when writing objects under a new prefix, you should ramp up the request rate gradually, as described below. Not doing so may result in temporarily higher latency and error rates.

Best practices

The following sections provide best practices on how to ramp up the request rate, choose object keys, and distribute requests in order to avoid temporary limits on your bucket.

Ramp up request rate gradually

To ensure that Cloud Storage auto-scaling always provides the best performance, you should ramp up your request rate gradually for any bucket that hasn’t had a high request rate in several weeks or that has a new range of object keys. If your request rate is less than 1000 write requests per second or 5000 read requests per second, then no ramp-up is needed. If your request rate is expected to go over these thresholds, you should start with a request rate below or near the thresholds and then double the request rate no faster than every 20 minutes.
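The doubling rule can be turned into a simple schedule. The sketch below is one possible reading of the guideline, not an official tool: it starts at the write threshold of 1000 requests per second and doubles every 20 minutes until a target rate is reached. The function name and parameters are hypothetical.

```python
def ramp_schedule(target_rate, start_rate=1000, step_minutes=20):
    """Return (minute, max_rate) steps that double the rate up to target_rate."""
    if start_rate <= 0:
        raise ValueError("start_rate must be positive")
    schedule = []
    minute = 0
    rate = min(start_rate, target_rate)  # no ramp-up needed below the threshold
    while True:
        schedule.append((minute, rate))
        if rate >= target_rate:
            return schedule
        rate = min(rate * 2, target_rate)  # double, but never overshoot the target
        minute += step_minutes


for minute, rate in ramp_schedule(target_rate=10000):
    print(f"t+{minute} min: up to {rate} requests/s")
```

Under these assumptions, reaching 10000 write requests per second takes four doubling steps, or 80 minutes. Doubling more slowly than every 20 minutes is also consistent with the guideline.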

If you run into any issues such as increased latency or error rates, pause your ramp-up or reduce the request rate temporarily to give Cloud Storage more time to scale your bucket. You should use exponential backoff to retry your requests when you receive errors with 5xx or 429 response codes from Cloud Storage.
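A minimal sketch of exponential backoff, assuming a hypothetical zero-argument callable `send` that performs one request and returns its HTTP status code. The delay values, the attempt limit, and the added random jitter are illustrative choices rather than values prescribed by this page.

```python
import random
import time


def is_retryable(status):
    """True for the response codes this page says to retry: 429 and 5xx."""
    return status == 429 or 500 <= status <= 599


def call_with_backoff(send, max_attempts=6, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable returning an HTTP status code.
    Returns the last status code observed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status = send()
        if not is_retryable(status):
            return status
        if attempt < max_attempts - 1:
            # Wait 1 s, 2 s, 4 s, ... capped at max_delay, plus up to 1 s of jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, 1))
    return status
```

The `sleep` parameter exists only so the waiting can be replaced in tests. In production code, a client library's built-in retry support is usually preferable to a hand-written loop.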

Use a naming convention that distributes load evenly across key ranges

Auto-scaling of an index range can be slowed when using sequential names, such as object keys based on a sequence of numbers or timestamps. This occurs because requests are constantly shifting to a new index range, making redistributing the load harder and less effective.

In order to maintain a high request rate, avoid using sequential names. Using completely random object names will give you the best load distribution. If you want to use sequential numbers or timestamps as part of your object names, introduce randomness to the object names by adding a hash value before the sequence number or timestamp.

For example, if the original object names you want to use are:


You can compute the MD5 hash of the original object name and add the first 6 characters of the hash as a prefix to the object name. The new object names become:
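A minimal sketch of this prefixing scheme in Python. The timestamp-based object names and the "-" separator between the hash and the original name are assumptions made for illustration.

```python
import hashlib


def add_hash_prefix(object_name, prefix_length=6):
    """Prepend the first characters of the name's MD5 hex digest to the name."""
    digest = hashlib.md5(object_name.encode("utf-8")).hexdigest()
    # Assumption: a "-" separates the hash prefix from the original name.
    return f"{digest[:prefix_length]}-{object_name}"


# Hypothetical sequential, timestamp-based object names.
original_names = [
    "2016-05-10-12-00-00/file1",
    "2016-05-10-12-00-00/file2",
    "2016-05-10-12-00-01/file3",
]
for name in original_names:
    print(add_hash_prefix(name))
```

Because the prefix is derived from the name itself, the full object name can always be recomputed from the original name, so no lookup table is needed.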


Note that the random string doesn’t necessarily need to be at the beginning of the object name. Adding a random string after a common prefix still allows auto-scaling to work, but the effect is limited to that prefix, with no consideration of the rest of the bucket.

For example:


The above naming allows for efficient auto-scaling of objects in images/animals and images/landscape, but not images/clouds.
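Names of this shape could be generated as follows. This is a sketch only: the use of a random UUID as the random string and the file names are assumptions, while the prefixes are the ones mentioned above.

```python
import uuid


def randomized_name(common_prefix, file_name):
    """Insert a random string between a common prefix and the file name."""
    # Assumption: a random (version 4) UUID serves as the random string.
    return f"{common_prefix}/{uuid.uuid4()}/{file_name}"


print(randomized_name("images/animals", "1.jpg"))
print(randomized_name("images/landscape", "1.jpg"))
# Sequential names under a prefix, with no random component:
print("images/clouds/1.jpg")
```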

Reorder bulk operations to distribute load evenly across key ranges

Sometimes you'll want to perform a bulk upload or deletion of data in Cloud Storage. In both cases, you may not have control over the object names. Nevertheless, you often want to achieve the highest write or deletion rate possible.

Even if you are not able to choose the object names, you can control the order in which the objects are uploaded or deleted to achieve the same effect as using random names.

For example, say you want to upload a large, two-dimensional data set to Cloud Storage. The files are named data_file_[x]_[y], where the values of x and y range from "00000" to "99999". The following logic avoids uploading the files in sequential order by looping through the earlier part of the file name for each value of the later part.

for y in '00000' .. '99999':
  for x in '00000' .. '99999':
    upload file data_file_[x]_[y]

Note that in a real application, you should also upload the files in parallel to maximize efficiency.
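A runnable sketch of the loop order described above, written as a generator of file names so that the actual upload call can be supplied separately. The function name is hypothetical, and the demonstration uses a much smaller range than "00000" to "99999".

```python
from itertools import product


def upload_order(x_values, y_values):
    """Yield data_file_[x]_[y] names with x, the earlier part, varying fastest."""
    # product() varies its last argument fastest, so y is the outer loop
    # and x is the inner loop.
    for y, x in product(y_values, x_values):
        yield f"data_file_{x}_{y}"


values = [f"{i:05d}" for i in range(3)]  # small range for demonstration
for name in upload_order(values, values):
    print(name)
```

Consecutive names produced this way differ in the earlier part of the name, so successive requests land in different parts of the key index rather than walking through it in lexicographical order.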

Similarly, you should distribute the uploads or deletes across multiple prefixes. For example, if you have many folders and many files under each folder to upload, a good strategy is to upload from multiple folders in parallel and randomly choose which folders and files are uploaded. Doing so allows the system to distribute the load more evenly across entire key ranges, which allows you to achieve a high request rate after the initial ramp-up.
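One way to sketch this strategy is to shuffle all (folder, file) pairs and hand them to a thread pool. The `upload` callable is hypothetical and stands in for whatever performs a single upload; the worker count is an arbitrary choice.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def upload_shuffled(files_by_folder, upload, max_workers=8, seed=None):
    """Upload every (folder, file) pair in random order using a thread pool.

    `files_by_folder` maps a folder name to a list of file names.
    `upload` is a hypothetical callable taking (folder, file_name).
    Returns the shuffled list of pairs in the order they were submitted.
    """
    tasks = [(folder, name)
             for folder, names in files_by_folder.items()
             for name in names]
    random.Random(seed).shuffle(tasks)  # mix folders and files together
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for all uploads and re-raises the first exception, if any.
        list(pool.map(lambda task: upload(*task), tasks))
    return tasks


files = {"folder_a": ["1.dat", "2.dat"], "folder_b": ["1.dat", "2.dat"]}
upload_shuffled(files, lambda folder, name: print(f"uploading {folder}/{name}"))
```

The same pattern applies to bulk deletes by passing a callable that deletes instead of uploads.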
