Optimize Cloud Storage Upload Performance with Client Libraries
Sameena Shaffeeullah
Developer Relations Engineer
Resumable and multipart uploads are two ways of sending data to Cloud Storage, each with its own advantages. The upload type and its configuration affect speed, memory usage, and retry behavior. There are three things you can do to optimize upload performance:
1. Choose resumable uploads for larger file sizes
Resumable uploads let you efficiently upload large files by sending data in smaller parts, also called "chunks". Resumable uploads require an additional request to initiate the upload, so they are less efficient for uploading smaller files.
All client libraries default to either JSON multipart or resumable uploads, except for C++ and Java, which have separate APIs for each type of upload. For example, Node.js and Ruby always default to resumable uploads, while C#, Go, PHP, and Python switch between resumable and multipart uploads depending on the size of the file you upload. For some languages, you can override the default behavior and force a resumable upload. You can learn more about the specific behaviors in the resumable uploads documentation.
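To make the size-based switching concrete, here is a minimal sketch in Python. It does not reproduce any client library's actual logic: the function names and the 8 MiB cutoff are assumptions chosen for illustration, and each library picks its own threshold.

```python
import math

# Hypothetical cutoff for illustration; each client library picks its own.
RESUMABLE_THRESHOLD_BYTES = 8 * 1024 * 1024


def choose_upload_type(size_bytes, threshold=RESUMABLE_THRESHOLD_BYTES):
    """Pick an upload type from the file size alone."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "resumable" if size_bytes > threshold else "multipart"


def request_count(size_bytes, upload_type, chunk_size):
    """Estimate how many HTTP requests an upload needs, ignoring retries."""
    if upload_type == "multipart":
        return 1  # metadata and data travel together in one request
    # One request to initiate the session, then one per chunk (at least one).
    return 1 + max(1, math.ceil(size_bytes / chunk_size))
```

Under these assumptions, a 100 KiB file needs one request as a multipart upload but two as a resumable upload, which is why the extra initiation request matters most for small files.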
2. If you’re performing a resumable upload, make sure you’re using an appropriate chunk size
Figure: Simplified representation of a chunked resumable upload
For resumable uploads, the chunk size is the maximum amount of data that can be sent in a single request. Some client libraries set a default chunk size that you can override, while others require you to specify one yourself. Larger chunks typically make uploads faster because fewer requests are needed, but they also use more memory, so there is a tradeoff between speed and memory usage. In a chunked upload, if the request for a given chunk fails, the library retries only that chunk. This is much faster than sending all your data in one chunk and retrying the whole file.
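The sketch below shows how a file might be split into chunk-sized byte ranges. The function name `plan_chunks` is hypothetical, and the check that the chunk size is a multiple of 256 KiB reflects the Cloud Storage guidance for resumable uploads rather than anything stated in this post.

```python
# Cloud Storage expects resumable chunk sizes to be multiples of 256 KiB
# (the final chunk may be smaller). Treat this constant as an assumption.
CHUNK_ALIGNMENT = 256 * 1024


def plan_chunks(total_size, chunk_size):
    """Return inclusive (start, end) byte ranges, one per upload request."""
    if total_size < 0:
        raise ValueError("total_size must be non-negative")
    if chunk_size <= 0 or chunk_size % CHUNK_ALIGNMENT != 0:
        raise ValueError("chunk_size must be a positive multiple of 256 KiB")
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]
```

For a 1,000,000-byte file and 256 KiB chunks, this yields four ranges, the last being (786432, 999999). A failed request would resend at most one range, while a larger chunk size would reduce the number of ranges at the cost of holding more data in memory at once.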
3. Set preconditions to allow the library to automatically retry
Most client libraries only provide automatic retries for upload requests that are idempotent. By default, upload requests aren't idempotent, but you can make a request idempotent by including a generation precondition: the ifGenerationMatch query parameter in the JSON API, or the x-goog-if-generation-match header in the XML API. With the precondition set, the client library can retry transient errors on your behalf.
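To see why a generation precondition makes a retry safe, consider this small in-memory simulation. `FakeBucket` and `PreconditionFailed` are hypothetical stand-ins, not part of any client library; the only behavior borrowed from Cloud Storage is that a generation match of 0 succeeds only when no live object exists.

```python
class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


class FakeBucket:
    """Hypothetical in-memory bucket that tracks object generations."""

    def __init__(self):
        self._objects = {}  # name -> (generation, data)
        self._next_generation = 1

    def generation(self, name):
        # 0 means "no live object", mirroring the ifGenerationMatch=0 idiom.
        return self._objects[name][0] if name in self._objects else 0

    def data(self, name):
        return self._objects[name][1]

    def upload(self, name, data, if_generation_match=None):
        current = self.generation(name)
        if if_generation_match is not None and if_generation_match != current:
            raise PreconditionFailed(
                f"expected generation {if_generation_match}, found {current}"
            )
        generation = self._next_generation
        self._next_generation += 1
        self._objects[name] = (generation, data)
        return generation
```

If the first upload with `if_generation_match=0` succeeds but its response is lost, a retry with the same precondition raises `PreconditionFailed` instead of silently writing a second time. In the Python client library, for instance, upload methods such as `Blob.upload_from_filename` accept an `if_generation_match` argument for this purpose.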
Alternatively, client libraries also support a "retry always" escape hatch, which retries on transient errors regardless of whether the request is idempotent. Be cautious when enabling "retry always", as it can corrupt your data or metadata.
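The retry decision described above can be sketched as a small policy function. This is an illustration rather than any library's implementation: the names, the set of status codes treated as transient, and the attempt limit are all assumptions, and backoff between attempts is omitted for brevity.

```python
# Assumed set of transient HTTP status codes, for illustration only.
TRANSIENT_STATUS_CODES = {408, 429, 500, 502, 503, 504}


def should_retry(status_code, idempotent, retry_always=False):
    """Retry only transient errors, and only when it is considered safe."""
    if status_code not in TRANSIENT_STATUS_CODES:
        return False
    return idempotent or retry_always


def upload_with_retries(send, idempotent, retry_always=False, max_attempts=3):
    """Call send() until success, a non-retryable error, or attempts run out.

    send is a hypothetical callable that performs one upload request and
    returns its HTTP status code. Returns (final_status, attempts_made).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status < 400:
            return status, attempt
        if not should_retry(status, idempotent, retry_always):
            return status, attempt
    return status, max_attempts
```

With a request that fails once with 503 and then succeeds, the idempotent and "retry always" cases both finish on the second attempt, while a non-idempotent request without "retry always" stops after the first failure.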
What’s Next
Explore the chunked upload behavior for specific languages in client libraries.
See code samples for different methods of uploading, such as streaming transfers or uploading from a file or from memory.
Learn more about retries in the client libraries.
Understand how preconditions work for Cloud Storage.