Streaming Transfers

Cloud Storage supports streaming transfers with the gsutil tool or boto library, based on HTTP chunked transfer encoding. Streaming lets you send data to and from your Cloud Storage account as soon as it becomes available, without first saving the data to a separate file. Streaming transfers are useful if you have a process that generates data and you do not want to buffer it locally before uploading it, or if you want to send the result of a computational pipeline directly into Cloud Storage.

For more information on chunked transfer encoding, see RFC 7230 §4.1.
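
As a rough illustration of the framing that chunked transfer encoding uses, the following Python sketch encodes a sequence of byte strings the way the RFC describes: each chunk is preceded by its size in hexadecimal, and a zero-length chunk ends the body. The function name encode_chunked is hypothetical and is not part of gsutil or boto; both tools handle this framing for you.

```python
def encode_chunked(chunks):
    """Yield the HTTP/1.1 chunked framing for an iterable of byte strings."""
    for chunk in chunks:
        if not chunk:
            # A zero-length chunk would be read as the end of the body, so skip it.
            continue
        # Each chunk: size in hexadecimal, CRLF, the data, CRLF.
        yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    # The last chunk has size zero and is followed by an empty trailer section.
    yield b"0\r\n\r\n"


body = b"".join(encode_chunked([b"Wiki", b"pedia"]))
```

Because the sender only needs to know the size of each chunk, not the total size of the body, the data can be sent while it is still being produced.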

Streaming uploads and downloads using gsutil

To use gsutil to perform a streaming upload, pipe your data to a gsutil cp command and replace the file to be copied with a dash (-). The following example shows a process called collect_measurements whose output is being transferred to a Cloud Storage object named data_measurements:

collect_measurements | gsutil cp - gs://my_app_bucket/data_measurements
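
If the data is generated inside a Python program rather than by a separate command, the same pipe can be set up with the standard subprocess module. The following is a minimal sketch under that assumption; the names stream_to_command and measurements are hypothetical, and the gsutil invocation is shown only as a comment so that the sketch does not depend on gsutil being installed.

```python
import subprocess


def stream_to_command(chunks, command):
    """Write each byte chunk to the stdin of `command` as soon as it is produced."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE)
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
    finally:
        # Closing stdin signals end of input to the child process.
        proc.stdin.close()
    # Return the exit status of the child process.
    return proc.wait()


def measurements():
    """Stand-in for a process that generates data incrementally."""
    for i in range(3):
        yield ("sample %d\n" % i).encode("utf-8")


# Intended use, mirroring the shell pipeline above:
# stream_to_command(measurements(),
#                   ["gsutil", "cp", "-", "gs://my_app_bucket/data_measurements"])
```

A nonzero return value indicates that the command reported a failure, so callers would normally check it before treating the upload as complete.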

Similarly, to perform a streaming download using gsutil, run the gsutil cp command with a dash as the destination and pipe its output to the process that consumes the data:

gsutil cp gs://bucket/object - | <process data>

The following example shows the object named data_measurements being streamed and sorted:

gsutil cp gs://my_app_bucket/data_measurements - | sort

Streaming uploads and downloads using boto

To use boto to perform a streaming upload, use the following code:

import boto

dst_uri = boto.storage_uri(<bucket> + '/' + <object>, 'gs')
dst_uri.new_key().set_contents_from_stream(<stream object>)

For example, the following code performs a streaming upload of a file named data_file to an object with the same name:

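import boto

filename = 'data_file'
MY_BUCKET = 'my_app_bucket'
my_data = open(filename, 'rb')
dst_uri = boto.storage_uri(MY_BUCKET + '/' + filename, 'gs')
# Upload the open file object as a stream.
dst_uri.new_key().set_contents_from_stream(my_data)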
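
The stream object does not have to be a file on disk. When the data comes from a Python generator, a small file-like wrapper can present it as a stream. The sketch below is illustrative only: the class name IterStream is hypothetical and not part of boto, and it assumes that boto consumes the stream through read(size) calls and treats an IOError from tell() as a sign that the stream is not seekable. Check both assumptions against the boto version you use.

```python
class IterStream(object):
    """Minimal read-only, non-seekable file-like wrapper around an iterator of byte strings."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buffer = b""

    def read(self, size=-1):
        # Pull from the iterator until the request can be satisfied or the data ends.
        while size < 0 or len(self._buffer) < size:
            try:
                self._buffer += next(self._chunks)
            except StopIteration:
                break
        if size < 0:
            data, self._buffer = self._buffer, b""
        else:
            data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

    def tell(self):
        # The position is unavailable because the stream cannot be rewound.
        raise IOError("stream does not support tell()")


# Intended use with the upload call shown above (not executed here):
# dst_uri.new_key().set_contents_from_stream(IterStream(generate_data()))
```

Here generate_data stands for any generator of byte strings. Because such a stream cannot be rewound, a failed upload generally has to be restarted from the source of the data.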

To perform a streaming download using boto, use the following code:

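import sys
import boto

src_uri = boto.storage_uri(<bucket> + '/' + <object>, 'gs')
# Write the object's contents to standard output as they arrive.
src_uri.get_key().get_file(sys.stdout)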

For example, the following code performs a streaming download of an object named data_file:

import boto

downloaded_file = 'saved_data_file'
MY_BUCKET = 'my_app_bucket'
object_name = 'data_file'
src_uri = boto.storage_uri(MY_BUCKET + '/' + object_name, 'gs')
with open(downloaded_file, 'wb') as local_file:
    src_uri.get_key().get_file(local_file)
