Streaming transfers

Cloud Storage supports streaming transfers, which allow you to stream data to and from your Cloud Storage account without requiring that the data first be saved to a file. Streaming is useful when:

  • You want to upload data but don't know the final size at the start of the upload, such as when generating the upload data from a process, or when compressing an object on-the-fly.

  • You want to download data from Cloud Storage into a process.

Using checksum validation when streaming

Because a checksum can only be supplied in the initial request of an upload, it's often not feasible to utilize Cloud Storage's checksum validation when streaming. It's recommended that you always use checksum validation, and you can manually do so after a streaming upload completes; however, validating after the transfer completes means that any corrupted data is accessible during the time it takes to confirm the corruption and remove it.

If you require checksum validation prior to the upload completing and the data becoming accessible, then you should not use a streaming upload. You should use a different upload option that performs checksum validation prior to finalizing the object.

Similarly, you should not use a streaming download if you require checksum validation prior to the download completing and the data becoming accessible. This is because streaming downloads use the Range header, and Cloud Storage does not perform checksum validation on such requests.

Prerequisites

Prerequisites can vary based on the tool used:

Console

In order to complete this guide using the Google Cloud console, you must have the proper IAM permissions. If the bucket you want to access for streaming exists in a project that you did not create, you might need the project owner to give you a role that contains the necessary permissions.

For a list of permissions required for specific actions, see IAM permissions for the Google Cloud console.

For a list of relevant roles, see Cloud Storage roles. Alternatively, you can create a custom role that has specific, limited permissions.

Command line

In order to complete this guide using a command-line utility, you must have the proper IAM permissions. If the bucket you want to access for streaming exists in a project that you did not create, you might need the project owner to give you a role that contains the necessary permissions.

For a list of permissions required for specific actions, see IAM permissions for gsutil commands.

For a list of relevant roles, see Cloud Storage roles. Alternatively, you can create a custom role that has specific, limited permissions.

Code samples

In order to complete this guide using the Cloud Storage client libraries, you must have the proper IAM permissions. If the bucket you want to access for streaming exists in a project that you did not create, you might need the project owner to give you a role that contains the necessary permissions. Unless otherwise noted, client library requests are made through the JSON API.

For a list of permissions required for specific actions, see IAM permissions for JSON methods.

For a list of relevant roles, see Cloud Storage roles. Alternatively, you can create a custom role that has specific, limited permissions.

REST APIs

JSON API

In order to complete this guide using the JSON API, you must have the proper IAM permissions. If the bucket you want to access for streaming exists in a project that you did not create, you might need the project owner to give you a role that contains the necessary permissions.

For a list of permissions required for specific actions, see IAM permissions for JSON methods.

For a list of relevant roles, see Cloud Storage roles. Alternatively, you can create a custom role that has specific, limited permissions.

Stream an upload

The following examples show how to perform a streaming upload from a process to a Cloud Storage object:

Console

The Google Cloud console does not support streaming uploads. Use gsutil instead.

Command line

  1. Pipe the data to the gsutil cp command and use a dash for the source URL:

    PROCESS_NAME | gsutil cp - gs://BUCKET_NAME/OBJECT_NAME

    Where:

    • PROCESS_NAME is the name of the process from which you are collecting data. For example, collect_measurements.
    • BUCKET_NAME is the name of the bucket containing the object. For example, my_app_bucket.
    • OBJECT_NAME is the name of the object that is created from the data. For example, data_measurements.

Code samples

C++

For more information, see the Cloud Storage C++ API reference documentation.

namespace gcs = ::google::cloud::storage;
using ::google::cloud::StatusOr;
[](gcs::Client client, std::string const& bucket_name,
   std::string const& object_name, int desired_line_count) {
  std::string const text = "Lorem ipsum dolor sit amet";
  gcs::ObjectWriteStream stream =
      client.WriteObject(bucket_name, object_name);

  for (int lineno = 0; lineno != desired_line_count; ++lineno) {
    // Add 1 to the counter, because it is conventional to number lines
    // starting at 1.
    stream << (lineno + 1) << ": " << text << "\n";
  }

  stream.Close();

  StatusOr<gcs::ObjectMetadata> metadata = std::move(stream).metadata();
  if (!metadata) throw std::runtime_error(metadata.status().message());
  std::cout << "Successfully wrote to object " << metadata->name()
            << " its size is: " << metadata->size()
            << "\nFull metadata: " << *metadata << "\n";
}

C#

For more information, see the Cloud Storage C# API reference documentation.


using Google.Cloud.Storage.V1;
using System;
using System.IO;

public class UploadFileSample
{
    public void UploadFile(
        string bucketName = "your-unique-bucket-name",
        string localPath = "my-local-path/my-file-name",
        string objectName = "my-file-name")
    {
        var storage = StorageClient.Create();
        using var fileStream = File.OpenRead(localPath);
        storage.UploadObject(bucketName, objectName, null, fileStream);
        Console.WriteLine($"Uploaded {objectName}.");
    }
}

Go

For more information, see the Cloud Storage Go API reference documentation.

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"time"

	"cloud.google.com/go/storage"
)

// streamFileUpload uploads an object via a stream.
func streamFileUpload(w io.Writer, bucket, object string) error {
	// bucket := "bucket-name"
	// object := "object-name"
	ctx := context.Background()
	client, err := storage.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("storage.NewClient: %v", err)
	}
	defer client.Close()

	b := []byte("Hello world.")
	buf := bytes.NewBuffer(b)

	ctx, cancel := context.WithTimeout(ctx, time.Second*50)
	defer cancel()

	// Upload an object with storage.Writer.
	wc := client.Bucket(bucket).Object(object).NewWriter(ctx)
	wc.ChunkSize = 0 // note retries are not supported for chunk size 0.

	if _, err = io.Copy(wc, buf); err != nil {
		return fmt.Errorf("io.Copy: %v", err)
	}
	// Data can continue to be added to the file until the writer is closed.
	if err := wc.Close(); err != nil {
		return fmt.Errorf("Writer.Close: %v", err)
	}
	fmt.Fprintf(w, "%v uploaded to %v.\n", object, bucket)

	return nil
}

Java

For more information, see the Cloud Storage Java API reference documentation.


import com.google.cloud.WriteChannel;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class StreamObjectUpload {

  public static void streamObjectUpload(
      String projectId, String bucketName, String objectName, String contents) throws IOException {
    // The ID of your GCP project
    // String projectId = "your-project-id";

    // The ID of your GCS bucket
    // String bucketName = "your-unique-bucket-name";

    // The ID of your GCS object
    // String objectName = "your-object-name";

    // The string of contents you wish to upload
    // String contents = "Hello world!";

    Storage storage = StorageOptions.newBuilder().setProjectId(projectId).build().getService();
    BlobId blobId = BlobId.of(bucketName, objectName);
    BlobInfo blobInfo = BlobInfo.newBuilder(blobId).build();
    byte[] content = contents.getBytes(StandardCharsets.UTF_8);
    try (WriteChannel writer = storage.writer(blobInfo)) {
      writer.write(ByteBuffer.wrap(content));
      System.out.println(
          "Wrote to " + objectName + " in bucket " + bucketName + " using a WriteChannel.");
    }
  }
}

Node.js

For more information, see the Cloud Storage Node.js API reference documentation.

/**
 * TODO(developer): Uncomment the following lines before running the sample
 */
// The ID of your GCS bucket
// const bucketName = 'your-unique-bucket-name';

// The new ID for your GCS file
// const destFileName = 'your-new-file-name';

// Imports the Google Cloud client library
const {Storage} = require('@google-cloud/storage');

// Import Node.js stream
const stream = require('stream');

// Creates a client
const storage = new Storage();

// Get a reference to the bucket
const myBucket = storage.bucket(bucketName);

// Create a reference to a file object
const file = myBucket.file(destFileName);

// Create a pass through stream from a string
const passthroughStream = new stream.PassThrough();
passthroughStream.write('input text');
passthroughStream.end();

async function streamFileUpload() {
  passthroughStream.pipe(file.createWriteStream()).on('finish', () => {
    // The file upload is complete
  });

  console.log(`${destFileName} uploaded to ${bucketName}`);
}

streamFileUpload().catch(console.error);

PHP

For more information, see the Cloud Storage PHP API reference documentation.

use Google\Cloud\Storage\StorageClient;
use Google\Cloud\Storage\WriteStream;

/**
 * Upload a chunked file stream.
 *
 * @param string $bucketName The name of your Cloud Storage bucket.
 * @param string $objectName The name of your Cloud Storage object.
 * @param string $contents The contents to upload via stream chunks.
 */
function upload_object_stream(string $bucketName, string $objectName, string $contents): void
{
    // $bucketName = 'my-bucket';
    // $objectName = 'my-object';
    // $contents = 'these are my contents';

    $storage = new StorageClient();
    $bucket = $storage->bucket($bucketName);
    $writeStream = new WriteStream(null, [
        'chunkSize' => 1024 * 256, // 256KB
    ]);
    $uploader = $bucket->getStreamableUploader($writeStream, [
        'name' => $objectName,
    ]);
    $writeStream->setUploader($uploader);
    $stream = fopen('data://text/plain,' . $contents, 'r');
    while (($line = stream_get_line($stream, 1024 * 256)) !== false) {
        $writeStream->write($line);
    }
    $writeStream->close();

    printf('Uploaded %s to gs://%s/%s' . PHP_EOL, $contents, $bucketName, $objectName);
}

Python

For more information, see the Cloud Storage Python API reference documentation.

from google.cloud import storage


def upload_blob_from_stream(bucket_name, file_obj, destination_blob_name):
    """Uploads bytes from a stream or other file-like object to a blob."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"

    # The stream or file (file-like object) from which to read
    # import io
    # file_obj = io.StringIO()
    # file_obj.write("This is test data.")

    # The desired name of the uploaded GCS object (blob)
    # destination_blob_name = "storage-object-name"

    # Construct a client-side representation of the blob.
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    # Rewind the stream to the beginning. This step can be omitted if the input
    # stream will always be at a correct position.
    file_obj.seek(0)

    # Upload data from the stream to your bucket.
    blob.upload_from_file(file_obj)

    print(
        f"Stream data uploaded to {destination_blob_name} in bucket {bucket_name}."
    )

Ruby

For more information, see the Cloud Storage Ruby API reference documentation.


# The ID of your GCS bucket
# bucket_name = "your-unique-bucket-name"

# The stream or file (file-like object) from which to read
# local_file_obj = StringIO.new "This is test data."

# Name of a file in the Storage bucket
# file_name   = "some_file.txt"

require "google/cloud/storage"

storage = Google::Cloud::Storage.new
bucket  = storage.bucket bucket_name

local_file_obj.rewind
bucket.create_file local_file_obj, file_name

puts "Stream data uploaded to #{file_name} in bucket #{bucket_name}"

REST APIs

JSON API

To perform a streaming upload, follow the instructions for performing a resumable upload with the following considerations:

  • When uploading the file data itself, use a multiple chunk upload.

  • Since you don't know the total file size until you get to the final chunk, use a * for the total file size in the Content-Range header of intermediate chunks.

    For example, if the first chunk you upload has a size of 512 KiB, the Content-Range header for the chunk is bytes 0-524287/*. If your upload has 64000 bytes remaining after the first chunk, you then send a final chunk that contains the remaining bytes and has a Content-Range header with the value bytes 524288-588287/588288.

XML API

To perform a streaming upload, use one of the following methods:

  • An XML API multipart upload

  • A resumable upload, with the following adjustments:

    • When uploading the file data itself, use a multiple chunk upload.

    • Since you don't know the total file size until you get to the final chunk, use a * for the total file size in the Content-Range header of intermediate chunks.

      For example, if the first chunk you upload has a size of 512 KiB, the Content-Range header for the chunk is bytes 0-524287/*. If your upload has 64000 bytes remaining after the first chunk, you then send a final chunk that contains the remaining bytes and has a Content-Range header with the value bytes 524288-588287/588288.

Stream a download

The following examples show how to perform a download from a Cloud Storage object to a process:

Console

The Google Cloud console does not support streaming downloads. Use gsutil instead.

Command line

  1. Run the gsutil cp command using a dash for the destination URL, then pipe the data to the process:

    gsutil cp gs://BUCKET_NAME/OBJECT_NAME - | PROCESS_NAME

    Where:

    • BUCKET_NAME is the name of the bucket containing the object. For example, my_app_bucket.
    • OBJECT_NAME is the name of the object that you are streaming to the process. For example, data_measurements.
    • PROCESS_NAME is the name of the process into which you are feeding data. For example, analyze_data.

You can also stream data from a Cloud Storage object to a standard Linux command like sort:

gsutil cp gs://my_app_bucket/data_measurements - | sort

Code samples

C++

For more information, see the Cloud Storage C++ API reference documentation.

namespace gcs = ::google::cloud::storage;
[](gcs::Client client, std::string const& bucket_name,
   std::string const& object_name) {
  gcs::ObjectReadStream stream = client.ReadObject(bucket_name, object_name);

  int count = 0;
  std::string line;
  while (std::getline(stream, line, '\n')) {
    ++count;
  }

  std::cout << "The object has " << count << " lines\n";
}

C#

For more information, see the Cloud Storage C# API reference documentation.


using Google.Cloud.Storage.V1;
using System;
using System.IO;

public class DownloadFileSample
{
    public void DownloadFile(
        string bucketName = "your-unique-bucket-name",
        string objectName = "my-file-name",
        string localPath = "my-local-path/my-file-name")
    {
        var storage = StorageClient.Create();
        using var outputFile = File.OpenWrite(localPath);
        storage.DownloadObject(bucketName, objectName, outputFile);
        Console.WriteLine($"Downloaded {objectName} to {localPath}.");
    }
}

Go

For more information, see the Cloud Storage Go API reference documentation.


import (
	"context"
	"fmt"
	"io"
	"io/ioutil"
	"time"

	"cloud.google.com/go/storage"
)

// downloadFileIntoMemory downloads an object.
func downloadFileIntoMemory(w io.Writer, bucket, object string) ([]byte, error) {
	// bucket := "bucket-name"
	// object := "object-name"
	ctx := context.Background()
	client, err := storage.NewClient(ctx)
	if err != nil {
		return nil, fmt.Errorf("storage.NewClient: %v", err)
	}
	defer client.Close()

	ctx, cancel := context.WithTimeout(ctx, time.Second*50)
	defer cancel()

	rc, err := client.Bucket(bucket).Object(object).NewReader(ctx)
	if err != nil {
		return nil, fmt.Errorf("Object(%q).NewReader: %v", object, err)
	}
	defer rc.Close()

	data, err := ioutil.ReadAll(rc)
	if err != nil {
		return nil, fmt.Errorf("ioutil.ReadAll: %v", err)
	}
	fmt.Fprintf(w, "Blob %v downloaded.\n", object)
	return data, nil
}

Java

For more information, see the Cloud Storage Java API reference documentation.


import com.google.cloud.ReadChannel;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import com.google.common.io.ByteStreams;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class StreamObjectDownload {

  public static void streamObjectDownload(
      String projectId, String bucketName, String objectName, String targetFile)
      throws IOException {
    // The ID of your GCP project
    // String projectId = "your-project-id";

    // The ID of your GCS bucket
    // String bucketName = "your-unique-bucket-name";

    // The ID of your GCS object
    // String objectName = "your-object-name";

    // The path to the file to download the object to
    // String targetFile = "path/to/your/file";
    Path targetFilePath = Paths.get(targetFile);

    Storage storage = StorageOptions.newBuilder().setProjectId(projectId).build().getService();
    try (ReadChannel reader = storage.reader(BlobId.of(bucketName, objectName));
        FileChannel targetFileChannel =
            FileChannel.open(targetFilePath, StandardOpenOption.WRITE)) {

      ByteStreams.copy(reader, targetFileChannel);

      System.out.println(
          "Downloaded object "
              + objectName
              + " from bucket "
              + bucketName
              + " to "
              + targetFile
              + " using a ReadChannel.");
    }
  }
}

Node.js

For more information, see the Cloud Storage Node.js API reference documentation.

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// The ID of your GCS bucket
// const bucketName = 'your-unique-bucket-name';

// The ID of your GCS file
// const fileName = 'your-file-name';

// The filename and file path where you want to download the file
// const destFileName = '/local/path/to/file.txt';

// Imports the Google Cloud client library
const {Storage} = require('@google-cloud/storage');

// Creates a client
const storage = new Storage();

async function streamFileDownload() {
  // The example below demonstrates how we can reference a remote file, then
  // pipe its contents to a local file.
  // Once the stream is created, the data can be piped anywhere (process, sdout, etc)
  await storage
    .bucket(bucketName)
    .file(fileName)
    .createReadStream() //stream is created
    .pipe(fs.createWriteStream(destFileName))
    .on('finish', () => {
      // The file download is complete
    });

  console.log(
    `gs://${bucketName}/${fileName} downloaded to ${destFileName}.`
  );
}

streamFileDownload().catch(console.error);

PHP

For more information, see the Cloud Storage PHP API reference documentation.

use Google\Cloud\Storage\StorageClient;

/**
 * Download an object from Cloud Storage and save it as a local file.
 *
 * @param string $bucketName The name of your Cloud Storage bucket.
 * @param string $objectName The name of your Cloud Storage object.
 * @param string $destination The local destination to save the object.
 */
function download_object($bucketName, $objectName, $destination)
{
    // $bucketName = 'my-bucket';
    // $objectName = 'my-object';
    // $destination = '/path/to/your/file';

    $storage = new StorageClient();
    $bucket = $storage->bucket($bucketName);
    $object = $bucket->object($objectName);
    $object->downloadToFile($destination);
    printf(
        'Downloaded gs://%s/%s to %s' . PHP_EOL,
        $bucketName,
        $objectName,
        basename($destination)
    );
}

Python

For more information, see the Cloud Storage Python API reference documentation.

from google.cloud import storage


def download_blob_to_stream(bucket_name, source_blob_name, file_obj):
    """Downloads a blob to a stream or other file-like object."""

    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"

    # The ID of your GCS object (blob)
    # source_blob_name = "storage-object-name"

    # The stream or file (file-like object) to which the blob will be written
    # import io
    # file_obj = io.BytesIO()

    storage_client = storage.Client()

    bucket = storage_client.bucket(bucket_name)

    # Construct a client-side representation of a blob.
    # Note `Bucket.blob` differs from `Bucket.get_blob` in that it doesn't
    # retrieve metadata from Google Cloud Storage. As we don't use metadata in
    # this example, using `Bucket.blob` is preferred here.
    blob = bucket.blob(source_blob_name)
    blob.download_to_file(file_obj)

    print(f"Downloaded blob {source_blob_name} to file-like object.")

    return file_obj
    # Before reading from file_obj, remember to rewind with file_obj.seek(0).

Ruby

For more information, see the Cloud Storage Ruby API reference documentation.

# Downloads a blob to a stream or other file-like object.

# The ID of your GCS bucket
# bucket_name = "your-unique-bucket-name"

# Name of a file in the Storage bucket
# file_name   = "some_file.txt"

# The stream or file (file-like object) to which the contents will be written
# local_file_obj = StringIO.new

require "google/cloud/storage"

storage = Google::Cloud::Storage.new
bucket  = storage.bucket bucket_name
file    = bucket.file file_name

file.download local_file_obj, verify: :none

# rewind the object before starting to read the downloaded contents
local_file_obj.rewind
puts "The full downloaded file contents are: #{local_file_obj.read.inspect}"

REST APIs

JSON API

To perform a streaming download, follow the instructions for downloading an object with the following considerations:

  • Before beginning the download, retrieve the object's metadata and save the object's generation number. Include this generation number in each of your requests to ensure that you don't download data from two different generations in the event the original gets overwritten.

  • Use the Range header in your request to retrieve a piece of the overall object, which you can send to the desired local process.

  • Continue making requests for successive pieces of the object, until the entire object has been retrieved.

XML API

To perform a streaming download, follow the instructions for downloading an object with the following considerations:

  • Before beginning the download, retrieve the object's metadata and save the object's generation number. Include this generation number in each of your requests to ensure that you don't download data from two different generations in the event the original gets overwritten.

  • Use the Range header in your request to retrieve a piece of the overall object, which you can send to the desired local process.

  • Continue making requests for successive pieces of the object, until the entire object has been retrieved.

What's next