Streaming uploads

Cloud Storage supports streaming data to a bucket without requiring that the data first be saved to a file. This is useful when you want to upload data but don't know the final size at the start of the upload, such as when generating the upload data from a process, or when compressing an object on-the-fly.

Using checksum validation when streaming

Because a checksum can only be supplied in the initial request of an upload, it's often not feasible to use Cloud Storage's checksum validation when streaming. It's recommended that you always use checksum validation, and you can manually do so after a streaming upload completes; however, validating after the transfer completes means that any corrupted data is accessible during the time it takes to confirm the corruption and remove it.

If you require checksum validation prior to the upload completing and the data becoming accessible, then you should not use a streaming upload. You should use a different upload option that performs checksum validation prior to finalizing the object.
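One way to validate manually is to compute a CRC32C digest locally while you generate the data, then compare it with the checksum that Cloud Storage records for the finished object. The following sketch uses the Python client library and the google-crc32c package; the function name and the iterable of byte chunks are illustrative, not part of any API.

import base64

import google_crc32c
from google.cloud import storage


def stream_then_validate(bucket_name, blob_name, chunks):
    """Streams an iterable of byte chunks to a blob, then checks its CRC32C."""
    checksum = google_crc32c.Checksum()
    blob = storage.Client().bucket(bucket_name).blob(blob_name)

    # Stream the data while accumulating a local CRC32C digest.
    with blob.open("wb") as writer:
        for chunk in chunks:
            checksum.update(chunk)
            writer.write(chunk)

    # Compare the local digest with the checksum stored on the object.
    blob.reload()
    local_crc32c = base64.b64encode(checksum.digest()).decode("utf-8")
    if blob.crc32c != local_crc32c:
        # The object is already readable at this point, so remove it promptly.
        blob.delete()
        raise ValueError("CRC32C mismatch; corrupted object deleted")

Note that the object remains readable between the end of the upload and the checksum comparison, which is the exposure window described above.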

Prerequisites

Prerequisites can vary based on the tool used:

Console

In order to complete this guide using the Google Cloud console, you must have the proper IAM permissions. If the bucket you want to access for streaming exists in a project that you did not create, you might need the project owner to give you a role that contains the necessary permissions.

For a list of permissions required for specific actions, see IAM permissions for the Google Cloud console.

For a list of relevant roles, see Cloud Storage roles. Alternatively, you can create a custom role that has specific, limited permissions.

Command line

In order to complete this guide using a command-line utility, you must have the proper IAM permissions. If the bucket you want to access for streaming exists in a project that you did not create, you might need the project owner to give you a role that contains the necessary permissions.

For a list of permissions required for specific actions, see IAM permissions for gcloud storage commands.

For a list of relevant roles, see Cloud Storage roles. Alternatively, you can create a custom role that has specific, limited permissions.

Client libraries

In order to complete this guide using the Cloud Storage client libraries, you must have the proper IAM permissions. If the bucket you want to access for streaming exists in a project that you did not create, you might need the project owner to give you a role that contains the necessary permissions.

Unless otherwise noted, client library requests are made through the JSON API and require permissions as listed in IAM permissions for JSON methods. To see which JSON API methods are invoked when you make requests using a client library, log the raw requests.

For a list of relevant IAM roles, see Cloud Storage roles. Alternatively, you can create a custom role that has specific, limited permissions.

REST APIs

JSON API

In order to complete this guide using the JSON API, you must have the proper IAM permissions. If the bucket you want to access for streaming exists in a project that you did not create, you might need the project owner to give you a role that contains the necessary permissions.

For a list of permissions required for specific actions, see IAM permissions for JSON methods.

For a list of relevant roles, see Cloud Storage roles. Alternatively, you can create a custom role that has specific, limited permissions.

Stream an upload

The following examples show how to perform a streaming upload from a process to a Cloud Storage object:

Console

The Google Cloud console does not support streaming uploads. Use the gcloud CLI instead.

Command line

  1. Pipe the data to the gcloud storage cp command and use a dash for the source URL:

    PROCESS_NAME | gcloud storage cp - gs://BUCKET_NAME/OBJECT_NAME

    Where:

    • PROCESS_NAME is the name of the process from which you are collecting data. For example, collect_measurements.
    • BUCKET_NAME is the name of the bucket containing the object. For example, my_app_bucket.
    • OBJECT_NAME is the name of the object that is created from the data. For example, data_measurements.
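    For example, using the sample values above:

    collect_measurements | gcloud storage cp - gs://my_app_bucket/data_measurements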

Client libraries

C++

For more information, see the Cloud Storage C++ API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

namespace gcs = ::google::cloud::storage;
using ::google::cloud::StatusOr;
[](gcs::Client client, std::string const& bucket_name,
   std::string const& object_name, int desired_line_count) {
  std::string const text = "Lorem ipsum dolor sit amet";
  gcs::ObjectWriteStream stream =
      client.WriteObject(bucket_name, object_name);

  for (int lineno = 0; lineno != desired_line_count; ++lineno) {
    // Add 1 to the counter, because it is conventional to number lines
    // starting at 1.
    stream << (lineno + 1) << ": " << text << "\n";
  }

  stream.Close();

  StatusOr<gcs::ObjectMetadata> metadata = std::move(stream).metadata();
  if (!metadata) throw std::move(metadata).status();
  std::cout << "Successfully wrote to object " << metadata->name()
            << " its size is: " << metadata->size()
            << "\nFull metadata: " << *metadata << "\n";
}

C#

For more information, see the Cloud Storage C# API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


using Google.Cloud.Storage.V1;
using System;
using System.IO;
using System.Text;

public class UploadObjectFromStreamSample
{
    public void UploadObjectFromStream(
        string bucketName = "your-unique-bucket-name",
        string objectName = "your-object-name",
        string contents = "Hello world!")
    {
        var storage = StorageClient.Create();
        // Any readable Stream works here; a MemoryStream stands in for data
        // generated by a process, so nothing needs to be written to disk first.
        using var stream = new MemoryStream(Encoding.UTF8.GetBytes(contents));
        storage.UploadObject(bucketName, objectName, "text/plain", stream);
        Console.WriteLine($"Uploaded {objectName}.");
    }
}

Go

For more information, see the Cloud Storage Go API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"time"

	"cloud.google.com/go/storage"
)

// streamFileUpload uploads an object via a stream.
func streamFileUpload(w io.Writer, bucket, object string) error {
	// bucket := "bucket-name"
	// object := "object-name"
	ctx := context.Background()
	client, err := storage.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("storage.NewClient: %w", err)
	}
	defer client.Close()

	b := []byte("Hello world.")
	buf := bytes.NewBuffer(b)

	ctx, cancel := context.WithTimeout(ctx, time.Second*50)
	defer cancel()

	// Upload an object with storage.Writer.
	wc := client.Bucket(bucket).Object(object).NewWriter(ctx)
	wc.ChunkSize = 0 // note retries are not supported for chunk size 0.

	if _, err = io.Copy(wc, buf); err != nil {
		return fmt.Errorf("io.Copy: %w", err)
	}
	// Data can continue to be added to the file until the writer is closed.
	if err := wc.Close(); err != nil {
		return fmt.Errorf("Writer.Close: %w", err)
	}
	fmt.Fprintf(w, "%v uploaded to %v.\n", object, bucket)

	return nil
}

Java

For more information, see the Cloud Storage Java API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


import com.google.cloud.WriteChannel;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class StreamObjectUpload {

  public static void streamObjectUpload(
      String projectId, String bucketName, String objectName, String contents) throws IOException {
    // The ID of your GCP project
    // String projectId = "your-project-id";

    // The ID of your GCS bucket
    // String bucketName = "your-unique-bucket-name";

    // The ID of your GCS object
    // String objectName = "your-object-name";

    // The string of contents you wish to upload
    // String contents = "Hello world!";

    Storage storage = StorageOptions.newBuilder().setProjectId(projectId).build().getService();
    BlobId blobId = BlobId.of(bucketName, objectName);
    BlobInfo blobInfo = BlobInfo.newBuilder(blobId).build();
    byte[] content = contents.getBytes(StandardCharsets.UTF_8);
    try (WriteChannel writer = storage.writer(blobInfo)) {
      writer.write(ByteBuffer.wrap(content));
      System.out.println(
          "Wrote to " + objectName + " in bucket " + bucketName + " using a WriteChannel.");
    }
  }
}

Node.js

For more information, see the Cloud Storage Node.js API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment the following lines before running the sample
 */
// The ID of your GCS bucket
// const bucketName = 'your-unique-bucket-name';

// The new ID for your GCS file
// const destFileName = 'your-new-file-name';

// The content to be uploaded in the GCS file
// const contents = 'your file content';

// Imports the Google Cloud client library
const {Storage} = require('@google-cloud/storage');

// Import Node.js stream
const stream = require('stream');

// Creates a client
const storage = new Storage();

// Get a reference to the bucket
const myBucket = storage.bucket(bucketName);

// Create a reference to a file object
const file = myBucket.file(destFileName);

// Create a pass through stream from a string
const passthroughStream = new stream.PassThrough();
passthroughStream.write(contents);
passthroughStream.end();

async function streamFileUpload() {
  // Wait for the upload to finish (or fail) before reporting success.
  await new Promise((resolve, reject) => {
    passthroughStream
      .pipe(file.createWriteStream())
      .on('finish', resolve)
      .on('error', reject);
  });

  console.log(`${destFileName} uploaded to ${bucketName}`);
}

streamFileUpload().catch(console.error);

PHP

For more information, see the Cloud Storage PHP API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

use Google\Cloud\Storage\StorageClient;
use Google\Cloud\Storage\WriteStream;

/**
 * Upload a chunked file stream.
 *
 * @param string $bucketName The name of your Cloud Storage bucket.
 *        (e.g. 'my-bucket')
 * @param string $objectName The name of your Cloud Storage object.
 *        (e.g. 'my-object')
 * @param string $contents The contents to upload via stream chunks.
 *        (e.g. 'these are my contents')
 */
function upload_object_stream(string $bucketName, string $objectName, string $contents): void
{
    $storage = new StorageClient();
    $bucket = $storage->bucket($bucketName);
    $writeStream = new WriteStream(null, [
        'chunkSize' => 1024 * 256, // 256KB
    ]);
    $uploader = $bucket->getStreamableUploader($writeStream, [
        'name' => $objectName,
    ]);
    $writeStream->setUploader($uploader);
    $stream = fopen('data://text/plain,' . $contents, 'r');
    while (($line = stream_get_line($stream, 1024 * 256)) !== false) {
        $writeStream->write($line);
    }
    $writeStream->close();

    printf('Uploaded %s to gs://%s/%s' . PHP_EOL, $contents, $bucketName, $objectName);
}

Python

For more information, see the Cloud Storage Python API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import storage


def upload_blob_from_stream(bucket_name, file_obj, destination_blob_name):
    """Uploads bytes from a stream or other file-like object to a blob."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"

    # The stream or file (file-like object) from which to read
    # import io
    # file_obj = io.BytesIO()
    # file_obj.write(b"This is test data.")

    # The desired name of the uploaded GCS object (blob)
    # destination_blob_name = "storage-object-name"

    # Construct a client-side representation of the blob.
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    # Rewind the stream to the beginning. This step can be omitted if the input
    # stream will always be at a correct position.
    file_obj.seek(0)

    # Upload data from the stream to your bucket.
    blob.upload_from_file(file_obj)

    print(
        f"Stream data uploaded to {destination_blob_name} in bucket {bucket_name}."
    )

Ruby

For more information, see the Cloud Storage Ruby API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


# The ID of your GCS bucket
# bucket_name = "your-unique-bucket-name"

# The stream or file (file-like object) from which to read
# local_file_obj = StringIO.new "This is test data."

# Name of a file in the Storage bucket
# file_name   = "some_file.txt"

require "google/cloud/storage"

storage = Google::Cloud::Storage.new
bucket  = storage.bucket bucket_name

local_file_obj.rewind
bucket.create_file local_file_obj, file_name

puts "Stream data uploaded to #{file_name} in bucket #{bucket_name}"

REST APIs

JSON API

To perform a streaming upload, follow the instructions for performing a resumable upload with the following considerations:

  • When uploading the file data itself, use a multiple chunk upload.

  • Since you don't know the total file size until you get to the final chunk, use a * for the total file size in the Content-Range header of intermediate chunks.

    For example, if the first chunk you upload has a size of 512 KiB, the Content-Range header for the chunk is bytes 0-524287/*. If your upload has 64000 bytes remaining after the first chunk, you then send a final chunk that contains the remaining bytes and has a Content-Range header with the value bytes 524288-588287/588288.
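The sketch below illustrates this pattern in Python with the requests library. It assumes you already have a resumable session URI from the initial POST request (the session URI itself authorizes the chunk uploads), that the data arrives as an iterable of byte chunks with at least one chunk, and that every chunk except the last is a multiple of 256 KiB; the function name is illustrative.

import requests


def stream_resumable_chunks(session_uri, chunks):
    """Uploads chunks of unknown total size to a resumable session URI."""
    offset = 0
    pending = None
    for chunk in chunks:
        if pending is not None:
            # Intermediate chunk: the total size is still unknown, so use "*".
            # Cloud Storage answers 308 (Resume Incomplete) for these requests.
            end = offset + len(pending) - 1
            requests.put(
                session_uri,
                data=pending,
                headers={"Content-Range": f"bytes {offset}-{end}/*"},
            )
            offset = end + 1
        pending = chunk

    # Final chunk: the total object size is now known, so declare it.
    total = offset + len(pending)
    requests.put(
        session_uri,
        data=pending,
        headers={"Content-Range": f"bytes {offset}-{total - 1}/{total}"},
    )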

XML API

To perform a streaming upload, use one of the following methods:

  • An XML API multipart upload (a rough sketch appears after this list)

  • A resumable upload, with the following adjustments:

    • When uploading the file data itself, use a multiple chunk upload.

    • Since you don't know the total file size until you get to the final chunk, use a * for the total file size in the Content-Range header of intermediate chunks.

      For example, if the first chunk you upload has a size of 512 KiB, the Content-Range header for the chunk is bytes 0-524287/*. If your upload has 64000 bytes remaining after the first chunk, you then send a final chunk that contains the remaining bytes and has a Content-Range header with the value bytes 524288-588287/588288.
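As a rough sketch of the multipart alternative, the Python code below initiates an XML API multipart upload, streams each part as it is produced, and then completes the upload. It assumes OAuth 2.0 bearer authentication, an iterable of byte parts in which every part except the last is at least 5 MiB, and illustrative function and variable names.

import requests
import xml.etree.ElementTree as ET


def xml_multipart_stream(bucket_name, object_name, parts, access_token):
    """Streams an iterable of byte parts as an XML API multipart upload."""
    base = f"https://storage.googleapis.com/{bucket_name}/{object_name}"
    headers = {"Authorization": f"Bearer {access_token}"}

    # 1. Initiate the multipart upload and extract the UploadId.
    init = requests.post(base + "?uploads", headers=headers)
    upload_id = next(
        el.text for el in ET.fromstring(init.text).iter()
        if el.tag.endswith("UploadId"))

    # 2. Upload each part as it becomes available; record the returned ETags.
    etags = []
    for number, part in enumerate(parts, start=1):
        resp = requests.put(
            f"{base}?partNumber={number}&uploadId={upload_id}",
            data=part,
            headers=headers,
        )
        etags.append((number, resp.headers["ETag"]))

    # 3. Complete the upload; Cloud Storage assembles the parts into one object.
    body = "".join(
        f"<Part><PartNumber>{n}</PartNumber><ETag>{e}</ETag></Part>"
        for n, e in etags)
    requests.post(
        f"{base}?uploadId={upload_id}",
        data=f"<CompleteMultipartUpload>{body}</CompleteMultipartUpload>",
        headers=headers,
    )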

What's next