Google Cloud Storage

Google Cloud Storage is an Internet service to store data in Google's cloud. It allows world-wide storage and retrieval of any amount of data and at any time, taking advantage of Google's own reliable and fast networking infrastructure to perform data operations in a cost effective manner.

The goal of google-cloud is to provide an API that is comfortable to Rubyists. Your authentication credentials are detected automatically in Google Cloud Platform (GCP), including Google Compute Engine (GCE), Google Kubernetes Engine (GKE), Google App Engine (GAE), Google Cloud Functions (GCF) and Cloud Run. In other environments you can configure authentication easily, either directly in your code or via environment variables. Read more about the options for connecting in the Authentication Guide.

require "google/cloud/storage"

storage = Google::Cloud::Storage.new(
  project_id: "my-project",
  credentials: "/path/to/keyfile.json"
)

bucket = storage.bucket "my-bucket"
file = bucket.file "path/to/my-file.ext"

To learn more about Cloud Storage, read the Google Cloud Storage Overview .

Retrieving Buckets

A Bucket instance is a container for your data. There is no limit on the number of buckets that you can create in a project. You can use buckets to organize and control access to your data. For more information, see Working with Buckets.

Each bucket has a globally unique name, which is how they are retrieved: (See Project#bucket)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"

You can also retrieve all buckets on a project: (See Project#buckets)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

all_buckets = storage.buckets

If you have a significant number of buckets, you may need to fetch them in multiple service requests.

Iterating over each bucket, potentially with multiple API calls, by invoking Bucket::List#all with a block:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

buckets = storage.buckets
buckets.all do |bucket|
  puts bucket.name
end

Limiting the number of API calls made:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

buckets = storage.buckets
buckets.all(request_limit: 10) do |bucket|
  puts bucket.name
end

See Bucket::List for details.

Creating a Bucket

A unique name is all that is needed to create a new bucket: (See Project#create_bucket)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.create_bucket "my-todo-app-attachments"

Retrieving Files

A File instance is an individual data object that you store in Google Cloud Storage. Files contain the data stored as well as metadata describing the data. Files belong to a bucket and cannot be shared among buckets. There is no limit on the number of files that you can create in a bucket. For more information, see Working with Objects.

Files are retrieved by their name, which is the path of the file in the bucket: (See Bucket#file)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
file = bucket.file "avatars/heidi/400x400.png"

You can also retrieve all files in a bucket: (See Bucket#files)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
all_files = bucket.files

Or you can retrieve all files in a specified path:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
avatar_files = bucket.files prefix: "avatars/"

If you have a significant number of files, you may need to fetch them in multiple service requests.

Iterating over each file, potentially with multiple API calls, by invoking File::List#all with a block:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new
bucket = storage.bucket "my-todo-app"

files = storage.files
files.all do |file|
  puts file.name
end

Limiting the number of API calls made:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

files = storage.files
files.all(request_limit: 10) do |file|
  puts bucket.name
end

See File::List for details.

Creating a File

A new file can be uploaded by specifying the location of a file on the local file system, and the name/path that the file should be stored in the bucket. (See Bucket#create_file)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
bucket.create_file "/var/todo-app/avatars/heidi/400x400.png",
                   "avatars/heidi/400x400.png"

Files can also be created from an in-memory StringIO object:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
bucket.create_file StringIO.new("Hello world!"), "hello-world.txt"

Customer-supplied encryption keys

By default, Google Cloud Storage manages server-side encryption keys on your behalf. However, a customer-supplied encryption key can be provided with the encryption_key option. If given, the same key must be provided to subsequently download or copy the file. If you use customer-supplied encryption keys, you must securely manage your keys and ensure that they are not lost. Also, please note that file metadata is not encrypted, with the exception of the CRC32C checksum and MD5 hash. The names of files and buckets are also not encrypted, and you can read or update the metadata of an encrypted file without providing the encryption key.

require "google/cloud/storage"

storage = Google::Cloud::Storage.new
bucket = storage.bucket "my-todo-app"

# Key generation shown for example purposes only. Write your own.
cipher = OpenSSL::Cipher.new "aes-256-cfb"
cipher.encrypt
key = cipher.random_key

bucket.create_file "/var/todo-app/avatars/heidi/400x400.png",
                   "avatars/heidi/400x400.png",
                   encryption_key: key

# Store your key and hash securely for later use.
file = bucket.file "avatars/heidi/400x400.png",
                   encryption_key: key

Use File#rotate to rotate customer-supplied encryption keys.

require "google/cloud/storage"

storage = Google::Cloud::Storage.new
bucket = storage.bucket "my-todo-app"

# Old key was stored securely for later use.
old_key = "y\x03\"\x0E\xB6\xD3\x9B\x0E\xAB*\x19\xFAv\xDEY\xBEI..."

file = bucket.file "path/to/my-file.ext", encryption_key: old_key

# Key generation shown for example purposes only. Write your own.
cipher = OpenSSL::Cipher.new "aes-256-cfb"
cipher.encrypt
new_key = cipher.random_key

file.rotate encryption_key: old_key, new_encryption_key: new_key

Downloading a File

Files can be downloaded to the local file system. (See File#download)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
file = bucket.file "avatars/heidi/400x400.png"
file.download "/var/todo-app/avatars/heidi/400x400.png"

Files can also be downloaded to an in-memory StringIO object:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
file = bucket.file "hello-world.txt"

downloaded = file.download
downloaded.rewind
downloaded.read #=> "Hello world!"

Download a public file with an anonymous, unauthenticated client. Use skip_lookup to avoid errors retrieving non-public bucket and file metadata.

require "google/cloud/storage"

storage = Google::Cloud::Storage.anonymous

bucket = storage.bucket "public-bucket", skip_lookup: true
file = bucket.file "path/to/public-file.ext", skip_lookup: true

downloaded = file.download
downloaded.rewind
downloaded.read #=> "Hello world!"

Creating and downloading gzip-encoded files

When uploading a gzip-compressed file, you should pass content_encoding: "gzip" if you want the file to be eligible for decompressive transcoding when it is later downloaded. In addition, giving the gzip-compressed file a name containing the original file extension (for example, .txt) will ensure that the file's Content-Type metadata is set correctly. (You can also set the file's Content-Type metadata explicitly with the content_type option.)

require "zlib"
require "google/cloud/storage"

storage = Google::Cloud::Storage.new

gz = StringIO.new ""
z = Zlib::GzipWriter.new gz
z.write "Hello world!"
z.close
data = StringIO.new gz.string

bucket = storage.bucket "my-bucket"

bucket.create_file data, "path/to/gzipped.txt",
                   content_encoding: "gzip"

file = bucket.file "path/to/gzipped.txt"

# The downloaded data is decompressed by default.
file.download "path/to/downloaded/hello.txt"

# The downloaded data remains compressed with skip_decompress.
file.download "path/to/downloaded/gzipped.txt",
              skip_decompress: true

Using Signed URLs

Access without authentication can be granted to a file for a specified period of time. This URL uses a cryptographic signature of your credentials to access the file. (See File#signed_url)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
file = bucket.file "avatars/heidi/400x400.png"
shared_url = file.signed_url method: "GET",
                             expires: 300 # 5 minutes from now

Controlling Access to a Bucket

Access to a bucket is controlled with Bucket#acl. A bucket has owners, writers, and readers. Permissions can be granted to an individual user's email address, a group's email address, as well as many predefined lists. See the Access Control guide for more.

Access to a bucket can be granted to a user by appending "user-" to the email address:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"

email = "heidi@example.net"
bucket.acl.add_reader "user-#{email}"

Access to a bucket can be granted to a group by appending "group-" to the email address:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"

email = "authors@example.net"
bucket.acl.add_reader "group-#{email}"

Access to a bucket can also be granted to a predefined list of permissions:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"

bucket.acl.public!

Controlling Access to a File

Access to a file is controlled in two ways, either by the setting the default permissions to all files in a bucket with Bucket#default_acl, or by setting permissions to an individual file with File#acl.

Access to a file can be granted to a user by appending "user-" to the email address:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
file = bucket.file "avatars/heidi/400x400.png"

email = "heidi@example.net"
file.acl.add_reader "user-#{email}"

Access to a file can be granted to a group by appending "group-" to the email address:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
file = bucket.file "avatars/heidi/400x400.png"

email = "authors@example.net"
file.acl.add_reader "group-#{email}"

Access to a file can also be granted to a predefined list of permissions:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-todo-app"
file = bucket.file "avatars/heidi/400x400.png"

file.acl.public!

Assigning payment to the requester

The requester pays feature enables the owner of a bucket to indicate that a client accessing the bucket or a file it contains must assume the transit costs related to the access.

Assign transit costs for bucket and file operations to requesting clients with the requester_pays flag:

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "my-bucket"

bucket.requester_pays = true # API call
# Clients must now provide `user_project` option when calling
# Project#bucket to access this bucket.

Once the requester_pays flag is enabled for a bucket, a client attempting to access the bucket and its files must provide the user_project option to Project#bucket. If the argument given is true, transit costs for operations on the requested bucket or a file it contains will be billed to the current project for the client. (See Project#project for the ID of the current project.)

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "other-project-bucket", user_project: true

files = bucket.files # Billed to current project

If the argument is a project ID string, and the indicated project is authorized for the currently authenticated service account, transit costs will be billed to the indicated project.

require "google/cloud/storage"

storage = Google::Cloud::Storage.new

bucket = storage.bucket "other-project-bucket",
                        user_project: "my-other-project"
files = bucket.files # Billed to "my-other-project"

Configuring Pub/Sub notification subscriptions

You can configure notifications to send Google Cloud Pub/Sub messages about changes to files in your buckets. For example, you can track files that are created and deleted in your bucket. Each notification contains information describing both the event that triggered it and the file that changed.

You can send notifications to any Cloud Pub/Sub topic in any project for which your service account has sufficient permissions. As shown below, you need to explicitly grant permission to your service account to enable Google Cloud Storage to publish on behalf of your account. (Even if your current project created and owns the topic.)

require "google/cloud/pubsub"
require "google/cloud/storage"

pubsub = Google::Cloud::Pubsub.new
storage = Google::Cloud::Storage.new

topic = pubsub.create_topic "my-topic"
topic.policy do |p|
  p.add "roles/pubsub.publisher",
        "serviceAccount:#{storage.service_account_email}"
end

bucket = storage.bucket "my-bucket"

notification = bucket.create_notification topic.name

Configuring retries and timeout

You can configure how many times API requests may be automatically retried. When an API request fails, the response will be inspected to see if the request meets criteria indicating that it may succeed on retry, such as 500 and 503 status codes or a specific internal error code such as rateLimitExceeded. If it meets the criteria, the request will be retried after a delay. If another error occurs, the delay will be increased before a subsequent attempt, until the retries limit is reached.

You can also set the request timeout value in seconds.

require "google/cloud/storage"

storage = Google::Cloud::Storage.new retries: 10, timeout: 120

The library by default retries all API requests which are always idempotent on a "transient" error.

For API requests which are idempotent only if the some conditions are satisfied (For ex. a file has the same "generation"), the library retries only if the condition is specified.

Rather than using this default behaviour, you may choose to disable the retries on your own.

You can pass retries as 0 to disable retries for all operations regardless of their idempotencies.

require "google/cloud/storage"

storage = Google::Cloud::Storage.new retries: 0

You can also disable retries for a particular operation by passing retries as 0 in the options field.

require "google/cloud/storage"

storage = Google::Cloud::Storage.new
service = storage.service
service.get_bucket bucket_name, options: {retries: 0}

See the Storage status and error codes for a list of error conditions.

Additional information

Google Cloud Storage can be configured to use logging. To learn more, see the Logging guide.