Accessing public data

Which method you choose to access public data depends on how you want to work with the data. When accessing public data via the Google Cloud Console, you must authenticate with Google. You can authenticate with any Google account; the account does not have to be associated with the project that contains the public data, nor does it need to be signed up for the Cloud Storage service.

By contrast, accessing public data with gsutil or a Cloud Storage API link does not require authentication. These methods are suited for general-purpose links to publicly shared data. For example, an API link can be used in a web page, with client libraries, or with a command-line tool such as cURL.

To access public data:

API link

Accessing an API link does not require authentication. An API link is suitable, for example, as a link in a web page, or for downloading with a command-line tool such as cURL.

  1. Get the name of the bucket containing the public data.

  2. Use the following URI to access an object in the bucket:

    https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME

For example, the Google public bucket gcp-public-data-landsat contains the Landsat public dataset. You can link to the publicly shared object LC08/PRE/063/046/LC80630462016136LGN00/LC80630462016136LGN00_B11.TIF with the link:

https://storage.googleapis.com/gcp-public-data-landsat/LC08/PRE/063/046/LC80630462016136LGN00/LC80630462016136LGN00_B11.TIF
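
As a rough illustration of the URI pattern above, the following Python sketch builds an API link from a bucket name and an object name and downloads the object without credentials. The helper names public_object_url and download_public_object are hypothetical, and percent-encoding the object name while keeping "/" unescaped is an assumption that matches the example link shown here.

```python
import shutil
from urllib.parse import quote
from urllib.request import urlopen

# Endpoint taken from the URI pattern in this section.
PUBLIC_ENDPOINT = "https://storage.googleapis.com"


def public_object_url(bucket_name, object_name):
    """Builds the unauthenticated API link for a public object (hypothetical helper)."""
    # Assumption: percent-encode unusual characters but keep "/" as-is,
    # so path-like object names look the same as in the example link.
    return "{}/{}/{}".format(PUBLIC_ENDPOINT, bucket_name, quote(object_name, safe="/"))


def download_public_object(bucket_name, object_name, destination_file_name):
    """Streams a public object to a local file without any credentials."""
    url = public_object_url(bucket_name, object_name)
    with urlopen(url) as response, open(destination_file_name, "wb") as out:
        # Copy in chunks so large objects are not held in memory.
        shutil.copyfileobj(response, out)
```

Calling public_object_url with the bucket and object from the Landsat example should reproduce the link shown above; download_public_object only succeeds if the object is actually public and reachable.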

Console

Accessing public data through the Cloud Console requires authentication with Google. To link to individual objects in a public bucket, you should generally use the method described in the API link tab instead. You can access public objects through the Cloud Console only if you have the storage.objects.list permission for the bucket that contains the objects.

  1. Get the name of the public bucket.

  2. Using a web browser, access the bucket with the following URI (you will be asked to sign in if necessary):

    https://console.cloud.google.com/storage/browser/BUCKET_NAME

For example, the Google public bucket gcp-public-data-landsat contains the Landsat public dataset. You can access the bucket with:

https://console.cloud.google.com/storage/browser/gcp-public-data-landsat

gsutil

  1. If you don't have gsutil, install it by following the gsutil installation instructions.

  2. Get the name of the bucket containing the public data.

  3. If the bucket itself is public, rather than only some of the objects in it, you can list some or all of the objects it contains by using the ls command.

    For example, the Google public bucket gcp-public-data-landsat contains the Landsat public dataset. You can list files with the prefix LC08/PRE/063/046/LC80630462016 with the command:

    gsutil ls -r gs://gcp-public-data-landsat/LC08/PRE/063/046/LC80630462016*

  4. Get specific public objects contained in the bucket by using the cp command.

    For example, the following command downloads a file from the bucket gcp-public-data-landsat to your current working directory:

    gsutil cp gs://gcp-public-data-landsat/LC08/PRE/063/046/LC80630462016136LGN00/LC80630462016136LGN00_B11.TIF .
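
If you want to script the ls and cp steps above, one possible approach is to call gsutil from Python. The sketch below is only an illustration: the helper names gsutil_ls_args, gsutil_cp_args, and run_gsutil are hypothetical, and it assumes gsutil is installed and on your PATH. The commands it builds mirror the two examples in this tab.

```python
import subprocess


def gsutil_ls_args(bucket_name, prefix=""):
    """Builds the argument list for recursively listing objects under a prefix."""
    # The trailing "*" is passed to gsutil unchanged; no shell is involved,
    # so gsutil itself expands the wildcard.
    return ["gsutil", "ls", "-r", "gs://{}/{}*".format(bucket_name, prefix)]


def gsutil_cp_args(bucket_name, object_name, destination="."):
    """Builds the argument list for copying one object to a local destination."""
    return ["gsutil", "cp", "gs://{}/{}".format(bucket_name, object_name), destination]


def run_gsutil(args):
    """Runs a gsutil command and returns its standard output.

    Raises subprocess.CalledProcessError if gsutil exits with an error.
    """
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, run_gsutil(gsutil_ls_args("gcp-public-data-landsat", "LC08/PRE/063/046/LC80630462016")) runs the same listing as the ls command shown above.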

Code samples

C++

For more information, see the Cloud Storage C++ API reference documentation.

#include "google/cloud/storage/client.h"
#include <iostream>
#include <string>

namespace gcs = google::cloud::storage;

// Counts the lines in a public object, reading it without credentials.
// The function name is chosen here for illustration.
void ReadObjectUnauthenticated(std::string const& bucket_name,
                               std::string const& object_name) {
  // Create a client that does not authenticate with the server.
  gcs::Client client{gcs::oauth2::CreateAnonymousCredentials()};

  // Read an object; the object must have been made public.
  gcs::ObjectReadStream stream = client.ReadObject(bucket_name, object_name);

  int count = 0;
  std::string line;
  while (std::getline(stream, line, '\n')) {
    ++count;
  }
  std::cout << "The object has " << count << " lines\n";
}

Java

For more information, see the Cloud Storage Java API reference documentation.

import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.nio.file.Path;

public class DownloadPublicObject {
  public static void downloadPublicObject(
      String bucketName, String publicObjectName, Path destFilePath) {
    // The name of the bucket to access
    // String bucketName = "my-bucket";

    // The name of the remote public file to download
    // String publicObjectName = "publicfile.txt";

    // The path to which the file should be downloaded
    // Path destFilePath = Paths.get("/local/path/to/file.txt");

    // Instantiate an anonymous Google Cloud Storage client, which can only access public files
    Storage storage = StorageOptions.getUnauthenticatedInstance().getService();

    Blob blob = storage.get(BlobId.of(bucketName, publicObjectName));
    blob.downloadTo(destFilePath);

    System.out.println(
        "Downloaded public object "
            + publicObjectName
            + " from bucket name "
            + bucketName
            + " to "
            + destFilePath);
  }
}

Python

For more information, see the Cloud Storage Python API reference documentation.

from google.cloud import storage


def download_public_file(bucket_name, source_blob_name, destination_file_name):
    """Downloads a public blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"

    storage_client = storage.Client.create_anonymous_client()

    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)

    print(
        "Downloaded public blob {} from bucket {} to {}.".format(
            source_blob_name, bucket.name, destination_file_name
        )
    )

What's next