BigQuery public datasets

The Cloud Public Datasets Program catalog is in GCP Marketplace. You can find more details about each individual dataset by viewing the Marketplace pages in the Datasets section.

Go to Datasets in the GCP Marketplace

A public dataset is any dataset that is stored in BigQuery and made available to the general public through the Google Cloud Public Dataset Program. Google pays for the storage of these datasets and provides public access to the data via a project, so you can query them and integrate them into your applications. You pay only for the queries that you perform on the data (the first 1 TB per month is free, subject to query pricing details).
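As a rough illustration of the billing rule above, the following sketch computes how much of a month's query volume falls outside the free allowance. The function names are hypothetical, the price per terabyte is not stated on this page and must be supplied by the caller, and treating 1 TB as 2**40 bytes is an assumption that you should check against the query pricing details.

```python
# Assumption: 1 TB is modeled as 2**40 bytes. Adjust if the pricing page
# defines the unit differently.
BYTES_PER_TB = 2**40
FREE_BYTES_PER_MONTH = 1 * BYTES_PER_TB  # "the first 1 TB per month is free"


def billable_bytes(bytes_processed_this_month: int) -> int:
    """Return the bytes processed beyond the monthly free allowance."""
    if bytes_processed_this_month < 0:
        raise ValueError("bytes processed cannot be negative")
    return max(0, bytes_processed_this_month - FREE_BYTES_PER_MONTH)


def estimated_query_cost(bytes_processed_this_month: int,
                         price_per_tb: float) -> float:
    """Estimate the month's query charge for a caller-supplied price per TB."""
    return billable_bytes(bytes_processed_this_month) / BYTES_PER_TB * price_per_tb
```

Storage does not appear in the calculation because Google pays for storing the public datasets; only bytes processed by your queries count.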

Before you begin

Public datasets are available for you to analyze using either legacy SQL or standard SQL queries. You can access BigQuery public datasets by using the BigQuery web UI in the GCP Console, the classic BigQuery web UI, or the command-line tool, or by calling the BigQuery REST API through client libraries for languages such as Java, .NET, or Python.

To get started using a BigQuery public dataset, you must create or select a project. The first terabyte of data processed per month is free, so you can start querying public datasets without enabling billing. If you intend to go beyond the free tier, you must also enable billing.

  1. Sign in to your Google Account.

    If you don't already have one, you must create one.

  2. Select or create a Google Cloud Platform project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your Google Cloud Platform project.

    Learn how to enable billing

  4. BigQuery is automatically enabled in new projects. To activate BigQuery in a pre-existing project, enable the BigQuery API.

    Enable the API
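Once a project is set up, a first query from Python might look like the sketch below. It assumes the google-cloud-bigquery client library is installed and that application default credentials are configured; the function names are hypothetical, and the column names (corpus, word_count) are assumed from the schema of the shakespeare sample table.

```python
# Standard SQL reference to one of the sample tables named in this document.
SHAKESPEARE_TABLE = "bigquery-public-data.samples.shakespeare"


def build_top_corpora_query(limit: int = 5) -> str:
    """Build a standard SQL query listing the works with the most words."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return (
        "SELECT corpus, SUM(word_count) AS total_words\n"
        f"FROM `{SHAKESPEARE_TABLE}`\n"
        "GROUP BY corpus\n"
        "ORDER BY total_words DESC\n"
        f"LIMIT {limit}"
    )


def run_top_corpora_query(project_id: str, limit: int = 5):
    """Run the query in your own project, which is the project billed for it."""
    # Imported lazily so that building the SQL does not require the library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    # The sample tables are stored in the US multi-region location.
    job = client.query(build_top_corpora_query(limit), location="US")
    return [(row["corpus"], row["total_words"]) for row in job.result()]
```

The data is read from the bigquery-public-data project, but the job runs in the project you pass as project_id, so that project's free tier and billing account apply.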

Public dataset locations

Currently, the BigQuery sample tables are stored in the US multi-region location. When you query a sample table, supply the --location=US flag on the command line, choose US as the processing location in the GCP Console or the classic BigQuery web UI, or specify the location property in the jobReference section of the job resource when you use the API. Because the sample tables are stored in the US, you cannot write sample table query results to a table in another region, and you cannot join sample tables with tables in another region.
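For the API case, the location is set inside the jobReference section of the job resource. The following sketch builds such a request body as a plain dictionary; the helper name and the project ID are hypothetical, and the field layout reflects the job resource of the BigQuery REST API as understood at the time of writing, so verify it against the API reference before relying on it.

```python
import json


def build_query_job_body(project_id: str, sql: str, location: str = "US") -> dict:
    """Build a job resource for a query, with the location in jobReference."""
    return {
        "jobReference": {
            "projectId": project_id,  # project that runs and pays for the job
            "location": location,     # where the queried tables are stored
        },
        "configuration": {
            "query": {
                "query": sql,
                "useLegacySql": False,  # run the statement as standard SQL
            }
        },
    }


body = build_query_job_body(
    "my-project",  # hypothetical project ID
    "SELECT COUNT(*) FROM `bigquery-public-data.samples.shakespeare`",
)
print(json.dumps(body, indent=2))
```

On the command line, the --location=US flag plays the same role as the location field here.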

Accessing public datasets in the BigQuery web UI

There are two user interfaces that can be used to access the public datasets: the BigQuery web UI in the GCP Console and the classic BigQuery web UI.

The bigquery-public-data project is automatically pinned to every project in both UIs. You can find the project in the navigation pane.

To open the bigquery-public-data project manually, you can:

  • Enter the following URL in your browser to open the public datasets in the classic BigQuery web UI:
  • Enter the following URL to open the public datasets in the BigQuery web UI in the GCP Console:

To switch from the GCP Console to the classic web UI, see Switching to the classic web UI.

Other public datasets

Many other public datasets are available for you to query. Some are also hosted by Google, but many more are hosted by third parties. Other datasets include:

Sharing a dataset with the public

You can share any of your datasets with the public by changing the dataset's access controls to allow access by "All Authenticated Users". For more information about setting dataset access controls, see Controlling access to datasets.

When you share a dataset with the public:

  • Storage charges are incurred by the billing account attached to the project that contains the publicly shared dataset.
  • Query charges are incurred by the billing account attached to the project where the query jobs are run.

For more information, see How charges are billed.
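To make the access-control change concrete, here is a minimal sketch that adds a public read entry to a dataset's access list. It assumes the list uses the REST API's dictionary form, in which "All Authenticated Users" is expressed as the special group allAuthenticatedUsers with the READER role; the helper name and the example owner entry are hypothetical.

```python
# Assumed REST-style access entry for "All Authenticated Users" with read access.
PUBLIC_READ_ENTRY = {"role": "READER", "specialGroup": "allAuthenticatedUsers"}


def with_public_read_access(access_entries: list) -> list:
    """Return a copy of the access list that also grants public read access."""
    updated = [dict(entry) for entry in access_entries]
    # Adding the entry only once keeps the operation safe to repeat.
    if PUBLIC_READ_ENTRY not in updated:
        updated.append(dict(PUBLIC_READ_ENTRY))
    return updated


current = [{"role": "OWNER", "userByEmail": "owner@example.com"}]  # hypothetical
shared = with_public_read_access(current)
```

The function returns a new list rather than modifying its input, so you can compare the two before writing the change back to the dataset.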

Sample tables

In addition to the public datasets, BigQuery provides a limited number of sample tables that you can query. These tables are contained in the bigquery-public-data:samples dataset.

The requirements for querying the BigQuery sample tables are the same as the requirements for querying the public datasets.

The bigquery-public-data:samples dataset includes the following tables:

Name             Description
gsod             Contains weather information collected by NOAA, such as precipitation amounts and wind speeds from late 1929 to early 2010.
github_nested    Contains a timeline of actions such as pull requests and comments on GitHub repositories with a nested schema. Created in September 2012.
github_timeline  Contains a timeline of actions such as pull requests and comments on GitHub repositories with a flat schema. Created in May 2012.
natality         Describes all United States births registered in the 50 States, the District of Columbia, and New York City from 1969 to 2008.
shakespeare      Contains a word index of the works of Shakespeare, giving the number of times each word appears in each corpus.
trigrams         Contains English language trigrams from a sample of works published between 1520 and 2008.
wikipedia        Contains the complete revision history for all Wikipedia articles up to April 2010.
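Because the sample tables can be queried with either legacy SQL or standard SQL, it helps to know how the table reference is written in each dialect. The sketch below formats both forms for the tables listed above; the helper names are hypothetical.

```python
PROJECT = "bigquery-public-data"
DATASET = "samples"
SAMPLE_TABLES = [
    "gsod", "github_nested", "github_timeline", "natality",
    "shakespeare", "trigrams", "wikipedia",
]


def legacy_table_ref(project: str, dataset: str, table: str) -> str:
    """Format a legacy SQL reference: [project:dataset.table]."""
    return f"[{project}:{dataset}.{table}]"


def standard_table_ref(project: str, dataset: str, table: str) -> str:
    """Format a standard SQL reference: `project.dataset.table`."""
    return f"`{project}.{dataset}.{table}`"


for name in SAMPLE_TABLES:
    print(legacy_table_ref(PROJECT, DATASET, name),
          standard_table_ref(PROJECT, DATASET, name))
```

The backticks in the standard SQL form are needed here because the project name contains hyphens.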

Contact us

If you have any questions about the BigQuery public dataset program, contact us at
