Cloud Life Sciences public datasets

Cloud Life Sciences provides a variety of public datasets that you can access for free and integrate into your applications. Google hosts these datasets and provides public access to the data through the following methods:

  • Interactive access is available in the BigQuery console. You can explore variant calls for case/control and cohort analyses, and sample queries are available to help you get started. For information on how to get started with public datasets in BigQuery, see BigQuery public datasets.

  • File access is available from Cloud Storage. Files are available in BAM, VCF, and FASTA formats. Copy the files you need to your local disk or to a Compute Engine VM so that you can use them with your preferred bioinformatics tools. For information on how to get started with Cloud Storage, see How to use public datasets on Cloud Storage.
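The file-access path above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: it assumes the object is publicly readable over the standard `https://storage.googleapis.com/BUCKET/OBJECT` URL form, and the helper names (`public_object_url`, `download`, `parse_fasta`) are hypothetical names introduced only for this example. Bucket and object names must be replaced with those of the dataset you actually want.

```python
"""Sketch: copy a public FASTA file from Cloud Storage and read its records."""
import urllib.parse
import urllib.request


def public_object_url(bucket, object_name):
    """Build the HTTPS URL for a publicly readable Cloud Storage object."""
    # Percent-encode the object name, keeping "/" so nested paths stay intact.
    encoded = urllib.parse.quote(object_name, safe="/")
    return "https://storage.googleapis.com/{}/{}".format(bucket, encoded)


def download(bucket, object_name, destination):
    """Copy one public object to a local path (requires network access)."""
    # Assumption: the object allows anonymous reads; otherwise this raises
    # urllib.error.HTTPError.
    urllib.request.urlretrieve(public_object_url(bucket, object_name), destination)


def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

To use it, call `download` with the bucket and object name of a FASTA file from the dataset, open the local copy in text mode, and iterate over `parse_fasta(handle)`. BAM and VCF files are better handled by dedicated bioinformatics tools than by hand-written parsers like this one.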

For public data hosted by the community on Google, each data provider determines the modes of access they support.

Cloud Life Sciences genomic public datasets

Cloud Life Sciences annotation public datasets

List your public dataset on Cloud Storage

If you have questions about listing a public dataset on Cloud Storage, contact us at gcp-public-data@google.com.

List your public dataset in BigQuery

If you have questions about listing a public dataset in BigQuery, contact us at bq-public-data@google.com.