Cloud Genomics provides a variety of public datasets that you can access for free and integrate into your applications. Google hosts these datasets, providing public access to the data via the following methods:
File access is available from Cloud Storage. Files are available in BAM, VCF, and FASTA formats. Copy the files you need to local disk or a Compute Engine VM for access from your favorite bioinformatics tools.
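As a rough illustration of the copy step, the sketch below builds the public HTTPS URL for a Cloud Storage object and streams it to local disk using only the Python standard library. It assumes the object is publicly readable without credentials. The bucket name `example-public-bucket`, the object paths, the suffix-to-format table, and all function names are hypothetical placeholders, not names taken from any dataset listed here.

```python
import shutil
import urllib.parse
import urllib.request
from pathlib import Path

# File formats named in the text, keyed by common suffixes.
# The suffix list itself is an assumption made for this sketch.
FORMAT_BY_SUFFIX = {
    ".bam": "BAM",
    ".vcf": "VCF",
    ".vcf.gz": "VCF",
    ".fa": "FASTA",
    ".fasta": "FASTA",
    ".fa.gz": "FASTA",
}


def public_object_url(bucket, object_name):
    """Return the public HTTPS URL for an object in a Cloud Storage bucket."""
    # Percent-encode the object name but keep "/" so nested paths stay readable.
    quoted = urllib.parse.quote(object_name, safe="/")
    return f"https://storage.googleapis.com/{bucket}/{quoted}"


def detect_format(object_name):
    """Guess BAM, VCF, or FASTA from the file suffix; return None if unknown."""
    lower = object_name.lower()
    for suffix, fmt in FORMAT_BY_SUFFIX.items():
        if lower.endswith(suffix):
            return fmt
    return None


def download_public_object(bucket, object_name, dest_dir="."):
    """Stream a publicly readable object to dest_dir and return the local path."""
    dest = Path(dest_dir) / Path(object_name).name
    url = public_object_url(bucket, object_name)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


if __name__ == "__main__":
    # Hypothetical bucket and object; substitute a real public dataset path.
    print(public_object_url("example-public-bucket", "samples/NA00000.bam"))
    print(detect_format("samples/NA00000.vcf.gz"))
```

Calling `download_public_object` with a real bucket and object name would write the file into the chosen directory, where local bioinformatics tools can read it. For large BAM files, a dedicated transfer tool may be a better fit than a single HTTP stream.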
For public data hosted by the community on Google, each data provider determines the modes of access they support.
Cloud Genomics genomic public datasets
- 1000 Genomes
- Illumina Platinum Genomes
- Reference Genomes
- Simons Genome Diversity Project
- The Cancer Genome Atlas (TCGA)
Cloud Genomics annotation public datasets
List your public dataset on Cloud Storage
If you have questions about listing a public dataset on Cloud Storage, contact us at firstname.lastname@example.org.
List your public dataset on BigQuery
If you have questions about listing a public dataset on BigQuery, contact us at email@example.com.