The Cancer Genome Atlas data

The Institute for Systems Biology Cancer Genomics Cloud (ISB-CGC) provides access to datasets hosted on Google Cloud Platform (GCP). These datasets are derived from The Cancer Genome Atlas (TCGA) project.

The ISB-CGC datasets cover 33 tumor types and include the following kinds of data:

  • Somatic mutation calls
  • Clinical data
  • mRNA and miRNA expression
  • DNA methylation
  • Protein expression

The ISB-CGC also provides GitHub repositories with sample queries and analyses that you can try out using R, Python, and Cloud Datalab.

Dataset access

BigQuery datasets

You can access the ISB-CGC datasets in BigQuery for data exploration and querying.
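
As a rough illustration of what querying one of these tables from Python might look like, the sketch below builds a simple aggregate query and, optionally, runs it with the `google-cloud-bigquery` client library. The table ID and the `disease_code` column are placeholders assumed for illustration; they are not taken from this page, so substitute the table and column names shown in the BigQuery console. The helper names `build_count_query` and `run_query` are likewise hypothetical.

```python
# Placeholder table ID: an assumption for illustration, not taken from this page.
# Replace it with an ISB-CGC table you have confirmed in the BigQuery console.
CLINICAL_TABLE = "your-project.your_dataset.clinical"  # hypothetical


def build_count_query(table_id: str, group_column: str, limit: int = 10) -> str:
    """Return a Standard SQL query that counts rows per value of group_column."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Backticks let BigQuery parse fully qualified IDs that contain hyphens.
    return (
        f"SELECT {group_column}, COUNT(*) AS n\n"
        f"FROM `{table_id}`\n"
        f"GROUP BY {group_column}\n"
        f"ORDER BY n DESC\n"
        f"LIMIT {limit}"
    )


def run_query(sql: str):
    """Run sql in BigQuery and return the result rows as a list."""
    # Imported here so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses Application Default Credentials
    return list(client.query(sql).result())


if __name__ == "__main__":
    # "disease_code" is an assumed column name; check the table schema first.
    sql = build_count_query(CLINICAL_TABLE, "disease_code")
    print(sql)
    # Uncomment once credentials and a real table ID are configured:
    # for row in run_query(sql):
    #     print(row["disease_code"], row["n"])
```

Keeping the query construction separate from execution makes it possible to inspect the SQL before any bytes are billed to your project.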

About the data

Use: This dataset is publicly available for anyone to use under the terms provided by the dataset source (https://cancergenome.nih.gov/) and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
