The Cancer Imaging Archive (TCIA) hosts collections of de-identified medical images, primarily in DICOM format. Collections are organized according to disease (such as lung cancer), image modality (such as MRI or CT), or research focus.
The Cloud Healthcare API provides access to these datasets via Google Cloud Platform (GCP), as described in GCP data access.
License and attribution
The TCIA public access datasets are available under the Creative Commons Attribution 3.0 Unported License. Most collections are "freely available to browse, download, and use for commercial, scientific and educational purposes." For details, see the TCIA Data Usage Policies and Restrictions.
For each collection you use, cite both the TCIA in general and the specific sources for the collection.
Cite the following general TCIA publication:
Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, Volume 26, Number 6, December 2013, pp. 1045-1057.
Each TCIA collection has specific citation requirements. These may be data citations, publication citations, or both. Some collections also require attribution for additional data sources.
GCP data access
You can access the TCIA datasets through Cloud Storage, BigQuery, or the Cloud Healthcare API.
Cloud Storage
Each TCIA dataset is available in a Cloud Storage bucket within
the Google Cloud Platform project named
Dataset bucket names are in the following format:
To find the DATASET_ID, refer to the TCIA
section. The last portion of the attribution page URL (immediately preceding .html) corresponds to the dataset ID. For example, for the TCGA-BRCA citations page, the dataset ID is tcga-brca. The corresponding Cloud Storage bucket is:
Within each bucket, the data is organized as follows:
Each Cloud Storage bucket uses the "Requester Pays" model for billing. Your GCP project will be billed for the charges associated with accessing the NIH data. For more information, see Requester Pays.
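The Cloud Storage steps above can be sketched in Python. This is a minimal sketch rather than an official sample. It assumes that the google-cloud-storage client library is installed and that you have credentials for a project you can bill. The helper names dataset_id_from_url and list_requester_pays_objects are hypothetical, and the bucket name is supplied by the caller because it must follow the bucket name format described above.

```python
import posixpath
from urllib.parse import urlparse


def dataset_id_from_url(attribution_url):
    """Return the last URL path segment with its .html suffix removed."""
    filename = posixpath.basename(urlparse(attribution_url).path)
    stem, ext = posixpath.splitext(filename)
    if ext != ".html" or not stem:
        raise ValueError(f"expected a URL ending in <id>.html: {attribution_url!r}")
    return stem


def list_requester_pays_objects(bucket_name, billing_project, max_results=10):
    """List a few object names from a Requester Pays bucket.

    billing_project is your own GCP project; it is billed for the request.
    """
    # Imported lazily so the URL helper works without the client library.
    from google.cloud import storage

    client = storage.Client(project=billing_project)
    # user_project identifies the project that pays for access.
    bucket = client.bucket(bucket_name, user_project=billing_project)
    blobs = client.list_blobs(bucket, max_results=max_results)
    return [blob.name for blob in blobs]
```

For example, dataset_id_from_url("https://example.com/attribution/tcga-brca.html") returns "tcga-brca". The example.com URL is a placeholder for illustration only, not a real attribution page.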
BigQuery
Each TCIA dataset is available in BigQuery in the chc-tcia Google Cloud Platform project.
For information about accessing public data in BigQuery, see BigQuery public datasets.
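As a minimal sketch of exploring the BigQuery data, the following Python code lists the datasets in the chc-tcia project. It assumes that the google-cloud-bigquery client library is installed, that you have credentials for a project of your own, and that the public project permits dataset listing. The helper names qualified_table_id and list_public_datasets are hypothetical, and the dataset and table names you pass in are your own lookups, not values taken from this page.

```python
PUBLIC_PROJECT = "chc-tcia"  # project named in the text above


def qualified_table_id(dataset_id, table_id, project=PUBLIC_PROJECT):
    """Build a fully qualified BigQuery table ID: project.dataset.table."""
    return f"{project}.{dataset_id}.{table_id}"


def list_public_datasets(billing_project):
    """Return the dataset IDs visible in the public project.

    billing_project is your own GCP project, used to run the request.
    """
    # Imported lazily so the ID helper works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=billing_project)
    return [ds.dataset_id for ds in client.list_datasets(project=PUBLIC_PROJECT)]
```

The fully qualified ID returned by qualified_table_id can be used in the FROM clause of a standard SQL query, wrapped in backticks.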
Cloud Healthcare API
Each TCIA dataset is available in the Cloud Healthcare API in the
You can also use the IMS viewer with the Cloud Healthcare API: