The NIH Chest X-ray dataset consists of more than 100,000 de-identified chest X-ray images. The images are in PNG format.
The data is provided by the NIH Clinical Center and is available through the NIH download site: https://nihcc.app.box.com/v/ChestXray-NIHCC
You can also access the data via Google Cloud Platform (GCP), as described below.
License and attribution
There are no restrictions on the use of the NIH Chest X-ray images. However, the dataset has the following attribution requirements:
Provide a link to the NIH download site: https://nihcc.app.box.com/v/ChestXray-NIHCC
Include a citation to the CVPR 2017 paper:
Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald Summers, ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases, IEEE CVPR, pp. 3462-3471, 2017
Acknowledge that the NIH Clinical Center is the data provider
GCP data access
You can get the NIH Chest X-ray images from Cloud Storage or via the Healthcare API.
The NIH Chest X-ray data is available in the following Cloud Storage bucket:
Use the Google Cloud Platform Console to view the bucket.
The bucket includes paths to the original PNG files, as well as to DICOM instances:
PNG (provided by NIH):
DICOM (provided by Google):
The Cloud Storage bucket uses the "Requester Pays" model for billing. Your GCP project will be billed for the charges associated with accessing the NIH data. For more information, see Requester Pays.
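As a rough illustration of what Requester Pays means for a request, the sketch below builds a Cloud Storage JSON API URL that lists objects while naming the project to bill through the `userProject` query parameter. The bucket name, project ID, and prefix are hypothetical placeholders, not the actual NIH bucket or its paths, and the helper `requester_pays_list_url` is introduced here only for illustration. Sending the request would also require an OAuth 2.0 access token in the `Authorization` header, which is omitted.

```python
from urllib.parse import quote, urlencode

# Public endpoint of the Cloud Storage JSON API.
GCS_API_ROOT = "https://storage.googleapis.com/storage/v1"


def requester_pays_list_url(bucket, billing_project, prefix="", max_results=10):
    """Build an object-listing URL for a Requester Pays bucket.

    `userProject` names the GCP project that is billed for the request.
    """
    params = {"userProject": billing_project, "maxResults": max_results}
    if prefix:
        # Restrict the listing to object names that start with this prefix.
        params["prefix"] = prefix
    return f"{GCS_API_ROOT}/b/{quote(bucket, safe='')}/o?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute the real bucket and your project.
    print(requester_pays_list_url("example-nih-bucket", "my-billing-project", prefix="png/"))
```

The same idea applies to other tools: the `google-cloud-storage` Python client accepts a `user_project` argument when you obtain a bucket handle, and `gsutil` takes a `-u` option for the billing project.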
To request access to the NIH dataset via the Healthcare API, complete this form.
The data uses the following DICOM store hierarchy:
After you've been given access to the dataset, you can also use the viewers that are integrated with the Healthcare API: