USA Contagious Disease Data

How to query public data sets using BigQuery

BigQuery is a fully managed data warehouse and analytics platform. Public datasets are available for you to analyze using SQL queries. You can access BigQuery public datasets using the web UI, the command-line tool, or calls to the BigQuery REST API made through client libraries for Java, .NET, Python, and other languages.

To get started using a BigQuery public dataset, create or select a project. The first terabyte of data processed per month is free, so you can start querying public datasets without enabling billing. If you intend to go beyond the free tier, you should also enable billing.

  1. Sign in to your Google account.

    If you don't already have one, sign up for a new account.

  2. Select or create a Cloud Platform project.

    Go to the Manage resources page

  3. Enable billing for your project if you intend to go beyond the free tier.

    Enable billing

  4. BigQuery is automatically enabled in new projects. To activate BigQuery in a pre-existing project, enable the BigQuery API.

    Enable the API
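
The free-tier rule described above (the first terabyte of data processed per month is free) can be checked before a query runs. The following is a minimal Python sketch, not an official tool. It assumes the `google-cloud-bigquery` client library is installed and credentials are configured, and it assumes that "terabyte" means 2^40 bytes; adjust `FREE_TIER_BYTES` if your billing terms differ. The helper names `billable_bytes` and `estimate_query_bytes` are hypothetical.

```python
# Assumption: the monthly free allowance is 1 TiB (2**40 bytes).
FREE_TIER_BYTES = 2**40


def billable_bytes(already_processed: int, query_bytes: int,
                   free_bytes: int = FREE_TIER_BYTES) -> int:
    """Return how many bytes of a query fall outside the monthly free allowance.

    already_processed: bytes processed so far this month (tracked by the caller).
    query_bytes: bytes the next query is expected to process.
    """
    if already_processed < 0 or query_bytes < 0:
        raise ValueError("byte counts must be non-negative")
    remaining_free = max(free_bytes - already_processed, 0)
    return max(query_bytes - remaining_free, 0)


def estimate_query_bytes(sql: str) -> int:
    """Ask BigQuery how many bytes a query would process, without running it.

    Requires network access and credentials. The import is kept local so the
    arithmetic helper above works without the client library installed.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    # A dry run validates the query and reports its size but processes no data.
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed
```

A caller would pass the result of `estimate_query_bytes(...)` to `billable_bytes(...)` along with its own running total for the month; a result of zero suggests the query still fits within the free allowance under the assumptions above.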

Dataset overview

This public dataset is provided by the US Department of Health and Human Services. It includes all weekly surveillance reports of nationally notifiable diseases for all U.S. cities and states published between 1888 and 2013. The dataset covers eight vaccine-preventable contagious diseases: diphtheria, hepatitis A, measles, mumps, pertussis, polio, rubella, and smallpox.

You can start exploring this data in the BigQuery console:

Go to the USA Contagious Disease Dataset

Sample queries

Here are some examples of SQL queries you can run on this data in BigQuery.

These samples use BigQuery’s legacy SQL by setting the #legacySQL prefix. For more information, see Setting a query prefix.
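
As an illustration of the prefix mechanism, here is a minimal Python sketch that prepends `#legacySQL` to a query string and, optionally, runs it through the `google-cloud-bigquery` client library. The helper names `with_legacy_prefix` and `run_legacy_query` are hypothetical, and the sketch assumes the client library is installed and credentials are configured. No table or column names from this dataset are assumed; the caller supplies the query text.

```python
LEGACY_PREFIX = "#legacySQL"
STANDARD_PREFIX = "#standardSQL"


def with_legacy_prefix(sql: str) -> str:
    """Return the query with a #legacySQL prefix on its first line.

    Leaves an existing legacy prefix in place and rejects a query that is
    already marked as standard SQL. Prefixes are compared case-insensitively
    here as a precaution.
    """
    stripped = sql.lstrip()
    first_line = stripped.splitlines()[0].strip().lower() if stripped else ""
    if first_line == LEGACY_PREFIX.lower():
        return stripped
    if first_line == STANDARD_PREFIX.lower():
        raise ValueError("query is already marked as standard SQL")
    return f"{LEGACY_PREFIX}\n{stripped}"


def run_legacy_query(sql: str) -> list:
    """Run a legacy SQL query and return its rows.

    Requires network access and credentials. The import is kept local so the
    prefix helper above works without the client library installed.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    # use_legacy_sql=True selects the legacy dialect and agrees with the prefix.
    config = bigquery.QueryJobConfig(use_legacy_sql=True)
    job = client.query(with_legacy_prefix(sql), job_config=config)
    return list(job.result())
```

For example, `with_legacy_prefix("SELECT 1")` yields a two-line string whose first line is `#legacySQL`, which can be pasted into the web UI or passed to `run_legacy_query`.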

Diseases by year

Comparing mumps outbreaks in California and Connecticut

Mumps shows the same seasonal pattern coast to coast, in both California and Connecticut, during the 1970 outbreak.

About the data

Dataset Source: Data.gov

Category: Health

Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

View in BigQuery: Go to the USA Contagious Disease Dataset
