United States Census Bureau International Data

How to query public data sets using BigQuery

BigQuery is a fully managed data warehouse and analytics platform. Public datasets are available for you to analyze using SQL queries. You can access BigQuery public data sets using the web UI the command-line tool, or by making calls to the BigQuery REST API using a variety of client libraries such as Java, .NET, or Python.

To get started using a BigQuery public dataset, create or select a project. The first terabyte of data processed per month is free, so you can start querying public datasets without enabling billing. If you intend to go beyond the free tier, you should also enable billing.

  1. Sign in to your Google account.

    If you don't already have one, sign up for a new account.

  2. Select or create a Cloud Platform project.

    Go to the Manage resources page

  3. Enable billing for your project.

    Enable billing

  4. BigQuery is automatically enabled in new projects. To activate BigQuery in a pre-existing project, Enable the BigQuery API.

    Enable the API

Dataset overview

The United States Census Bureau’s international dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the dataset includes midyear population figures broken down by age and gender assignment at birth. Additionally, time-series data is provided for attributes including fertility rates, birth rates, death rates, and migration rates.

You can start exploring this data in the BigQuery console:

Go to US Census Bureau international data

Sample queries

Here are some examples of SQL queries you can run on this data in BigQuery.

These samples use BigQuery’s support for standard SQL. Use the #standardSQL tag to let BigQuery know you want to use standard SQL. For more information about the #standardSQL prefix, see Setting a query prefix.

What countries have the longest life expectancy?

Because this query is visualized in Data Studio, the query results should be limited to countries greater than a certain size so the data can be represented well on a world map. In this query, 2016 census information is retrieved by joining the mortality_life_expectancy and country_names_area tables for countries larger than 25,000 km2. Without the size constraint, Monaco is the top result with an average life expectancy of over 89 years! However, Monaco is only 2 km2 so it can easily be lost on a world map.

Visualizing the query results in Google Data Studio produces the following map.

Life expectancy by country

To see a sample visualization, go to Data Studio. For a tutorial on using Data Studio with BigQuery, see Visualizing BigQuery Data Using Google Data Studio.

Web UI

Open the following query in the Web UI

#standardSQL
SELECT
  age.country_name,
  age.life_expectancy,
  size.country_area
FROM (
  SELECT
    country_name,
    life_expectancy
  FROM
    `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
  WHERE
    year = 2016) age
INNER JOIN (
  SELECT
    country_name,
    country_area
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area` where country_area > 25000) size
ON
  age.country_name = size.country_name
ORDER BY
  2 DESC
/* Limit removed for Data Studio Visualization */
LIMIT
  10

Command-line

bq query --use_legacy_sql=false '
SELECT
  age.country_name,
  age.life_expectancy,
  size.country_area
FROM (
  SELECT
    country_name,
    life_expectancy
  FROM
    `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
  WHERE
    year = 2016) age
INNER JOIN (
  SELECT
    country_name,
    country_area
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area` where country_area > 25000) size
ON
  age.country_name = size.country_name
ORDER BY
  2 DESC
/* Limit removed for Data Studio Visualization */
LIMIT
  10'

Sample results are shown here:

+---------------+------------------+---------------+
| country_name  | life_expectancy  | country_area  |
+---------------+------------------+---------------+
| Japan         | 85.0             | 364485.0      |
| Iceland       | 83.02            | 100250.0      |
| Switzerland   | 82.57            | 39997.0       |
| Korea, South  | 82.43            | 96920.0       |
| Australia     | 82.23            | 7682300.0     |
| Italy         | 82.2             | 294140.0      |
| Sweden        | 82.06            | 410335.0      |
| Canada        | 81.85            | 9093507.0     |
| France        | 81.84            | 640427.0      |
| Norway        | 81.78            | 304282.0      |
+---------------+------------------+---------------+

Which countries have the largest proportion of their population under 25?

Over 40% of the world’s population is under 25 and greater than 50% of the world’s population is under 30! This query retrieves the countries with the largest proportion of young people by joining the age-specific population table with the midyear (total) population table.

Web UI

Open the following query in the Web UI

#standardSQL
SELECT
  age.country_name,
  SUM(age.population) AS under_25,
  pop.midyear_population AS total,
  ROUND((SUM(age.population) / pop.midyear_population) * 100,2) AS pct_under_25
FROM (
  SELECT
    country_name,
    population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population_agespecific`
  WHERE
    year =2017
    AND age < 25) age
INNER JOIN (
  SELECT
    midyear_population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population`
  WHERE
    year = 2017) pop
ON
  age.country_code = pop.country_code
GROUP BY
  1,
  3
ORDER BY
  4 DESC
/* Remove limit for visualization */
LIMIT
  10

Command-line

bq query --use_legacy_sql=false '
SELECT
  age.country_name,
  SUM(age.population) AS under_25,
  pop.midyear_population AS total,
  ROUND((SUM(age.population) / pop.midyear_population) * 100,2) AS pct_under_25
FROM (
  SELECT
    country_name,
    population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population_agespecific`
  WHERE
    year =2017
    AND age < 25) age
INNER JOIN (
  SELECT
    midyear_population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population`
  WHERE
    year = 2017) pop
ON
  age.country_code = pop.country_code
GROUP BY
  1,
  3
ORDER BY
  4 DESC
/* Remove limit for visualization */
LIMIT
  10'

Sample results are shown here:

+----------------+-----------+------------+---------------+
| country_name   | under_25  | total      | pct_under_25  |
+----------------+-----------+------------+---------------+
| Uganda         | 27361075  | 39570125   | 69.15         |
| Niger          | 13108162  | 19245344   | 68.11         |
| Malawi         | 12840006  | 19196246   | 66.89         |
| Mali           | 11939567  | 17989813   | 66.37         |
| Mozambique     | 17614963  | 26573706   | 66.29         |
| Gaza Strip     | 1185377   | 1795183    | 66.03         |
| Zambia         | 10544838  | 15972000   | 66.02         |
| Burkina Faso   | 13060451  | 20107509   | 64.95         |
| South Sudan    | 8457121   | 13026129   | 64.92         |
| Burundi        | 7421013   | 11466756   | 64.72         |
+----------------+-----------+------------+---------------+

Visualizing the query results in Google Data Studio produces the following map.

Population under 25 by country

To see a sample visualization, go to Data Studio.

Which countries are seeing the largest net migration?

The International Census dataset contains growth information in the form of birth rates, death rates, and migration rates. Net migration is the net number of migrants per 1,000 population, an important component of total population and one that often drives the work of the United Nations Refugee Agency. This query joins the growth rate table with the area table to retrieve 2017 data for countries greater than 500 km2.

Web UI

Open the following query in the Web UI

#standardSQL
SELECT
  growth.country_name,
  growth.net_migration,
  CAST(area.country_area as INT64) as country_area
FROM (
  SELECT
    country_name,
    net_migration,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.birth_death_growth_rates`
  WHERE
    year = 2017) growth
INNER JOIN (
  SELECT
    country_area,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area`
  WHERE
    country_area > 500) area
ON
  growth.country_code = area.country_code
ORDER BY
  net_migration DESC
LIMIT
  10

Command-line

bq query --use_legacy_sql=false '
SELECT
  growth.country_name,
  growth.net_migration,
  CAST(area.country_area as INT64) as country_area
FROM (
  SELECT
    country_name,
    net_migration,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.birth_death_growth_rates`
  WHERE
    year = 2017) growth
INNER JOIN (
  SELECT
    country_area,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area`
  WHERE
    country_area > 500) area
ON
  growth.country_code = area.country_code
ORDER BY
  net_migration DESC
LIMIT
  10'

Sample results are shown here:

+--------------------------+----------------+---------------+
| country_name             | net_migration  | country_area  |
+--------------------------+----------------+---------------+
| Syria                    | 61.46          | 183630        |
| Luxembourg               | 15.52          | 2586          |
| Qatar                    | 14.61          | 11586         |
| Singapore                | 13.1           | 687           |
| Bahrain                  | 12.07          | 760           |
| South Sudan              | 10.6           | 644329        |
| United Arab Emirates     | 10.48          | 83600         |
| Turks and Caicos Islands | 9.51           | 948           |
| Cyprus                   | 8.7            | 9241          |
| Spain                    | 7.76           | 498980        |
+--------------------------+----------------+---------------+

About the data

Dataset Source: US Census Bureau

Category: Encyclopedic, Census, Demographic

Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Update Frequency: Historical

View in BigQuery: Go to US Census Bureau international data

Monitor your resources on the go

Get the Google Cloud Console app to help you manage your projects.

Send feedback about...