Data Analytics

What are the newest datasets in Google Cloud?

August 31, 2021
Michael Hamamoto Tribble

Head of Datasets for Google Cloud

Soleil Kelley

Product Marketing, Google Cloud

Editor’s note: With Google Cloud’s datasets solution, you can access an ever-expanding resource of the newest datasets to support and empower your analyses and ML models, as well as frequently updated best practices on how to get the most out of any of our datasets. We will be regularly updating this blog with new datasets and announcements, so be sure to bookmark this link and check back often.

September 2021

New dataset: Regional Carbon-free Energy Data

New dataset: U.S. Climate Gridded Dataset (NClimGrid)

  • NClimGrid is a gridded dataset derived from NOAA NCEI's Global Historical Climatology Network (GHCN). It consists of four climate variables derived from the GHCN-D dataset: maximum temperature, minimum temperature, average temperature, and precipitation. The data files provide monthly values on a 5 km by 5 km lat/lon grid for the continental United States. This information is helpful in analyzing historical and regional climate trends.

  • Access the dataset

New dataset: U.S. Climate Normals

  • In May 2021, NOAA NCEI released the new 30-year U.S. Climate Normals for the time period 1991-2020. Climate normals are statistically smoothed, quality-controlled, 30-year averages of recent climate conditions. The U.S. Climate Normals collection is available for the following time periods: 1901-1930, 1911-1940, and so on through 1991-2020. Because they are updated once per decade, the Normals gradually come to reflect the "new normal" of climate change caused by global warming. Users of this data include agriculture, construction, infrastructure, and energy industry planners, to name a few.

  • Access the dataset
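To make the windowing concrete, here is a minimal Python sketch of how the 30-year periods listed above (1901-1930, 1911-1940, through 1991-2020) line up, with a plain arithmetic mean standing in for each normal. This is an illustration under stated assumptions, not NOAA's method: the published Normals are statistically smoothed and quality-controlled, and the function names and the synthetic temperature series are hypothetical.

```python
from statistics import mean


def normals_windows(first_start=1901, last_start=1991, step=10, length=30):
    """Return inclusive (start_year, end_year) pairs, one per decade step."""
    return [(s, s + length - 1) for s in range(first_start, last_start + 1, step)]


def simple_normal(values_by_year, start, end):
    """Plain mean over start..end inclusive (a stand-in, not NOAA's smoothing)."""
    years = range(start, end + 1)
    missing = [y for y in years if y not in values_by_year]
    if missing:
        raise ValueError(f"missing years: {missing}")
    return mean(values_by_year[y] for y in years)


# Synthetic annual temperatures (hypothetical): a steady 0.02-degree rise per year.
synthetic = {year: 10.0 + 0.02 * (year - 1901) for year in range(1901, 2021)}

windows = normals_windows()
normals = {w: simple_normal(synthetic, *w) for w in windows}

for (start, end), value in normals.items():
    print(f"{start}-{end}: {value:.2f}")
```

With a steadily warming series, each successive window averages higher than the last, which is the sense in which a decadal update shifts the baseline.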

New dataset: Ocean Climate Stations Moorings (KEO & Papa)

  • The mission of the Ocean Climate Stations (OCS) Project is to make meteorological and oceanic measurements from autonomous platforms. Calibrated, quality-controlled, and well-documented climatological measurements are available on the OCS webpage and the OceanSITES Global Data Assembly Centers (GDACs), with near real-time data available prior to release of the complete, downloaded datasets. Kuroshio Extension Observatory (KEO) and Ocean Weather Station Papa (PAPA) are two stations providing data to this project.

  • Access the dataset

August 2021

New dataset: Google Cloud Release Notes

https://storage.googleapis.com/gweb-cloudblog-publish/images/1_Access_the_BigQuery_release_notes_datase.max-1800x1800.jpg
Access the BigQuery release notes dataset from https://cloud.google.com/release-notes/all

July 2021

Best practice: Use Google Trends data for common business needs

  • The Google Trends dataset represents the first time we’re adding Google-owned Search data into Datasets for Google Cloud. The Trends data allows users to measure interest in a particular topic or search term across Google Search, across the United States and down to the city level. You can learn more about the dataset here, and check out the Looker dashboard here! These tables are super valuable in their own right, but when you blend them with other actionable data you can unlock whole new areas of opportunity for your team. To learn how to make informed decisions with Google Trends data, keep reading.

  • Access the dataset

New dataset: COVID-19 Vaccination Search Insights

  • With COVID-19 vaccinations being a topic of interest around the United States, this dataset shows aggregated, anonymized trends in searches related to COVID-19 vaccination and is intended to help public health officials design, target, and evaluate public education campaigns. Check out this interactive dashboard to explore searches for COVID-19 vaccination topics by region.

  • Access the dataset

https://storage.googleapis.com/gweb-cloudblog-publish/images/2_COVID-19_Vaccination_Search_Insights.max-900x900.jpg
Source: https://google-research.github.io/vaccination-search-insights/

June 2021

New dataset: Google Diversity Annual Report 2021

  • Since 2014, Google has disclosed data on the diversity of its workforce in an effort to bring candid transparency to the challenges technology companies like Google face in recruitment and retention of underrepresented communities. In an effort to make this data more accessible and useful, we've loaded it into BigQuery for the first time ever. To view Google's Diversity Annual Report and learn more, check it out.

  • Access the dataset

https://storage.googleapis.com/gweb-cloudblog-publish/images/historical_data_1.max-1400x1400.jpg

New dataset: Google Trends Top 25 Search terms

  • The most popular and surging Google Search terms are now available in BigQuery as a public dataset. View the Top 25 and Top 25 rising queries from Google Trends from the past 30 days, including 5 years of historical data across the 210 Designated Market Areas (DMAs) in the US. Keep reading.

  • Access the dataset

https://storage.googleapis.com/gweb-cloudblog-publish/images/3_Google_Trends_Top_25_Search_terms.max-1700x1700.jpg
Top 25 Google Search terms, ranked by search volume (1 through 25) and with average search index score across the geographic areas (DMAs) in which it was searched.
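The aggregation this caption describes (one rank per term, plus the search index score averaged over the DMAs where the term appeared) can be sketched in a few lines of Python. The sample rows, the tuple layout, and the `summarize` helper are hypothetical and only illustrate the shape of the calculation; the sketch also assumes each term carries a single rank across all DMAs.

```python
from collections import defaultdict

# Hypothetical sample rows: (term, dma_name, rank, score).
rows = [
    ("term a", "New York NY", 1, 100),
    ("term a", "Chicago IL", 1, 80),
    ("term b", "New York NY", 2, 60),
    ("term b", "Chicago IL", 2, 40),
    ("term b", "Denver CO", 2, 50),
]


def summarize(rows):
    """Return (rank, term, average score, DMA count) tuples sorted by rank."""
    scores = defaultdict(list)
    ranks = {}
    for term, _dma, rank, score in rows:
        scores[term].append(score)
        ranks[term] = rank  # assumption: rank is the same in every DMA
    summary = [
        (ranks[term], term, sum(vals) / len(vals), len(vals))
        for term, vals in scores.items()
    ]
    return sorted(summary)


summary = summarize(rows)
for rank, term, avg_score, dma_count in summary:
    print(f"{rank:>2}  {term:<10} avg score {avg_score:.1f} across {dma_count} DMAs")
```

In BigQuery itself the same result would normally come from a `GROUP BY` over the term with an `AVG` on the score column; the Python version just makes the grouping explicit.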

New dataset: COVID-19 Vaccination Access

  • With metrics quantifying travel times to COVID-19 vaccination sites, this dataset is intended to help public health officials, researchers, and healthcare providers identify areas with insufficient access, deploy interventions, and research these issues. Check out how this data is being used in a number of new tools.

  • Access the dataset

https://storage.googleapis.com/gweb-cloudblog-publish/images/4_COVID-19_Vaccination_Access.max-700x700.jpg
https://storage.googleapis.com/gweb-cloudblog-publish/images/4_EMyEAbR.max-2000x2000.jpg
(Image courtesy of Vaccine Equity Planner, https://vaccineplanner.org/)

Best practice: Leveraging BigQuery Public Boundaries datasets for geospatial analytics 

  • Geospatial data is a critical component of a comprehensive analytics strategy. Whether you are trying to visualize data using geospatial parameters or do deeper analysis or modeling on customer distribution or proximity, most organizations have some type of geospatial data they would like to use, whether it be customer ZIP codes, store locations, or shipping addresses. However, converting geographic data into the correct format for analysis and aggregation at different levels can be difficult. In this post, we’ll walk through some examples of how you can leverage the Google Cloud platform alongside Google Cloud Public Datasets to perform robust analytics on geographic data. Keep reading.

  • Access the dataset
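As a small illustration of the proximity analysis mentioned above, the following Python sketch computes great-circle (haversine) distances and picks the closest store for a customer. The coordinates, store names, and helper functions are hypothetical, and the sketch treats the Earth as a sphere with a mean radius of 6371 km. In BigQuery, geography functions such as `ST_DISTANCE` would typically do this work in SQL.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_store(customer, stores):
    """Return the name of the store closest to the customer's (lat, lon)."""
    return min(stores, key=lambda name: haversine_km(*customer, *stores[name]))


# Hypothetical store locations and customer position, as (lat, lon) in degrees.
stores = {
    "downtown_sf": (37.7749, -122.4194),
    "oakland": (37.8044, -122.2712),
}
customer = (37.8715, -122.2730)

closest = nearest_store(customer, stores)
print(closest, round(haversine_km(*customer, *stores[closest]), 1), "km")
```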

https://storage.googleapis.com/gweb-cloudblog-publish/original_images/5_geospatial_analytics_.gif

Get the metadata and try BigQuery sandbox 

Now that you’ve learned about many of our datasets and pre-built solutions from across Google, you may be ready to start querying them. Check out the full dataset directory and read all the metadata at g.co/cloud/marketplace-datasets, then dig into the data with our free-to-use BigQuery sandbox account, or $300 in credits with our Google Cloud free trial.
