Datasets

Enhance your analytics and AI initiatives with pre-built data solutions and valuable datasets powered by BigQuery, Cloud Storage, Earth Engine, and other Google Cloud services. 

Expand your data ecosystem

Increase the value of your data assets when you augment your analytics or AI initiatives with external data. Discover and access unique and valuable datasets and pre-built solutions from Google, public, or commercial providers. With fully managed data pipelines, you can stay focused on what matters most: delivering insights and business value.

Category | Featured datasets | Sample queries, use cases, and solutions
Google datasets

Google Trends

View the Top 25 and Top 25 rising queries from Google Trends over the past 30 days with this dataset. Each term includes 5 years of historical data across the 210 Designated Market Areas (DMAs) in the US.

  • What's top of mind for listeners in my news/radio broadcast area?

  • What are the top search terms in the US for the latest available data?

  • What are the most popular retail items people have searched for across the area?

Community Mobility Reports

This dataset aims to provide insights into what has changed in response to policies aimed at combating COVID-19. It reports movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential.

  • What was the impact of San Francisco's shelter-in-place order on retail visits?

  • Use case: Identifying the difference in retail traffic on weekends

Google Analytics (Sample)

The dataset provides 12 months (August 2016 to August 2017) of obfuscated Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store that sells Google-branded merchandise. The data is typical of what an ecommerce website would see and includes traffic source data, content data, and transactional data.

Google Patents Research

Google Patents Research Data contains the output of much of the data analysis work used in Google Patents (patents.google.com), including machine translations of titles and abstracts from Google Translate, embedding vectors, extracted top terms, similar documents, and forward references.

  • What are the 20 most recent patents filed?

  • Which Fortune 500 firms actively filed patents between 2017–2019?

Public datasets
Severe Storm Event Details

The Storm Events Database is an integrated database of severe weather events across the United States from 1950 to the present, with information about a storm event's location, azimuth, distance, impact, and severity, including the cost of damages to property and crops.

  • Which storms that occurred in the last 15 years caused the most property damage?

  • Use case: home improvement retailer understanding impact of storms on inventory

Census Bureau US Boundaries

These are full-resolution boundary files, derived from TIGER/Line Shapefiles, the fully supported, core geographic products from the US Census Bureau. These include information for the 50 states, the District of Columbia, Puerto Rico, and the outlying island areas.

  • Which US cities have the most public airports within 10 miles of their urban areas?

  • Use case: Developing an urbanization index for retailers

American Community Survey

The American Community Survey (ACS) is an ongoing survey that provides vital information on a yearly basis about our nation and its people by contacting over 3.5 million households across the country. The resulting data provides incredibly detailed demographic information across the US aggregated at various geographic levels.

  • How have rents as a share of median income changed year over year?

  • Use case: Population growth trends as inputs to facility/site selection analysis

All public datasets

Search for and access over 200 datasets listed in Google Cloud Marketplace.

  • What datasets can help provide deeper context for our analytics or AI workflows?

Commercial datasets
Crux Informatics

Crux Deliver is a managed service for data engineering and operations. Crux wires up traditional and alternative data providers on behalf of its clients and manages all aspects of onboarding, data engineering, and operations. Every dataset is validated so that only clean and actionable data is delivered.

  • What are the datasets Crux can help me onboard into my data ecosystem?

HouseCanary

Instant access to reliable property, loan and valuation information for 100M homes. ML algorithms process hundreds of data sources to provide Home Price Indices for 381 Metros, 18,300 ZIP codes and 4M blocks covering >95% of the US residential market. Make investment decisions based on 40-year historical volatility information and 3-year forecasts.

  • Which ZIP codes are forecasted to have home prices gain 3% or more next year?

  • What's the value of a particular property?

Earth Engine datasets
Earth Engine

Earth Engine's public data archive includes more than forty years of historical imagery and scientific datasets, updated daily and available for online analysis.

  • How has surface temperature changed over the past 30 years?

  • What did this area look like before the year 2000?

Kaggle datasets
Kaggle Datasets

Inside Kaggle you’ll find all the code and data you need to do your data science work. Use over 80,000 public datasets and 400,000 public notebooks to conquer any analysis in no time.

  • Can you tackle some of the most vexing and provocative problems in data science?

Synthetic datasets
Cymbal Investments

The synthetic data represents transactions from automated trading bots operated by the fictional Cymbal Investments group, each using a single algorithm to guide its trading decisions. The records are derived from FIX protocol (version 4.4) Trade Capture Reports loaded into BigQuery.

  • How much did traders make from each individual trade?

Research datasets

Google's Dataset Search program has indexed almost 25 million datasets from across the web, giving you a single place to search for datasets and find links to where the data is. Filter by recency, format, topic, and more. 

  • What datasets exist for <keyword you're interested in>?

  • Which sustainability datasets from last year are free for commercial use?

Feeling inspired? Let’s solve your challenges together.

Learn how Google Cloud datasets transform the way your business operates with data and pre-built solutions.
Contact sales
If there is a public dataset you would like to see onboarded, please contact public-data-help@google.com.

With BigQuery sandbox, you can try the full BigQuery experience without a billing account or credit card.

Data partners and customer stories

Learn more from both sides of the dataset ecosystem: data providers and data consumers.