Google Cloud Big Data and Machine Learning Blog

Innovation in data processing and machine learning technology

Get to know your trees: US Forest Service (FIA) dataset now available in BigQuery

Tuesday, October 24, 2017

By Alicia Sullivan, Program Manager, Google Maps

Have you ever wondered how much forest or timberland falls within your state or county? Do you want to understand species composition and trees per acre for habitat conditions? Are you a researcher needing ground truth for your model? Are you interested in understanding whether the forest resources of an area can support a new mill or biomass chip plant?

Forests are a valuable and shared resource, in many cases crucial to local and regional ecosystems and industries. Conservation efforts have made great strides in recent decades to protect forests for their ecological and business value. To aid in these efforts, we’re publishing a streamlined version of a comprehensive forestry dataset for the US, accessible in Google BigQuery, Data Studio, and Google Cloud.

The US Forest Service (USFS) Forest Inventory and Analysis (FIA) program is an ongoing forest measurement program that provides a nationwide inventory of the forest resources in the United States and its territories. Founded in 1928, the program continuously measures plots in each state. Full documentation of the data and the program is available through the USFS. This dataset has been critical for research, nationwide reports, climate change monitoring, and tracking changes in our nation’s forests. We’re excited to add it to our BigQuery public data catalog.

While this dataset is far from new, the technology behind BigQuery has allowed us to make it easier to work with.

Pre-joined tables

One challenge with FIA data is that it’s in a highly normalized, relational structure and is most often downloaded on a per-state basis. If you’re interested in doing an analysis across states, or nationwide, preparing the dataset for analysis can be a daunting process.

To help make this dataset easier to use, the FIA tables in BigQuery contain all states in the same table. In addition, for common joins, we’ve also created two “super tables” that combine multiple tables. The plot_tree table is the plot and tree tables joined on the plot number. The population table is a join of all the population tables in the database, each record representing a plot (FIA documentation, pg. 7-3). We’ve also included the tables individually if that is a better fit for your analysis.

Figure 1: Select basic information for all plots and tree records for plots in King County, Washington. Note: State and county are identified by their FIPS codes.
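
To make the idea of a pre-joined table concrete, here is a minimal sketch of a plot-to-tree join using Python's built-in sqlite3 module instead of BigQuery. The table layout, column names, and values are all hypothetical stand-ins rather than the real FIA schema; only the FIPS codes for King County, Washington (state 53, county 33) are real.

```python
import sqlite3

# Toy stand-ins for the FIA plot and tree tables. Column names and all
# values are hypothetical, not the real BigQuery schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plot (plot_number INTEGER, state_code INTEGER, county_code INTEGER);
    CREATE TABLE tree (plot_number INTEGER, species_code INTEGER, diameter_in REAL);
    INSERT INTO plot VALUES (1, 53, 33), (2, 53, 61), (3, 41, 5);
    INSERT INTO tree VALUES (1, 202, 14.2), (1, 242, 9.8), (2, 202, 20.1), (3, 122, 11.0);
""")

# Join plot and tree on the plot number, then keep only King County,
# Washington (FIPS state code 53, county code 33).
rows = conn.execute("""
    SELECT p.plot_number, p.state_code, p.county_code, t.species_code, t.diameter_in
    FROM plot AS p
    JOIN tree AS t ON t.plot_number = p.plot_number
    WHERE p.state_code = 53 AND p.county_code = 33
    ORDER BY t.species_code
""").fetchall()

for row in rows:
    print(row)
```

With a pre-joined table such as plot_tree, the JOIN clause disappears and the query reduces to a SELECT with the same WHERE filter.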

Code and column name expansion

SPCD = 202? What does that mean? To make this dataset a little more human-friendly, we expanded the abbreviated column names in these BigQuery tables to their full field names, and we added columns that contain the text description of each code. For example, SPCD is expanded to Species_Code, and two additional columns describe code 202: “species_common_name” = Douglas-fir and “species_scientific_name” = Pseudotsuga menziesii.

Figure 2: Example query for the same information as above, but with additional text columns for code definitions.
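
The expansion described above amounts to a rename plus a lookup. The sketch below illustrates it in Python with a single-entry lookup table taken from the Douglas-fir example; the function name and record layout are hypothetical, and this is not how the BigQuery tables were actually built.

```python
# Species lookup taken from the example in the text. A real reference
# table would have one entry per FIA species code.
SPECIES = {
    202: ("Douglas-fir", "Pseudotsuga menziesii"),
}


def expand_tree_record(record):
    """Rename the coded column and add text columns describing the code."""
    code = record["SPCD"]
    # Unknown codes keep the numeric value and get empty descriptions.
    common, scientific = SPECIES.get(code, (None, None))
    return {
        "species_code": code,
        "species_common_name": common,
        "species_scientific_name": scientific,
    }


expanded = expand_tree_record({"SPCD": 202})
print(expanded)
```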

Example use case: Estimating timberland and forest land acres by state

One straightforward use for this data is to create population-level estimates, such as the number of acres of forest land or timberland in each state. But what’s the difference between these two classifications?

Forest land is defined as a full acre that “has at least 10 percent canopy cover by live tally trees of any size or has had at least 10 percent canopy cover of live tally species in the past, based on the presence of stumps, snags, or other evidence” (FIA Documentation, pg 2-28).

Timberland is defined by the FIA as “forest land capable of producing in excess of 20 cubic feet per acre per year and not legally withdrawn from timber production, with a minimum area classification of 1 acre.” (FIA Documentation, pg 1-4).

What does this mean to a non-forester? Put simply, forest land is land with at least 10% tree cover, and timberland is the subset of forest land that can produce a minimum volume of wood per acre per year and is not legally restricted from timber production. As a concrete example, Mount Rainier National Park has forest land but no timberland because timber production is not permitted in national parks.
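
The two definitions can be written as a pair of simple checks. This is a simplified sketch based only on the thresholds quoted above: it ignores the minimum-area requirement, and the function names, parameters, and stand values are hypothetical.

```python
def is_forest_land(canopy_cover_pct, had_cover_in_past=False):
    """Forest land: at least 10 percent canopy cover now or in the past."""
    return canopy_cover_pct >= 10 or had_cover_in_past


def is_timberland(canopy_cover_pct, growth_cuft_per_acre_year,
                  legally_withdrawn, had_cover_in_past=False):
    """Timberland: forest land producing more than 20 cubic feet per acre
    per year that is not legally withdrawn from timber production."""
    return (
        is_forest_land(canopy_cover_pct, had_cover_in_past)
        and growth_cuft_per_acre_year > 20
        and not legally_withdrawn
    )


# Hypothetical stand inside a national park: dense and productive, but
# legally withdrawn from timber production.
park_is_forest = is_forest_land(70)
park_is_timber = is_timberland(70, 85, legally_withdrawn=True)

# Hypothetical stand with the same measurements outside the park.
open_is_timber = is_timberland(70, 85, legally_withdrawn=False)

print(park_is_forest, park_is_timber, open_is_timber)
```

The park stand counts as forest land but not timberland, which mirrors the Mount Rainier example.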

Let’s take a look at an example. This query leverages a view we created that contains the plot and population information needed to estimate timberland acres by state. The query to generate the view (estimate_timberland_acres) can be found here.

Figure 3: The resulting table gives you the estimate of timberland acres using the latest per-state FIA data.

The same can be done for forest land acres using this query. It’s also based on a view (estimate_forestland_acres) that we created with this query.
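
For intuition about how plot measurements turn into acre estimates, here is a deliberately simplified sketch. It assumes each plot stands in for a known number of acres and that we know what share of the plot qualifies as timberland; the estimate is then a weighted sum per state. This is an illustration of the general idea, not the definition of the estimate_timberland_acres or estimate_forestland_acres views, and every number below is made up.

```python
from collections import defaultdict

# Each tuple: (state, acres the plot represents, share of the plot that
# is timberland). All values are hypothetical.
plots = [
    ("WA", 6000.0, 1.0),
    ("WA", 6000.0, 0.5),
    ("OR", 5500.0, 0.0),
    ("OR", 5500.0, 0.75),
]


def estimate_acres_by_state(plots):
    """Sum (acres represented by the plot) x (qualifying share) per state."""
    totals = defaultdict(float)
    for state, acres_per_plot, proportion in plots:
        totals[state] += acres_per_plot * proportion
    return dict(totals)


estimates = estimate_acres_by_state(plots)
for state in sorted(estimates):
    print(state, estimates[state])
```

Swapping the timberland share for the forest land share of each plot would give the corresponding forest land estimate.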

What about the other FIA tables?

To start, we brought in the most commonly used set of tables. If your research or analysis requires other tables in the FIA database, let us know at gcp-public-data@google.com, and we’ll import those tables as well.

A note on plot locations

The plot and plot_tree tables do include latitude and longitude fields, but these are not precise locations. Due to privacy concerns and the Food Security Act of 1985 (reference 7 USC 2276 § 1770), the USFS cannot provide more exact location data. Please see the FIA Documentation, pg 1-6 for a full description.

Coming soon in Earth Engine

If you are a Google Earth Engine user, stay tuned for an announcement on how to access this dataset in Earth Engine.

Take action

Making forest inventory data accessible and available can help with important tasks like climate change research, conservation efforts, and understanding fire behavior and risks. We hope this data helps more analysts do critical fire prevention and conservation work, both within and outside of California, and we encourage you to try out these datasets for yourself and learn something new about US forest land.