COVID-19 public datasets: supporting organizations in their pandemic response
Program Manager, Healthcare & Life Sciences, Google Cloud
Program Manager, Google Cloud
Editor’s note: This is part two of a series on the COVID-19 public datasets. Check out part one to learn more about recently onboarded datasets and new program expansion.
Back in March, we launched new COVID-19 public datasets into our Google Cloud Public Datasets program to make critical COVID-19 datasets available to the public and free to analyze using BigQuery.
At launch, we aimed to get high-quality data into the hands of users as quickly as possible to support their efforts to monitor and understand the emergent pandemic. A few months in, we have expanded our original goals to include supporting public and private sector users with the data that they need to make informed decisions. Today, we’ll highlight how research organizations, governments, and partners have used these datasets to power their decisions, contribute to the growing body of research on the virus and its societal impacts, and create tools to support response efforts.
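For readers who want to explore the datasets themselves, here is a minimal sketch of querying one from Python with the BigQuery client library. The table name `bigquery-public-data.covid19_jhu_csse.summary` and the columns `date`, `country_region`, `confirmed`, and `deaths` are assumptions for illustration, so check the dataset schema in the BigQuery console before relying on them. The helper names `build_query` and `run_query` are hypothetical.

```python
# Assumed table name; confirm it (and the column names below) in the BigQuery console.
TABLE = "bigquery-public-data.covid19_jhu_csse.summary"


def build_query(table: str = TABLE, max_rows: int = 30) -> str:
    """Return SQL that totals confirmed cases and deaths per day for one country."""
    return (
        "SELECT date, SUM(confirmed) AS confirmed, SUM(deaths) AS deaths\n"
        f"FROM `{table}`\n"
        "WHERE country_region = @country\n"  # bound as a query parameter at run time
        "GROUP BY date\n"
        "ORDER BY date DESC\n"
        f"LIMIT {int(max_rows)}"
    )


def run_query(country: str, max_rows: int = 30) -> list:
    """Run the query and return one dict per day (needs google-cloud-bigquery and credentials)."""
    # Imported here so build_query works even without the client library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses application default credentials
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("country", "STRING", country),
        ]
    )
    rows = client.query(build_query(max_rows=max_rows), job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

Calling `run_query("Italy", max_rows=7)` would return the seven most recent daily totals, provided the assumed schema holds. Passing the country as a query parameter rather than formatting it into the SQL string avoids quoting problems.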
Helping communities respond to COVID-19
Reliable data is now more important than ever as leaders in healthcare, government, and private industry are challenged to make decisions in response to COVID-19. To help organizations chart the safest path forward, Google Cloud collaborated with its partner SADA to build the National Response Portal.
The portal is an open data platform that combines many relevant datasets for an on-the-ground view of the pandemic. “The National Response Portal takes full advantage of the Google Cloud Public Datasets program, giving us direct and easy access to the COVID-19 datasets that power our visualizations,” says Michael Ames, senior director of healthcare and life sciences at SADA.
Via the portal, users can explore trends on COVID-19 cases and deaths, view forecasts anticipating future hotspots, and examine the impact of policy decisions and social mobility. Healthcare providers have begun contributing data as part of a growing effort to share data insights among the health community to empower better awareness and decision-making.
To find out more, visit the National Response Portal.
Equipping the public sector to monitor COVID-19
When looking for a technical solution for monitoring COVID-19 cases and updating residents, the Oklahoma State Department of Health and the governor’s office turned to Google Cloud. The state needed a public-facing platform that would display real-time data on the pandemic. Using the COVID-19 public datasets along with Looker, Google Cloud’s business intelligence and analytics platform, the State of Oklahoma built a dashboard on Oklahoma COVID-19 statistics, located on the state’s public health website.
Since the dashboard launched, it has been viewed tens of thousands of times each day. Department of Health staff and Oklahoma citizens can access and interact with consolidated information served by Looker dashboards for actionable insights. “The partnership with Google Cloud has enabled the OK Department of Health to be extremely agile in keeping the citizens of Oklahoma informed as to the impact of COVID-19 across the state,” says State of Oklahoma Digital Transformation Secretary David Ostrowe. The dashboard has reduced the need for manual processing, and changes have been easy to make and deploy on Google Cloud. The State of Oklahoma also received an A+ COVID-19 data quality rating from the COVID Tracking Project.
Supporting research on COVID-19
In the early days of the pandemic, Northeastern University used Google Cloud to model COVID-19 and forecast the impact that interventions like stay-at-home orders would have on the spread of the virus.
Northeastern University researchers used several Google Cloud products, including BigQuery, to analyze various datasets and inform their global metapopulation disease transmission model. The team relied on the U.S. Census Data and OpenStreetMap public datasets and BigQuery GIS capabilities to project the impact of different interventions on the global spread of the COVID-19 pandemic.
“Our team models and forecasts the spatial spread of infectious diseases by quickly analyzing hundreds of terabytes of simulation data,” says Dr. Matteo Chinazzi, associate research scientist at Northeastern University. “With the help of BigQuery, we are able to accelerate insights from our epidemic models and better study evolution of an ongoing outbreak.”
Dr. Chinazzi’s team has provided valuable insights on the effects of different containment and mitigation strategies. The team’s findings were published in Science in April. You can check them out through The Global Epidemic and Mobility (GLEAM) Project interactive dashboards.
Visualizing the pandemic
CARTO combined census data with COVID-19 case data and social determinants of health datasets in a real-time dashboard to support organizations in monitoring and responding to the pandemic.
“We built our COVID-19 dashboard to anticipate viewers looking for fast answers,” says Stephanie Schober, CARTO solution engineer. “As COVID-19 continues to spread, Google Cloud’s BigQuery content has enabled our dashboard to use real-time and reliable data.”
“Location data has been extremely relevant through this pandemic to ensure both private and public sector organizations can respond fast enough,” says Florence Broderick, VP of marketing at CARTO. “Geospatial analysis through CARTO and BigQuery has enabled a wide range of use cases, including PPE distribution, mobility analysis, and workplace-return planning.”
If you’re interested in developing similar visualizations, check out more details from CARTO and tune in to Data vs. COVID-19: How Public Data is Helping Flatten the Curve.
Analyzing the global COVID-19 news narrative from web to television
To support researchers in analyzing global media coverage of COVID-19 and comparing with outbreaks of the past decade, we have partnered with the GDELT Project to host several multimodal datasets.
These datasets include media coverage across 152 languages and span more than a decade, totaling more than 3 trillion data points, all of them available as public datasets in BigQuery. “Google Cloud’s AI offerings make it possible to transform text, speech, imagery and video into rich annotations sharing a common taxonomy,” says GDELT Founder Dr. Kalev Leetaru. “BigQuery is the lens through which trillions of data points become actionable insights that can help guide our understanding of the global COVID-19 media narrative.”
A Google Cloud COVID-19 research grant is also supporting additional data annotation on the COVID-19 pandemic and other major disease outbreaks. The project is using Cloud Speech-to-Text to compare COVID-19 radio coverage on 10 major U.S. stations. When completed, this dataset will make it possible for researchers to understand how television and radio coverage of the pandemic compares with online coverage.
Helping companies manage operations throughout the pandemic
In the private sector, organizations have used the COVID-19 datasets to support decision-making in responding to the pandemic.
Rolls-Royce joined with Google Cloud and other industry partners to form the Emergent Alliance. This data analytics coalition plans to leverage Google Cloud’s datasets in finding ways to support the global response to the pandemic, model economic recovery, and support return-to-work initiatives.
When we launched COVID-19 public datasets, we set out on a mission to partner with data owners and make critical datasets easily accessible and free of analysis costs. We are inspired by the many organizations across healthcare, government, academia, and private industry that have led the way applying this data in innovative ways, supporting global response efforts. As communities continue to navigate the challenging path forward, we hope to play a small part in empowering them with data insights to prepare for what comes next.