
Google Cloud AI digitizes StoryCorps archive: largest collection of human voices on planet

November 18, 2020
Andrew Moore

Vice President & General Manager: Cloud AI & Industry Solutions

For many of us, the holiday season will look different this year, as we spend it apart from the people we love. If you are staying apart to help slow the spread of the coronavirus, thank you. We hope the following story offers an alternative but helpful way to connect with friends and family.

While we know virtual get-togethers can never fully match the intimacy of in-person conversations, they can keep us connected and maybe even preserve some special moments for future generations. 

In this spirit, we are sharing our collaboration with StoryCorps, a national non-profit organization dedicated to preserving humanity’s stories through 1:1 interviews. Over the past 17 years, StoryCorps has recorded interviews with more than 600,000 people and sent those recordings to the American Folklife Center at the U.S. Library of Congress, where they are preserved for generations to come.

This is the largest collection of human voices on the planet, but it has been relatively inaccessible. StoryCorps approached us to help make its rich archive of first-person history universally accessible and useful.

StoryCorps + Google Cloud AI

In 2019, StoryCorps and Google Cloud partnered to unlock this amazing archive using artificial intelligence (AI) and create an open, searchable and accessible audio database where everyone can find and listen to first-hand perspectives from humanity’s most important moments.

Here is how it works: for an audio recording to be searchable, the audio file, and the “moments” or keywords within it, need to be tagged with the terms people would search for.

  1. First, we used the Speech-to-Text API to transcribe the audio file.

  2. Then, the Natural Language API identified keywords and their salience in the transcript.

  3. The transcript and keywords were loaded into an Elasticsearch index.

  4. The result is a searchable transcript on the StoryCorps Archive.
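
To make the four steps above concrete, here is a minimal Python sketch of the data flow. It is not StoryCorps' or Google's actual code: the names `Keyword`, `build_document`, `InMemoryIndex` and `index_recording`, as well as the `min_salience` cutoff, are hypothetical and chosen only for illustration. The transcription and keyword-extraction steps are passed in as plain functions rather than real API calls, and a small in-memory index stands in for Elasticsearch.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Keyword:
    """A keyword plus its salience (0.0 to 1.0; higher = more central)."""
    text: str
    salience: float


def build_document(interview_id: str, transcript: str,
                   keywords: List[Keyword],
                   min_salience: float = 0.05) -> dict:
    """Assemble the record that would be loaded into the search index.

    Keywords below the (hypothetical) min_salience cutoff are dropped,
    and the rest are lowercased and ordered from most to least salient.
    """
    kept = sorted((k for k in keywords if k.salience >= min_salience),
                  key=lambda k: k.salience, reverse=True)
    return {
        "id": interview_id,
        "transcript": transcript,
        "keywords": [k.text.lower() for k in kept],
    }


class InMemoryIndex:
    """Toy stand-in for a search index: keyword -> set of interview ids."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)
        self._docs: Dict[str, dict] = {}

    def add(self, doc: dict) -> None:
        self._docs[doc["id"]] = doc
        for keyword in doc["keywords"]:
            self._postings[keyword].add(doc["id"])

    def search(self, term: str) -> List[str]:
        """Return the ids of interviews tagged with the given term."""
        return sorted(self._postings.get(term.lower(), set()))


def index_recording(audio: bytes, interview_id: str,
                    transcribe: Callable[[bytes], str],
                    extract_keywords: Callable[[str], List[Keyword]],
                    index: InMemoryIndex) -> None:
    """Run steps 1-3: transcribe, extract keywords, load into the index."""
    transcript = transcribe(audio)               # step 1
    keywords = extract_keywords(transcript)      # step 2
    index.add(build_document(interview_id, transcript, keywords))  # step 3


if __name__ == "__main__":
    # Stub functions stand in for the real speech and language services.
    archive = InMemoryIndex()
    index_recording(
        b"<audio bytes>", "interview-001",
        transcribe=lambda audio: "My grandmother taught me to bake bread.",
        extract_keywords=lambda text: [Keyword("grandmother", 0.6),
                                       Keyword("bread", 0.3)],
        index=archive,
    )
    print(archive.search("bread"))  # step 4: the interview is now searchable
```

In a real deployment, `transcribe` and `extract_keywords` would wrap calls to the Speech-to-Text and Natural Language APIs, and the index would be an Elasticsearch cluster. Keeping them as parameters makes the flow easy to try out without any cloud credentials.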

Here is an example of how these Cloud AI technologies work using an actual StoryCorps interview.

https://storage.googleapis.com/gweb-cloudblog-publish/original_images/StoryCorps.gif

Building empathy and understanding through connection 

StoryCorps’ mission is impressive. Not only does it preserve humanity’s stories, it also aims to “build connections between people and create a more just and compassionate world” by sharing those stories as widely as possible. This is where our paths cross on a deeper level.

Our mission for AI technology is one where everyone is accounted for, extending well beyond the training data in computer science departments. This deeper understanding could allow organizations in every sector to unlock new possibilities of what they have to offer while being inclusive, equitable and socially beneficial. 

But that’s our story to figure out and we’re working hard at it. Whatever you decide to do this holiday season, please stay safe. In the meantime, perhaps your family would like to use the StoryCorps platform or app to connect, preserve and share a story of your own.
