Accelerating biomedical research with Terra and Google Cloud
David Glazer
Terra CTO, Verily
Alexander Titus, PhD
Strategic Business Executive, Global Public Sector, Google Cloud
When we think about biology, data is not always the first thing that comes to mind. But in reality, biology is one of the most fascinating data challenges of our time, and we’ve only begun to scratch the surface of understanding, organizing, and using this information. Over the past few years, significant advances in biomedical research have come to life with the emergence of better and more cost-effective tools. Cloud technologies, such as those offered by Google Cloud, provide computational resources capable of analyzing massive amounts of biomedical information at unprecedented speed and lower cost, with democratized access to leading tools. In a cloud environment, you don’t need to own a data center to do research at a global scale. Today, cloud-based tools enable analyses across petabytes of biomedical data to identify patterns and markers for disease predisposition, prediction, and causality. These analyses help healthcare providers understand and treat disease, drug developers develop new therapies, and research investigators tackle big problems in science.
Bringing together engineering, biomedical science, and the research community
For years, Verily and the Broad Institute have been working with partners to enable the next generation of collaborative biomedical research and to raise the bar for health data analysis around the world. Terra is a result of this work.
Terra is a secure, scalable, open-source platform for biomedical researchers to access data, run analysis tools, and collaborate. The cloud-based platform is co-developed by the Broad Institute of MIT and Harvard, Microsoft, and Verily (an Alphabet company), with the goal of accelerating biomedical innovation.
Across Alphabet, there are ongoing efforts to bring AI/ML and cloud computing to tackle hard problems in the life sciences, and Terra on Google Cloud combines the best of both AI/ML and cloud. Thousands of researchers actively use Terra every month to analyze data from diverse research programs to accelerate biomedical innovation. These programs include the NIH All of Us Research Program, the Human Cell Atlas, the NHGRI AnVIL, the NHLBI BioData Catalyst, and the NCI Cloud Resources.
Accelerating biomedicine with Terra
A central tenet of Google’s DNA is open-source collaboration, and biomedical research is no different. We know that science doesn't happen in a vacuum, so we build our tools to enable global-scale collaboration across industry, academia, and government.
Accelerating Medicines Partnership in Parkinson’s Disease (AMP PD)
In 2018, through the Foundation for the NIH (FNIH), an AMP program for Parkinson’s disease was launched as a partnership among the NIH, the FDA, the Michael J. Fox Foundation for Parkinson's Research, a number of leading pharmaceutical companies, and Verily. A critical component of the work under this partnership is providing the biomedical community with open access to the data and analyses. Terra enables collaboration across the program, harmonizing data from seven different clinical studies with different protocols, policies, and institutions. The collaboration brings together data from over 10,000 participants with various harmonized multi-omic read-outs, including whole genome sequencing, RNAseq, and curated longitudinal clinical data on most participants. The platform is also being used for the Global Parkinson's Genetics Program (GP2), which provides complementary data on genetic associations in diverse populations on a biobank scale.
National Institutes of Health (NIH) All of Us
The NIH All of Us Research Program aims to enable thousands of studies on a wide range of diseases and is driven by the belief that diversity is key to accelerating health research and a better understanding of health disparities. The program brings together over 100 institutions across more than 350 clinical sites with over 378,000 (and counting) participants. Program data is available to approved researchers via the Researcher Workbench, which is powered by Terra on Google Cloud. This workbench helps researchers learn more about the health impact of individual differences in biology, lifestyle, and environment in order to advance precision diagnosis, prevention, and treatment.
Verily’s Baseline Health Study
Through Alphabet’s own research initiatives, we understand the challenges to effective collaboration, data sharing, and analysis, which is why we use the same products and tools that we share with the community. In 2017, Verily launched Project Baseline and the Baseline Health Study alongside Google and collaborators like Stanford Medicine and Duke University School of Medicine to make it easy and engaging for people to contribute to our understanding of clinical and biomedical research. Terra is the central platform that researchers use to explore and analyze molecular, device, and clinical data collected from the Baseline Health Study.
Google Cloud for biomedical research
Google Cloud has a broad range of genomic services and tools developed over the years, many of which are available in Terra today. One of the most commonly used by our customers is the Cloud Life Sciences API, which integrates into the tools that bioinformaticians use today (Cromwell, Nextflow, BC, BlueBee, Sentieon, Galaxy, and many others) to enable the life sciences community to process biomedical data at scale.
Variant Transforms is an open-source tool for loading and transforming raw VCF (Variant Call Format) data for use in advanced analytics. Genomics research teams can use Variant Transforms to read massive amounts of raw VCF data, check it for inconsistencies, and then load it into BigQuery for use in machine learning (ML) and additional analysis.
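To make the load, validate, and transform steps concrete, here is a minimal Python sketch of the general idea. It is not the Variant Transforms implementation: the `VariantRow` class, the `parse_vcf_line` and `load_vcf` helpers, and the column names are all hypothetical, and the validation rules are deliberately simplified (for example, only positions of 1 or greater and the bases A, C, G, T, and N are accepted).

```python
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List, Optional, Tuple

# Simplifying assumption: only plain nucleotide codes are accepted.
VALID_BASES = set("ACGTN")


@dataclass
class VariantRow:
    """One flattened variant record; field names are illustrative only."""
    reference_name: str
    start_position: int  # 0-based, converted from the 1-based VCF POS
    reference_bases: str
    alternate_bases: List[str]
    quality: Optional[float]
    filters: List[str]


def parse_vcf_line(line: str) -> VariantRow:
    """Parse and validate the eight fixed columns of one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    chrom, pos, _id, ref, alt, qual, flt, _info = fields[:8]
    if not pos.isdigit() or int(pos) < 1:
        raise ValueError(f"invalid POS: {pos!r}")
    if not ref or set(ref.upper()) - VALID_BASES:
        raise ValueError(f"invalid REF: {ref!r}")
    return VariantRow(
        reference_name=chrom,
        start_position=int(pos) - 1,
        reference_bases=ref,
        alternate_bases=[] if alt == "." else alt.split(","),
        quality=None if qual == "." else float(qual),
        filters=[] if flt == "." else flt.split(";"),
    )


def load_vcf(lines: Iterable[str]) -> Tuple[List[Dict], List[Tuple[int, str]]]:
    """Return (table-ready row dicts, list of (line number, error message))."""
    rows, errors = [], []
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        try:
            rows.append(asdict(parse_vcf_line(line)))
        except ValueError as exc:
            errors.append((number, str(exc)))
    return rows, errors
```

Collecting errors alongside valid rows, rather than stopping at the first bad line, mirrors the report-then-load pattern that a large ingestion job typically needs. The resulting row dictionaries could then be handed to whichever warehouse loader a team already uses.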
Google Cloud also has a number of solutions to integrate research data and workflows into clinical settings. The Cloud Healthcare API allows easy and standardized data exchange between diverse healthcare applications and solutions. With support for popular healthcare data standards such as HL7, FHIR, and DICOM, the Cloud Healthcare API provides a fully managed, highly scalable, enterprise-grade development environment for building clinical and analytics solutions securely on Google Cloud.
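As a rough illustration of what standards-based exchange looks like in practice, the Python sketch below assembles a minimal FHIR Patient resource and describes the REST request that would create it in a FHIR store. It sends nothing over the network and omits authentication. The helper names and the project, location, dataset, and store identifiers are hypothetical placeholders; the URL layout follows the Cloud Healthcare API v1 resource path as we understand it and should be checked against the current API reference.

```python
import json

BASE_URL = "https://healthcare.googleapis.com/v1"


def fhir_store_url(project: str, location: str, dataset: str, store: str) -> str:
    """Build the base FHIR endpoint for a store (all identifiers are placeholders)."""
    return (
        f"{BASE_URL}/projects/{project}/locations/{location}"
        f"/datasets/{dataset}/fhirStores/{store}/fhir"
    )


def make_patient(family: str, given: str, birth_date: str) -> dict:
    """Return a minimal FHIR Patient resource as a plain dictionary."""
    return {
        "resourceType": "Patient",
        "name": [{"use": "official", "family": family, "given": [given]}],
        "birthDate": birth_date,  # FHIR dates use the YYYY-MM-DD form
    }


def build_create_request(store_url: str, resource: dict) -> dict:
    """Describe the HTTP request that would create the resource (not sent here)."""
    return {
        "method": "POST",
        "url": f"{store_url}/{resource['resourceType']}",
        "headers": {"Content-Type": "application/fhir+json; charset=utf-8"},
        "body": json.dumps(resource),
    }


if __name__ == "__main__":
    url = fhir_store_url("my-project", "us-central1", "my-dataset", "my-fhir-store")
    request = build_create_request(url, make_patient("Doe", "Jane", "1980-01-01"))
    print(request["method"], request["url"])
```

Because the payload is ordinary FHIR JSON, the same resource could in principle be exchanged with any other FHIR-conformant system, which is the interoperability benefit described above.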
Google has been working for years to help researchers leverage Google Cloud’s vast array of ML capabilities to advance biomedical analysis. Customers can benefit from Google’s expertise in ML applied to biology-related challenges, such as our variant caller, DeepVariant, our protein folding project, AlphaFold, and the ever-expanding repertoire of tools and workflows in Terra.
Scaling biomedical research with Terra and Google Cloud
We know the challenges researchers and clinicians around the world face when tackling big questions in biomedical research. That's why we build platforms that we want to use every day, share them with the world, and help people use our platforms to drive science forward. Terra and Google Cloud are two of these platforms. Both were built to support a global user base that needs to collaborate and analyze massive-scale data. Together, Terra and Google Cloud democratize access to leading-edge biomedical research tools designed to accelerate discovery, development, and delivery of new healthcare solutions around the world. We would be excited to answer your questions and provide a demo of the Terra platform. Contact us at terra-bd@google.com.