IntegraGen: Helping to uncover the secrets of the human genome with Google Cloud

About IntegraGen

IntegraGen specializes in decoding the human genome and producing relevant and easily interpretable data for academic and private laboratories.

Industries: Healthcare, Life Sciences
Location: France

About SFEIR

Shaped by and for talented developers, SFEIR helps its customers take on the most ambitious technical challenges and create state-of-the-art applications.

With Google Cloud, IntegraGen created scalable, high-performance analytics tools that enable researchers and oncologists to rapidly interpret genomic data.

Google Cloud results

  • Provides researchers and clinicians with high-speed analysis and insight from complex and diverse datasets
  • Minimizes infrastructure costs with auto scaling and preemptible VMs
  • Enables faster innovation and development with no physical limits on hardware capacity

Cuts processing time from more than sixteen hours to eight hours

The science of the human genome is young and constantly evolving. As genomic sequencing techniques become more powerful, interpreting the large amounts of data generated has become more of a challenge. With access to cutting-edge technology and a wealth of expertise, IntegraGen provides researchers and clinicians with the means to analyze and interpret genomic material quickly enough to be used for patients in medical clinics.

In 2017, while developing a new data analytics tool, IntegraGen realized that its existing on-premises infrastructure was reaching its limits in both scale and performance. To go further and faster, the company chose to develop its new tools with Google Cloud.

“Our challenge is to perform increasingly large scale and complex interpretations of genomic sequencing data in a timeframe compatible with clinical decision making,” says Bérengère Génin, Director of Bioinformatics at IntegraGen. “With Google we are able to scale smarter, innovate faster, and keep our data more secure.”

Boosting performance, minimizing costs

Medical clinics need accurate information as quickly as possible. IntegraGen is constantly updating and improving its bioinformatic and statistical analyses to provide the best possible insight in the shortest timeframe to oncologists and their patients. In early 2017, the company began work on a new Software as a Service (SaaS) tool called MERCURY. It would allow users to quickly interpret data about the variants in cancer cells and help clinicians to make better, quicker decisions. At the time, the company was looking for a way to expand beyond its existing on-premises bioinformatics infrastructure, and MERCURY provided the opportunity to explore cloud-based technology.

After examining a range of options, IntegraGen chose Google Cloud in September 2017 for its ease of use and competitive pricing. The company teamed up with Google Cloud partner SFEIR to design and implement the new architecture, beginning with a proof of concept of the data pipeline that would feed into the MERCURY platform. Once the proof of concept proved successful at the start of 2018, SFEIR and IntegraGen began work on the new infrastructure.

For Pascal Castéran, Data Engineer and Google Cloud Consultant at SFEIR, Google Kubernetes Engine was the backbone of the new system. Using Google Kubernetes Engine clusters allowed SFEIR and IntegraGen to create an auto-scaling infrastructure that added compute power in seconds when it was needed, then reduced it when demand flattened. IntegraGen used BigQuery and Cloud Dataflow for processing the diverse and complex datasets that would feed into the MERCURY service. For a boost in performance, IntegraGen also opted to use local SSDs. Meanwhile, the use of preemptible VMs in Google Kubernetes Engine and other innovative features meant that the company could minimize costs even further.

“With the cluster auto-scaler feature available in Google Kubernetes Engine, we are able to automatically adapt the resources used and billed to the workload,” says Pascal. “There’s virtually no limit to the number of VMs we can use at a time, so we can now process whole batches of analyses in parallel.”
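The effect Pascal describes, where the resources used and billed follow the workload, can be illustrated with a toy model. The sketch below is not IntegraGen's or SFEIR's code and does not call any Google Cloud API. The real cluster autoscaler in Google Kubernetes Engine reacts to the resource requests of pods that cannot be scheduled, whereas this simplified version just counts jobs. The function names, batch size, job duration, and node limits are all hypothetical values chosen for illustration.

```python
import math


def target_nodes(pending_jobs, jobs_per_node, min_nodes, max_nodes):
    """Node count a simple autoscaler would aim for, clamped to the pool limits."""
    if pending_jobs <= 0:
        return min_nodes  # no work waiting: shrink to the configured minimum
    wanted = math.ceil(pending_jobs / jobs_per_node)
    return max(min_nodes, min(max_nodes, wanted))


def batch_wall_hours(jobs, hours_per_job, jobs_per_node, nodes):
    """Wall-clock hours for a batch when jobs run in waves across all node slots."""
    if jobs <= 0:
        return 0.0
    waves = math.ceil(jobs / (nodes * jobs_per_node))
    return waves * hours_per_job


# Hypothetical batch: 40 analyses of 2 hours each, 4 analyses per node.
JOBS, HOURS_PER_JOB, JOBS_PER_NODE = 40, 2.0, 4

fixed_nodes = 2  # assumed fixed-size cluster for comparison
scaled_nodes = target_nodes(JOBS, JOBS_PER_NODE, min_nodes=0, max_nodes=50)

fixed_hours = batch_wall_hours(JOBS, HOURS_PER_JOB, JOBS_PER_NODE, fixed_nodes)
scaled_hours = batch_wall_hours(JOBS, HOURS_PER_JOB, JOBS_PER_NODE, scaled_nodes)

print(f"fixed:  {fixed_nodes} nodes, {fixed_hours} h, {fixed_nodes * fixed_hours} node-hours")
print(f"scaled: {scaled_nodes} nodes, {scaled_hours} h, {scaled_nodes * scaled_hours} node-hours")
```

With these made-up numbers, the fixed two-node cluster needs 10 hours while the autoscaled pool grows to 10 nodes and finishes in 2 hours, and both consume 20 node-hours. Running the whole batch in parallel shortens the wait without increasing the compute that is billed, and once no jobs are pending the pool drops back to its minimum size.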

High scalability, low maintenance

With Google Cloud at its core, IntegraGen launched MERCURY in Europe as a highly scalable, customizable analytics tool enabling clinicians to harness the compute power of thousands of machines from a simple, easy-to-use interface.

Without any physical infrastructure to maintain and with smart use of Google’s innovative pricing features, IntegraGen has kept costs much lower than it would have done with an entirely on-premises solution. While it is hard to quantify exactly how much Google Cloud has saved in terms of cost, what is clear, according to Bérengère, is how much time the company has saved in its work. “Before it would have taken us more than sixteen hours to process this kind of data,” she says. “With Google, we can do it in just eight hours.”

With a new cloud-based solution, IntegraGen’s new projects are no longer limited by the physical capacity or maintenance of its existing on-premises architecture. Instead, the company can now run projects in parallel, innovating and improving faster than ever before. Meanwhile, the IT department can focus on developing new and better tools for IntegraGen’s clients, instead of worrying about infrastructure.

“With Google Cloud, there’s no need to buy lots of new hardware or worry about underutilized infrastructure when we develop a new product,” says Bérengère. “It gives us agility, lower risk and, because we only pay for what we use, lower operational costs.”

Since launching MERCURY in Europe, IntegraGen has kept busy. In mid-2018, the company rolled out MERCURY in the United States with ease, thanks to Google’s global infrastructure. IntegraGen continues to work closely with Google to conform to the strictest data privacy and security standards in Europe and to comply with HIPAA in the United States. It has since launched a second solution built with Google Cloud, called SIRIUS, which helps researchers quickly and intuitively analyze exome data for Mendelian and oncology applications. IntegraGen and SFEIR are now working to add more features to MERCURY and exploring the capabilities of Google’s range of big data and machine learning tools.

“We still experiment with different tools from Google Cloud because it’s very open,” says Bérengère. “There are lots of applications that we think can help us grow.”
